#######################
# Software Dependencies
#######################
# A Linux server is assumed: http://www.linux.org/
# Perl: http://www.perl.org/ 5.6 or above, including packages File::Copy, Shell::Command, Time::Local, XML::Twig
#   (the filepath to perl is set on the top line of each script; if perl is not at /usr/bin/perl, please modify
#   that line to match the correct path to the perl executable; locate it by typing "which perl" on the command line)
# ImageMagick: http://www.imagemagick.org/script/index.php (must be on the PATH of the script user;
#   locate by typing "which convert" on the command line)
# Tesseract OCR: http://sourceforge.net/projects/tesseract-ocr/ (must be on the PATH of the script user;
#   locate by typing "which tesseract" on the command line)
# If using static HTML delivery for content, the server must recognize "index.html" as a default page
#   in web directories. This is a common default set in the
#   configuration file for the Apache web server.

############
# EADs
############
# EADs are created in Archivists' Toolkit or are in the same format.
# The namespace declaration at the top of the EAD includes an ns2 prefix with the namespace for xlink,
# as well as the usual ones for validation.
# So the top element reads, for example:
#
# Containers in EADs include Box, Folder, and Item, where a Box may or may not contain one or more Folders
# and a Folder may or may not contain one or more Items. Items do not occur outside of Folders, and
# Folders do not occur outside of Boxes.
# Frame will be accepted as an alternative to Folder; Volume will be accepted as an alternative for
# either Box or Folder (but not both in a single container).
# Levels in the EAD include series and subseries, which contain file levels. Within the file level, the
# Box and Folder are normally entered as containers. Item-level entries (not made by this script)
# within the file level are NOT supported,
# as there is no way to match up these entered item descriptions with the digitized content.
# All linking will be done at file level!
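# The container rules above can be sketched in code. The following is a minimal illustration in Python,
# not the logic of the Perl scripts themselves: the function name normalize_containers and the
# (type, value) input shape are hypothetical, and reading "Volume" as filling the Box slot when no
# Box is present (otherwise the Folder slot) is an assumption about how the rule might be applied.

```python
def normalize_containers(containers):
    """Map one file-level entry's container labels onto Box/Folder/Item slots.

    `containers` is a list of (type, value) pairs, for example
    [("Box", "3"), ("Folder", "12")]. Returns a dict keyed by slot name,
    or raises ValueError when the nesting rules are violated.
    """
    slots = {}
    volume_used = False
    for ctype, value in containers:
        label = ctype.strip().capitalize()
        if label == "Frame":
            # Frame is accepted as an alternative to Folder.
            label = "Folder"
        elif label == "Volume":
            # Volume may replace Box or Folder, but not both (assumed reading).
            if volume_used:
                raise ValueError("Volume may stand in for Box or Folder, not both")
            volume_used = True
            label = "Folder" if "Box" in slots else "Box"
        if label not in ("Box", "Folder", "Item"):
            raise ValueError(f"unsupported container type: {ctype}")
        if label in slots:
            raise ValueError(f"duplicate {label} container")
        slots[label] = value
    # Folders do not occur outside of Boxes; Items do not occur outside of Folders.
    if "Folder" in slots and "Box" not in slots:
        raise ValueError("a Folder must sit inside a Box")
    if "Item" in slots and "Folder" not in slots:
        raise ValueError("an Item must sit inside a Folder")
    return slots
```

# For example, normalize_containers([("Box", "3"), ("Frame", "12")]) maps the Frame onto the Folder slot,
# while an Item entered without a Folder raises ValueError.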
# So at minimum, this script expects the file level to be used, where
# boxes and/or folders are entered as containers. Thus, if you have a single series, you must enter at least one
# box at file level for the folders to be linked in. Without identification of the box, at least, there is no linking.
# Also, if folder-level linking is desired (collection is only processed to the box level),
# then the digitized files must have a folder number in their filename; if no folders
# actually exist in the box, use folder number 1 in the digital file naming.
# Note: folder-level linking is not currently supported in Acumen. That is to say, the EAD in Acumen would still
# link out to the folder, but the folder and its items would be displayed via the static HTML at this point.
# EADs will be delivered online via Acumen or some other delivery software. If not, and simple HTML output
# for each EAD is desired, a sample XSL file is provided in the /EADS/XSL directory.
# Modify it as desired and apply it to each EAD using the transformation system of your choice, to create an
# HTML page for each EAD. Name the resulting HTML file after the collection identifier (with .html extension) and place it
# in the EADs folder for online upload. One possible processor is Saxon: http://saxon.sourceforge.net/
# The file will be renamed index.html and placed in the collection web directory http://yoursite/collID
# where collID is your selected collection identifier.
# Note that most browsers will
# transform an XML file based on the rules in the XSL it is linked to. Thus, if you modify an XSL file to your
# liking, name it ead.xsl, place it in the same directory as the EAD, and link it at the top like this:
# (just below the "