"Map" of Archival File System, University of Alabama Libraries Digital Services -------------------------------------------------------------------------------------------------------------- Top level of archive located at 2nd level of file system structure, within the srv directory: /srv/archive/ Via web access, the top level of the archive is located at http://libcontent1.lib.ua.edu/lockss/ At this level, there exist .htaccess files which restrict access by IP (Internet Protocol) address to the web directory. There is also a Manifest.html file which provides, in xhtml 1.0, a web entry point to the links of primary archival content, for access to LOCKSS (Lots Of Copies Keep Stuff Safe) web crawlers. This Manifest file links to other Manifest files at the next lower level in the file hierarchy, which represents type of material and holding area (provenance). The remaining files in the top level of the archive are directories (access points to the next lower level refereneced in the Manifest file). One of these directories is special: that directory is named a0000 to indicate that it is the first directory of interest for re-constitution of our content. This directory contains information about this entire archive (including this file) and open-source software which can be used to access our archival content, including a simple operating system. Apart from this directory, all the others contain digitized material described by the ------- file located -------. Within each provenance/type directory are subdirectories, each of which represents a collection of material belonging to that holding area. Each collection directory contains subdirectories for items within its collection. Each item directory contains subdirectories for subpages for that item, if applicable. A subpage may also be a sub-sequence for delivery. Each subpage directory contains subdirectories for sub-sub pages for that item, if applicable. A sub-sub page may also be a sub-sub sequence for delivery. 
Here is a set of examples:

    u0003 contains digitized Hoole Special Collections manuscript materials.

    Within u0003, 0000001 contains digitized content from the Hoole Manuscript collection numbered MS 0001, the Edwin A. Abbott Papers collection.

    Within 0000001, 0000055 contains the files for the 55th item digitized, a document which happens to have 3 pages.

    Within 0000055, 0003 contains the files for the 3rd page of this document, named (appropriately) u0003_0000001_0000055_0003.tif to reflect the location in the file structure where it belongs.

A simple diagram would be:

    u0003/
        0000001/
            0000055/
                0003/
                    u0003_0000001_0000055_0003.tif

Information about content is stored at the level at which it applies. That is, each directory is allowed to have the following three additional subdirectories, for explanatory material:

    1) Documentation: contains administrative information and Manifest files
    2) Metadata: contains formalized metadata (descriptive and other information) in standardized schemes
    3) Transcripts: contains either images of transcriptions made from archival content, or text files of transcribed content. If the extension is .ocr.txt, the file was created by an optical character recognition process and has not been corrected by a human.

Thus, the u0003 directory contains a Documentation subdirectory which has within it:

    1) Manifest.html, which is the set of links for all the collections within this provenance area, for LOCKSS web crawlers
    2) an XML file containing basic administrative metadata about this holding area and type of material, named for this directory area: u0003.xml for Hoole Special Collections manuscript materials

The u0003 directory may also contain a Metadata subdirectory with formalized metadata about Hoole Special Collections manuscript materials.

In addition, the collection subdirectory 0000001 will contain a Metadata directory for information about this collection, the Edwin A. Abbott Papers, and a Documentation directory for the administrative information about the digitization and management of this collection, as well as a Manifest file for links to all the archival content in the collection.

Beneath the subdirectory 0000001, the subdirectory 0000055 may contain a Metadata directory for information about this particular item, may also contain a Documentation directory, and may contain a Transcripts directory. The same is true for all content subdirectories.
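
Because a content file is named for its place in the structure, its location can be worked out from the name alone. The Python sketch below illustrates this under one assumption drawn from the single example above: every underscore-separated field of the file name corresponds to one directory level. The function names location_of and info_dirs_for are hypothetical and exist only for this illustration.

```python
from pathlib import PurePosixPath

# The three optional explanatory subdirectories named in the text.
INFO_DIRS = ("Documentation", "Metadata", "Transcripts")


def location_of(filename):
    """Derive the archive-relative path of a content file from its name.

    Assumes each underscore-separated field is one directory level,
    as in u0003_0000001_0000055_0003.tif.
    """
    stem = PurePosixPath(filename).stem  # file name without ".tif"
    return PurePosixPath(*stem.split("_")) / filename


def info_dirs_for(directory):
    """List the explanatory subdirectories a directory is allowed to have.

    These are only candidate locations; any of them may be absent.
    """
    return [PurePosixPath(directory) / name for name in INFO_DIRS]


print(location_of("u0003_0000001_0000055_0003.tif"))
for candidate in info_dirs_for("u0003/0000001/0000055"):
    print(candidate)
```

The paths returned are relative to the top of the archive (/srv/archive/ on the file system). Checking which of the candidate explanatory directories actually exist for a given item is left to the caller.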