For Archiving

From UA Libraries Digital Services Planning and Documentation
Revision as of 11:52, 29 October 2010 by Scpstu32 (talk | contribs)

Currently, content uploaded for archival storage is organized in a specific way (specified here: Share_Drive_Protocols).

Once this content is placed into the /srv/deposits/content/ directory on libcontent1 (a Linux server), we:

  1. verify that it copied correctly across the network,
  2. check the content with quality control verification scripts (such as File:TestIncoming.txt),
  3. upload the collection information file content into the database to provide access to the online collection via a web-side PHP script, and
  4. archive it.

Archiving means that we weed out extraneous files, re-order the content (via copy) according to our storage organization (specified here: Organization_of_completed_content_for_long-term_storage), and version the metadata, XML, or text files. Only the versioned copy is linked into the manifest; the updated file overwrites the unversioned copy in the directory. Finally, we either create a LOCKSS manifest for this content or alter existing manifests to include it.
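The versioning step can be sketched as below. This is only an illustration in Python, not the Relocating script itself. The ".vN" naming scheme and the function names are hypothetical; the actual naming convention is the one given in Organization_of_completed_content_for_long-term_storage.

```python
import re
import shutil
from pathlib import Path


def next_version_path(path: Path) -> Path:
    """Return the next unused versioned name beside `path`.

    The ".vN" suffix (item.v1.xml, item.v2.xml, ...) is a hypothetical
    scheme used only for this sketch.
    """
    pattern = re.compile(
        re.escape(path.stem) + r"\.v(\d+)" + re.escape(path.suffix) + r"$"
    )
    versions = [
        int(m.group(1))
        for p in path.parent.iterdir()
        if (m := pattern.match(p.name))
    ]
    n = max(versions, default=0) + 1
    return path.with_name(f"{path.stem}.v{n}{path.suffix}")


def store_update(new_file: Path, archived: Path) -> Path:
    """Keep `new_file` as a new versioned copy and overwrite the unversioned copy.

    Returns the versioned path, which is the one a manifest would link to.
    The directory holding `archived` must already exist.
    """
    versioned = next_version_path(archived)
    shutil.copy2(new_file, versioned)   # permanent, versioned copy
    shutil.copy2(new_file, archived)    # unversioned copy always holds the latest
    return versioned
```

Earlier versions are never overwritten in this sketch, so a manifest that links to a versioned copy keeps pointing at unchanged content.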

This script (still being modified and updated to handle new problems) is here: File:Relocating.txt. By uncommenting the $test = 1; line, you can run it as a test, which will not change any existing manifests or copy content. Instead, it writes all the manifest changes and creations into one large file called RelocateManfests, and it still writes a list of which files it would copy where to the "moveme" file.

After running this script for real, run "checkem", which goes through the moveme file and does an md5 comparison of the old file and the new one. If they're the same, it deletes the old one in the deposits directory. If they're not the same, it outputs an error and leaves the original untouched.
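The compare-then-delete logic can be sketched as follows. This is a Python illustration, not the checkem script itself. It assumes each line of the moveme file holds the old path and the new path separated by a tab; the real file format may differ, and the function names and the dry_run flag are hypothetical.

```python
import hashlib
from pathlib import Path


def md5sum(path: Path) -> str:
    """Return the hex md5 digest of a file, read in 1 MiB chunks."""
    digest = hashlib.md5()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def check_moves(moveme: Path, dry_run: bool = False) -> list:
    """Delete each old file whose copy has the same md5; return error messages.

    Assumed line format (hypothetical): "<old path>\t<new path>".
    """
    errors = []
    for line in moveme.read_text().splitlines():
        if not line.strip():
            continue
        parts = line.split("\t")
        if len(parts) != 2:
            errors.append(f"unreadable line: {line}")
            continue
        old, new = Path(parts[0]), Path(parts[1])
        if not old.is_file() or not new.is_file():
            errors.append(f"missing file: {line}")
        elif md5sum(old) != md5sum(new):
            # Leave the original untouched when the copy does not match.
            errors.append(f"checksum mismatch: {old} vs {new}")
        elif not dry_run:
            old.unlink()
    return errors
```

Calling check_moves(Path("moveme"), dry_run=True) reports problems without deleting anything, which is a cautious first pass before the real run.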

Here's the checkem script: File:Checkem.txt

Another handy script is archiveCheck (File:CheckArchive.txt), which verifies that everything in each manifest is in the archive, and that everything we intended to link into the manifest is indeed linked there properly.
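A two-way check of this kind might look like the sketch below. It is a Python illustration, not the archiveCheck script. It assumes the manifest is an HTML page whose links are relative paths from the manifest's own directory, and that the list of intended files is supplied by the caller; both are assumptions, and the names are hypothetical.

```python
from html.parser import HTMLParser
from pathlib import Path


class LinkCollector(HTMLParser):
    """Collect the href value of every <a> tag in an HTML page."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def check_manifest(manifest: Path, expected: set):
    """Compare a manifest's links with the files on disk and the intended list.

    Returns (linked but missing from the archive, intended but not linked).
    Links are assumed to be relative to the manifest's directory.
    """
    parser = LinkCollector()
    parser.feed(manifest.read_text())
    linked = set(parser.links)
    missing_on_disk = sorted(
        link for link in linked if not (manifest.parent / link).is_file()
    )
    not_linked = sorted(expected - linked)
    return missing_on_disk, not_linked
```

Both returned lists are empty when the manifest and the archive agree.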

When we digitize multiple tiny collections, we may combine the spreadsheets for simplicity. They must then be split out by collection for archiving: File:SplitExcel.txt
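The split can be sketched as below. This Python illustration is not the SplitExcel script: it assumes the combined spreadsheet has been exported to CSV and that one column identifies the collection. The column name "collection" and the function name are hypothetical.

```python
import csv
from collections import defaultdict
from pathlib import Path


def split_by_collection(combined: Path, out_dir: Path, column: str = "collection"):
    """Write one CSV per distinct value of `column`; return the files written.

    Each output file keeps the original header row and is named after the
    collection value, so those values must be usable as file names.
    """
    groups = defaultdict(list)
    with combined.open(newline="") as handle:
        reader = csv.DictReader(handle)
        fieldnames = reader.fieldnames
        for row in reader:
            groups[row[column]].append(row)

    out_dir.mkdir(parents=True, exist_ok=True)
    written = []
    for name, rows in sorted(groups.items()):
        target = out_dir / f"{name}.csv"
        with target.open("w", newline="") as handle:
            writer = csv.DictWriter(handle, fieldnames=fieldnames)
            writer.writeheader()
            writer.writerows(rows)
        written.append(target)
    return written
```

Rows keep their original order within each collection, so the per-collection files can be compared line by line against the combined sheet.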