EADs

From UA Libraries Digital Services Planning and Documentation

Latest revision as of 14:55, 10 January 2014

The diagram below shows the creation of the EAD in Archivists Toolkit (AT) during processing. The archivists consider the copy in Archivists Toolkit to be the copy of record; however, we have found that reloading EADs containing links to component items modifies the links, and reexporting modifies them further. So we have altered our workflow. While the EAD in AT is the copy of record for analog material, the delivery EAD (containing the links to digitized content) will be stored separately. Every time EADs are modified by the archivists, they must go through the item-level linking process again.

EAD3.png


1) Archivists export EADs from Archivists Toolkit and place them in the "new" or "remediated" folder on the share drive, under S:\Special Collections\Digital_Program_files\EAD.


2) Every Friday night, a script called "getEADs" (in /srv/scripts/storing/EADs_auto/) picks up these EADs. It makes a datestamped directory inside the "uploaded" directory on the share drive (for example, "uploaded_new_20100803") and copies the EADs there, so the archivists will know what was picked up when. It also puts a copy into /srv/deposits/fromSC/, and then places the EADs in a "notInDbase" directory on libcontent (under /srv/deposits/EADs/).
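
The pickup in step 2 can be sketched roughly as follows. This is only an illustration in Python, not the actual getEADs script: the function name `pick_up_eads`, the `*.xml` file pattern, and the decision to empty the pickup folder afterwards are all assumptions. The directory names come from the description above.

```python
import shutil
import tempfile
from datetime import date
from pathlib import Path


def pick_up_eads(share_ead_dir, kind, deposits_dir, today):
    """Copy EADs from the share-drive folder `kind` ("new" or "remediated")
    into a datestamped "uploaded" folder and into the deposits tree."""
    share_ead_dir = Path(share_ead_dir)
    deposits_dir = Path(deposits_dir)
    source = share_ead_dir / kind

    # For example: uploaded/uploaded_new_20100803
    stamped = share_ead_dir / "uploaded" / f"uploaded_{kind}_{today:%Y%m%d}"
    targets = [
        stamped,
        deposits_dir / "fromSC",
        deposits_dir / "EADs" / "notInDbase",
    ]
    for target in targets:
        target.mkdir(parents=True, exist_ok=True)

    picked_up = []
    for ead in sorted(source.glob("*.xml")):  # assumption: EADs are .xml files
        for target in targets:
            shutil.copy2(ead, target / ead.name)
        ead.unlink()  # assumption: the pickup folder is emptied afterwards
        picked_up.append(ead.name)
    return stamped, picked_up
```

Passing the date in as a parameter, rather than reading the clock inside the function, keeps the datestamped folder name predictable when the sketch is tried against a scratch directory.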

3) When content is digitized, the EAD may already be online. So the most recent makeJpegs upload scripts pull a copy of the corresponding EAD (if it exists) and place it in the /srv/script/eads/LINKME/ directory.

  • EadModsTester checks the /srv/script/eads/LINKME/ directory for EADs corresponding to newly uploaded digitized content. If it finds one, and there is not a newer version of that EAD already waiting to be linked, it adds it to those in /srv/deposits/EADs/notInDbase/.
  • Many EADs come to us with embedded Word or PDF encodings, and some contain ampersands. These need to be corrected before the EADs go online. EadModsTester (in /srv/scripts/eads/) runs through the new EADs to replace these values, and puts the corrected version into /srv/deposits/EADs/unlinkedASCII/ to be used for linking.
  • It then parses each EAD to locate box and folder numbers and itemIDs (all with corresponding ref values), and searches the web directories for MODS with item identifiers that correspond to the itemIDs, or for MODS within the collection (pulling box and folder values).
  • For each collection in which it can identify linkable content, the script writes a tab-delimited text file into /srv/scripts/eads/linkrefs/. This file specifies the reference identifier in the EAD, the existing EAD box_folder value, the normalized EAD box_folder, the existing MODS box_folder value, the normalized MODS box_folder, and the MODS identifier if a match exists for an itemID value.
  • The script also outputs a few other files, such as a list of problems found in EADs (see /srv/scripts/eads/output/).
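
As an illustration of the cleanup described in the bullets above, here is a minimal Python sketch. It is not the EadModsTester code. It assumes that "correcting" ampersands means escaping bare ones as `&amp;`, and that the Word/PDF encodings of concern are curly quotes and en dashes. The names `WORD_PUNCTUATION`, `escape_bare_ampersands`, and `clean_ead_text` are hypothetical.

```python
import re

# Assumption: typical Word/PDF punctuation that should become plain ASCII.
WORD_PUNCTUATION = str.maketrans({
    "\u2018": "'", "\u2019": "'",   # curly single quotes
    "\u201c": '"', "\u201d": '"',   # curly double quotes
    "\u2013": "-",                  # en dash
})

# Matches "&" that does not already start an entity or character reference
# such as &amp;, &#169; or &#xA9;.
BARE_AMPERSAND = re.compile(
    r"&(?!(?:[A-Za-z][A-Za-z0-9]*|#[0-9]+|#x[0-9A-Fa-f]+);)"
)


def escape_bare_ampersands(text):
    """Escape ampersands that would otherwise make the XML invalid."""
    return BARE_AMPERSAND.sub("&amp;", text)


def clean_ead_text(text):
    """Replace Word-style punctuation, then escape bare ampersands."""
    return escape_bare_ampersands(text.translate(WORD_PUNCTUATION))
```

The negative lookahead leaves existing references such as `&amp;` untouched, so running the cleanup twice on the same EAD does not double-escape anything.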

More discussion of the linking from EADs can be found in Linking_out_from_EADS.


4) "linkInContent" uses the linkrefs files created by EadModsTester. It works through the EADs listed in /srv/deposits/EADs/notInDbase/, but reads the cleaned copy of each from /srv/deposits/EADs/unlinkedASCII/. If MODS box/folder values are indicated, the script goes through the Acumen directories to locate the corresponding items; otherwise it uses the item identifier, if one is present in the last column. It creates PURL links for each of these items and parses through the EAD to add them in the correct location, adding boxes and folders if necessary. The linked version is placed in /srv/scripts/eads/LINKED/; if this process is successful, it is then written back to /srv/deposits/EADs/notInDbase/, overwriting the deposited version. If no links are added, the version in the unlinkedASCII folder is written to the notInDbase folder, again overwriting the deposited version.
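
The choice in step 4 between box/folder matching and item-identifier matching can be sketched as below. This is a hedged Python illustration, not the linkInContent script. The six columns follow the linkrefs description above, but the field names, the function names `read_linkrefs` and `link_strategy`, and the assumption that either MODS box_folder column counts as "indicated" are all hypothetical.

```python
import csv
import io
from typing import NamedTuple


class LinkRef(NamedTuple):
    ead_ref: str               # reference identifier in the EAD
    ead_box_folder: str        # existing EAD box_folder value
    ead_box_folder_norm: str   # normalized EAD box_folder
    mods_box_folder: str       # existing MODS box_folder value
    mods_box_folder_norm: str  # normalized MODS box_folder
    mods_identifier: str       # MODS identifier, if an itemID matched


def read_linkrefs(handle):
    """Read tab-delimited linkrefs rows, padding short rows with blanks."""
    rows = []
    reader = csv.reader(handle, delimiter="\t", quoting=csv.QUOTE_NONE)
    for fields in reader:
        if not fields:
            continue
        padded = (fields + [""] * 6)[:6]
        rows.append(LinkRef(*(field.strip() for field in padded)))
    return rows


def link_strategy(row):
    """Decide how a row would be linked: by box/folder, by item, or not at all."""
    if row.mods_box_folder or row.mods_box_folder_norm:
        return "box_folder"
    if row.mods_identifier:
        return "item_identifier"
    return "none"
```

For example, a row whose MODS box_folder columns are empty but whose last column holds an identifier falls through to `"item_identifier"`, mirroring the "otherwise" branch in the step.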

5) "EadsToDbase" pulls from /srv/deposits/EADs/notInDbase/, updates the database (replacing any changed title and abstract), puts the EADs live online in Acumen, and moves the copy from notInDbase to /srv/deposits/EADs/new/ for archiving. The values in the database appear in the online collection list (http://www.lib.ua.edu/digital/browse).


Archiving

6) "waitCheckEADs" checks whether each EAD has changed from the last version cached. If it has not, it is deleted from the deposits directory. If it has, the script checks whether this collection has been released into LOCKSS (and on what date). If it has been released, the script asks whether you want to go ahead and archive; if you say yes, the script copies the existing manifest to one ending in "_LOCKSS_$date", where $date is today's date. We need this because LOCKSS collects each version of the manifest, and we need to know how many bytes we have in the preservation architecture, as it impacts our costs. Try NOT to archive to a collection frequently, or within 2-3 weeks of release to LOCKSS.
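
A rough Python sketch of the two checks in step 6 follows. It is not the waitCheckEADs script. Comparing SHA-256 digests, treating a missing cached copy as "changed", and the YYYYMMDD form of $date are assumptions, and the function names are hypothetical.

```python
import hashlib
import shutil
import tempfile
from datetime import date
from pathlib import Path


def file_digest(path):
    """Return the SHA-256 digest of a file's bytes."""
    return hashlib.sha256(Path(path).read_bytes()).hexdigest()


def has_changed(deposited, cached):
    """True if the deposited EAD differs from the cached version."""
    cached = Path(cached)
    if not cached.exists():
        return True  # assumption: no cached copy means the EAD is new
    return file_digest(deposited) != file_digest(cached)


def snapshot_manifest(manifest, today):
    """Copy the manifest to a sibling file ending in _LOCKSS_<date>."""
    manifest = Path(manifest)
    # Assumption: $date is written as YYYYMMDD.
    snapshot = manifest.with_name(f"{manifest.name}_LOCKSS_{today:%Y%m%d}")
    shutil.copy2(manifest, snapshot)
    return snapshot
```

The snapshot keeps a dated copy of each manifest version that LOCKSS has already collected.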

7) Remove the existing relocateManifests file. Uncomment $test = 1 in relocatingEads and run it.

8) Check moveme and relocateManifests to verify that the manifests will be written correctly, and verify in moveme that the EADs are going to be copied over to the correct place. Be sure to look at the end of relocateManifests for manifests that need to be created by hand (for new holding areas), and check what is being added to existing holder manifests.


9) Comment out $test = 1 again and re-run relocatingEads. "relocatingEads" pulls from /srv/deposits/EADs/new and locates where the EADs go in the archive, versioning as necessary and linking them into existing LOCKSS manifests, or creating new ones as needed.

9.5) Check the parentMans output for any manifests that need to be created for new holding areas. If any are identified, create them to match the format of all other holding area manifests, and insert the links captured for you in parentMans.

10) Run Checkem to verify that the EADs have been copied over correctly and to delete them from the deposits directory.

11) Check /srv/deposits/EADs/new to make sure nothing is left there. If anything is left in this location, there is a problem and you need to figure out what it is.
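
The test-run-then-real-run pattern in steps 7 through 11 can be illustrated with a small Python sketch. It is not relocatingEads or Checkem. The `test` flag stands in for $test = 1, and the function names and the shape of the planned-move list are hypothetical.

```python
import shutil
import tempfile
from pathlib import Path


def relocate(moves, test=True):
    """Report (and, outside test mode, perform) a list of file moves.

    `moves` is a list of (source, destination) path pairs. In test mode
    nothing is touched; the returned list is what you would review first.
    """
    planned = []
    for source, destination in moves:
        planned.append(f"{source} -> {destination}")
        if not test:
            Path(destination).parent.mkdir(parents=True, exist_ok=True)
            shutil.move(str(source), str(destination))
    return planned


def leftovers(directory):
    """List anything still sitting in the deposits directory after a run."""
    return sorted(p.name for p in Path(directory).iterdir())
```

Running once with `test=True` and reading the planned moves corresponds to checking moveme before the real run; an empty `leftovers` result afterwards corresponds to the final check in step 11.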


Jlderidder (talk) 10:12, 19 December 2013 (CST)

For reference, from the archivists:

Containers.jpg

[Text list of containers: http://www.lib.ua.edu/wiki/digcoll/images/e/ed/Containers.docx]
