Metadata Movement

From UA Libraries Digital Services Planning and Documentation
 
Revision as of 14:24, 19 October 2012

Diagram of metadata movement for content being digitized in Digital Services effective November 2009: File:Metadata.pdf


Digital Services:

When content is ready for storage:

  1. Put the Excel file into S:\Digital Projects\Administrative\Pipeline\collectionInfo\forMDlib\needsRemediation for Mary.
  2. Export a tab-delimited text file and put it with the content for Storage.
  3. Export a tab-delimited text file and use it to generate MODS according to the instructions here: Making MODS
  4. Upload the MODS to Acumen by following the process outlined here: Uploading_MODS
  5. Delete the text file used to create the MODS.
  6. Notify Jody that new MODS await.

Metadata Services:

  1. Retrieve deposited spreadsheet from S:\Digital Projects\Administrative\collectionInfo\forMDlib\needsRemediation, and remediate metadata appropriately.
  2. Move the remediated spreadsheet to S:\Digital Projects\Administrative\collectionInfo\Storage_Excel.
  3. Generate MODS records, according to the instructions here: Making MODS
  4. Upload these MODS records to Acumen by following the process outlined here: Uploading_MODS
  5. Export a tab-delimited text file of the remediated metadata.
  6. Deposit the exported tab-delimited collection-level metadata in S:\Digital Projects\Administrative\collectionInfo\JodyPickup.
  7. Notify Jody it's ready for pickup.

Jody: Digital Services

  1. If initial uploaded content includes text files in the Scans directory (i.e. item-level tab-delimited metadata), move those to S:\Digital Projects\Administrative\collectionInfo\forMDlib\itemMD\
  2. Notify Mary they're ready for pickup.
  3. Pick up finalized spreadsheets from S:\Digital Projects\Administrative\collectionInfo\JodyPickup and transfer them to the deposits directory on libcontent1.
  4. Process for correct deposit into long-term storage and linking in LOCKSS Manifests.

Metadata Transfer and Remediation for Collections Requiring Multiple Uploads

  1. When enough item scans have passed quality control checks to warrant an upload of materials to Storage (i.e. a "batch"), Digital Services copies the metadata for the items to be uploaded out of the original set of metadata and pastes it into a new Excel file (.xlsx). For instructions on how to prepare batch metadata, see here: BATCH_Metadata. The naming convention for this file is collectionNumber.n.xlsx, where "n" is the numerical value of the batch. For example, u0008_0000001.2.xlsx is the metadata for the second batch of items for collection u0008_0000001.

    In the original collection metadata Excel file, Digital Services will note which items correspond to this batch by entering that information in a column of the Excel version of the collection metadata. In this fashion, we keep track of which batch we're on.

    This column and its cells must be excluded from any subsequent text files that are generated for MODS or Storage purposes.

  2. Digital Services then exports this subset of the collection metadata from Excel as a tab-delimited .txt file and places it in the appropriate collection metadata folder in S:\Digital Projects\Digital_Coll_Complete.
  3. The .xlsx version of this metadata subset is placed into S:\Digital Projects\Administrative\collectionInfo\forMDlib\needsRemediation by Digital Services.
  4. Metadata Services then applies the same steps as for single-batch collections: generating MODS and exporting a .txt file for delivery to Jody for Storage.
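
The requirement that the batch-tracking column be left out of any exported text file can be illustrated with a short Python sketch. This is only an illustration, not the procedure used by Digital Services: the column name "Batch", the function name, and the choice to write the rows with Python's csv module (rather than exporting from Excel) are all assumptions made for the example.

```python
import csv
import io


def write_tab_delimited(rows, out, exclude=("Batch",)):
    """Write rows (a list of dicts) as tab-delimited text.

    Columns named in `exclude` are left out. "Batch" is a hypothetical
    name for the batch-tracking column; the real column name may differ.
    """
    if not rows:
        return
    # Keep the original column order, minus the excluded columns.
    fieldnames = [name for name in rows[0] if name not in exclude]
    writer = csv.DictWriter(
        out,
        fieldnames=fieldnames,
        delimiter="\t",
        extrasaction="ignore",  # silently skip the excluded keys
        lineterminator="\n",
    )
    writer.writeheader()
    writer.writerows(rows)


# Example with made-up metadata for one item.
example_rows = [
    {"Identifier": "u0008_0000001_0000001", "Title": "Letter", "Batch": "2"},
]
buffer = io.StringIO()
write_tab_delimited(example_rows, buffer)
```

To write to disk instead of a string buffer, open the target file with `open(path, "w", encoding="utf-8", newline="")` and pass it as `out`; the UTF-8 encoding is likewise an assumption.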


  • If there *is* a collnum.2.xlsx or collnum.2.txt, and there is a collnum.xlsx or collnum.txt in the same directory, we will assume that the latter is the first segment of metadata for the collection. That is to say, it *is* the collnum.1.xlsx or collnum.1.txt file.
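
The naming convention and the implicit first-batch rule above can be sketched in Python as follows. The function names and the regular expression are hypothetical and written only for this example; the sketch assumes collection numbers never contain a period and that only .xlsx and .txt files are of interest.

```python
import re

# collectionNumber[.n].xlsx or collectionNumber[.n].txt
# Assumption: the collection number itself contains no period.
METADATA_NAME = re.compile(
    r"(?P<coll>[^.]+)(?:\.(?P<batch>\d+))?\.(?P<ext>xlsx|txt)"
)


def parse_metadata_filename(name):
    """Return (collection, batch, extension), or None if the name does not fit.

    batch is None when the file name carries no batch number.
    """
    match = METADATA_NAME.fullmatch(name)
    if match is None:
        return None
    batch = int(match.group("batch")) if match.group("batch") else None
    return match.group("coll"), batch, match.group("ext")


def resolve_batches(filenames):
    """Map each recognized file name in one directory to its batch number.

    An unnumbered file is treated as batch 1 only when a batch 2 file for
    the same collection is present in the same list; otherwise it stays None.
    """
    parsed = {}
    for name in filenames:
        result = parse_metadata_filename(name)
        if result is not None:
            parsed[name] = result
    has_second = {coll for coll, batch, _ in parsed.values() if batch == 2}
    resolved = {}
    for name, (coll, batch, _) in parsed.items():
        if batch is None and coll in has_second:
            batch = 1
        resolved[name] = batch
    return resolved
```

Pass `resolve_batches` the file names from a single directory (for example, the result of `os.listdir`), since the rule only applies to files that sit side by side.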


Workflow Slide Presentations

Workflow between Digital Services and Metadata Services regarding Excel files/metadata - This does not cover the creation of MODS files or how Metadata Services delivers tab-delimited metadata for long-term Storage.
