Mass Digitization Workflow

From UA Libraries Digital Services Planning and Documentation
Revision as of 11:08, 31 May 2013 by Kgmatheny (Talk | contribs)


Mass digitization is a way of digitizing a collection in which items are linked out from the EAD with minimal metadata. It allows collections to go online more quickly and cheaply, because no descriptive metadata has to be created before digitization or remediated afterward (at least not immediately; if time allows, metadata can be created later). Collections that are good candidates for mass digitization usually have a detailed EAD Finding Aid, including descriptive headings and folder names, which allows users to locate materials more easily.

See the following pages for collection-specific workflows: Septimus D. Cabaniss Papers and Pauline Jones Gandrud Papers.


The information below is for small to mid-sized collections where the box and folder number can be incorporated into the item number section of the file name.


Prepare Digital Folder Structure on the S Drive

  • Within the Scans directory, create a digital "Box" directory for each physical box.
  • Within each "Box" directory, create a digital "Folder" directory for each physical folder in that box.
    • For example, the items in box 4, folder 2 would have this folder structure: Scans/Box_04/Folder_02
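The directory layout described above could be created with a short script rather than by hand. The following is only a minimal sketch: the function name make_box_folder_dirs and the folders_per_box mapping are hypothetical, and the two-digit zero padding is inferred from the Box_04/Folder_02 example rather than stated as a rule.

```python
from pathlib import Path


def make_box_folder_dirs(scans_root, folders_per_box):
    """Create <scans_root>/Box_NN/Folder_NN directories.

    folders_per_box is a hypothetical mapping of box number to the number
    of physical folders in that box, e.g. {4: 2}.
    Two-digit padding is an assumption based on the Box_04/Folder_02 example.
    """
    created = []
    for box, folder_count in sorted(folders_per_box.items()):
        for folder in range(1, folder_count + 1):
            path = Path(scans_root) / f"Box_{box:02d}" / f"Folder_{folder:02d}"
            # parents=True creates the Box directory when it does not exist yet;
            # exist_ok=True makes the script safe to run more than once.
            path.mkdir(parents=True, exist_ok=True)
            created.append(path)
    return created
```

Calling make_box_folder_dirs("Scans", {4: 2}) would create Scans/Box_04/Folder_01 and Scans/Box_04/Folder_02. Folders that are numbered with gaps would need a list of folder numbers instead of a count.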


File Naming

Items will be assigned descriptive filenames which include the box, folder, and item number. Because the number of digits needed for boxes and folders can differ, the exact filenaming structure may differ from collection to collection. The example here shows how the filenames were structured during the mass digitization of the Septimus D. Cabaniss Papers.

  • The first two sections of the filename will be assigned by the type of material and the collection number, as described on this page: File naming schemes.
  • The last 7-digit segment denotes box (B), folder (F), and item number (I) in the format: BBFFIII.tif.
    • For example, you are scanning manuscript collection 1234, in Box 4, Folder 2, beginning with the first item. This item's filename would look like this: u0003_0001234_0402001.tif.
  • Please note that this numbering system works only for collections containing no more than 99 boxes, boxes containing no more than 99 folders, and folders containing no more than 999 items.
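The BBFFIII pattern and its limits can be illustrated with a small helper. This is a sketch under assumptions: build_filename and parse_item_segment are hypothetical names, the first segment (u0003 in the example) is passed in unchanged because its rules live on the File naming schemes page, and the 7-digit zero padding of the collection number is inferred from the single example above.

```python
import re


def build_filename(prefix, collection, box, folder, item):
    """Return a name like u0003_0001234_0402001.tif (BBFFIII last segment)."""
    # Limits taken from the note above: 99 boxes, 99 folders, 999 items.
    if not 1 <= box <= 99:
        raise ValueError("box must be between 1 and 99")
    if not 1 <= folder <= 99:
        raise ValueError("folder must be between 1 and 99")
    if not 1 <= item <= 999:
        raise ValueError("item must be between 1 and 999")
    # Assumption: the collection number is zero-padded to 7 digits.
    return f"{prefix}_{collection:07d}_{box:02d}{folder:02d}{item:03d}.tif"


def parse_item_segment(filename):
    """Return (box, folder, item) read from the last 7-digit segment."""
    stem = filename.rsplit(".", 1)[0]
    segment = stem.rsplit("_", 1)[-1]
    if not re.fullmatch(r"[0-9]{7}", segment):
        raise ValueError(f"last segment of {filename!r} is not 7 digits")
    return int(segment[:2]), int(segment[2:4]), int(segment[4:])


print(build_filename("u0003", 1234, 4, 2, 1))           # u0003_0001234_0402001.tif
print(parse_item_segment("u0003_0001234_0402001.tif"))  # (4, 2, 1)
```

A collection with more than 99 boxes or folders would need wider fields, which is why the exact structure can differ between collections.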


Scanning

  • Scanning will be done as normal, with careful attention paid to the filenaming structure and to uploading scans to the correct directories created within the Scans directory.
  • Because these collections may not have been recently processed, items may be held together with staples or other fasteners. As you scan, staples and metal paper clips should be removed carefully and replaced with a plastic paper clip. For other fasteners, such as brads, you may need to consult with archivists before attempting to remove them.


Quality Control

  • Run the QC script created for mass digitization QC, called BoxFolderCheck.pl. It can be found here: S:\Digital Projects\Administrative\scripts\qc.
  • Correct any problems the script discovers.
  • Spot-check images.
    • Open every object folder with Adobe Bridge.
    • Verify what you see against the number of scans recorded in Tracking Files.
    • Check for general image quality (cropping, alignment, colors, etc.)
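The kind of consistency check involved here can be sketched as follows. This is not BoxFolderCheck.pl, whose behavior is not described on this page; it is a hypothetical illustration that assumes the Scans/Box_NN/Folder_NN layout and the BBFFIII filename segment described above. It reports files whose box and folder digits disagree with the directory they sit in, and counts the scans per folder so the totals can be compared by hand against the Tracking Files.

```python
import re
from pathlib import Path


def _dir_number(name, label):
    """Return the number in a directory name such as Box_04, or None."""
    match = re.fullmatch(rf"{label}_([0-9]+)", name)
    return int(match.group(1)) if match else None


def check_scans(scans_root):
    """Return (problems, counts) for every .tif two levels below scans_root."""
    problems = []
    counts = {}
    for tif in sorted(Path(scans_root).glob("*/*/*.tif")):
        box_dir, folder_dir = tif.parent.parent.name, tif.parent.name
        # Count every scan, including misnamed ones, per directory.
        key = (box_dir, folder_dir)
        counts[key] = counts.get(key, 0) + 1
        dir_box = _dir_number(box_dir, "Box")
        dir_folder = _dir_number(folder_dir, "Folder")
        segment = tif.stem.rsplit("_", 1)[-1]
        if dir_box is None or dir_folder is None:
            problems.append(f"{tif.name}: not inside Box_NN/Folder_NN")
        elif not re.fullmatch(r"[0-9]{7}", segment):
            problems.append(f"{tif.name}: last segment is not 7 digits")
        elif (int(segment[:2]), int(segment[2:4])) != (dir_box, dir_folder):
            problems.append(f"{tif.name}: does not match {box_dir}/{folder_dir}")
    return problems, counts
```

check_scans returns a list of problem messages and a dictionary keyed by (box directory, folder directory). Image quality still has to be reviewed visually; a script like this only covers naming and counts.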


Getting Content Live in Acumen

Please visit this page to view detailed instructions on how to upload a mass digitized collection.
