Electronic Theses and Dissertations

From UA Libraries Digital Services Planning and Documentation
Revision as of 16:29, 4 September 2009

Workflow Overview

A. ProQuest uploads deposits of zip files to the content.lib.ua.edu server via ftp into the ftpaccess home directory, and notifies either Janet Lee-Smeltzer or Jody DeRidder of the upload. If Janet, then Janet notifies Jody.

B. Jody relocates the content into a directory named for the date of deposit (yyyymmdd) in the ftpaccess home directory. She then copies this entire directory to a working directory for modifications, where she unzips all the content and performs the following tasks:

  1. extracts from each metadata file the following information:
    1. title,
    2. author,
    3. subsidiary file information:
      1. description
      2. filename
      3. file type
    4. embargo code (if any)
  2. queries the InfoTrack.bornDigital MySQL database table on libcontent1.lib.ua.edu to find the next filenumber to assign, and the InfoTrack.lookup table to determine the next persistent URL (PURL);
  3. records the assigned item number, the assigned PURL, the author, and the title in these tables;
  4. inserts the assigned item number (the filename, minus the extension) into a UA_identifier attribute and the assigned PURL into a UA_purl attribute within the DISS_submission field in a copy of the metadata;
  5. places this altered copy of the metadata into a PRQ subdirectory; the copy is named with the assigned filename followed by ".prq.xml" (thus a correctly named file would be: u0015_0000001_0000023.prq.xml) to indicate that it is still ProQuest XML;
  6. copies all the bitstreams and renames them appropriately, placing them in a CONTENT subdirectory. The primary PDF will be named with the assigned filename followed by ".pdf"; subsidiary files will be numbered sequentially, with a 4-digit left-padded number attached to the assigned filename, followed by the extension. So the first subsidiary file for this file (if a jpeg) would properly be named u0015_0000001_0000023_0001.jpg, and the second (if a text file) would be named u0015_0000001_0000023_0002.txt.
  7. creates an entry for each record in a tab-delimited xmlList.txt file which contains the following fields:
    1. assigned filename
    2. original filename
    3. author
    4. title
    5. directory?
    6. an indicator of the existence of subsidiary files
    7. the embargo code
    8. the assigned PURL
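
The naming rules in step 6 and the xmlList.txt record in step 7 can be sketched roughly as follows. This is only an illustration, not the script Jody actually runs: the function names, the "Y"/"N" subsidiary indicator, and the example values are assumptions, and the "directory" field is passed through unchanged because its exact meaning is not settled above.

```python
from pathlib import PurePosixPath


def content_names(assigned, primary_pdf, subsidiary_files):
    """Map original bitstream names to their names in the CONTENT subdirectory."""
    # The primary PDF takes the assigned filename plus ".pdf".
    renames = {primary_pdf: f"{assigned}.pdf"}
    # Subsidiary files get a sequential, 4-digit, left-padded number
    # and keep their original extension.
    for number, original in enumerate(subsidiary_files, start=1):
        extension = PurePosixPath(original).suffix
        renames[original] = f"{assigned}_{number:04d}{extension}"
    return renames


def xml_list_row(assigned, original, author, title, directory,
                 has_subsidiary, embargo_code, purl):
    """Build one tab-delimited xmlList.txt line in the field order of step 7."""
    fields = [
        assigned,
        original,
        author,
        title,
        directory,
        "Y" if has_subsidiary else "N",  # assumed indicator format
        embargo_code,
        purl,
    ]
    # Replace stray tabs so a field can never break the column layout.
    return "\t".join(str(f).replace("\t", " ") for f in fields) + "\n"


if __name__ == "__main__":
    for old, new in content_names(
        "u0015_0000001_0000023", "thesis.pdf", ["map.jpg", "notes.txt"]
    ).items():
        print(old, "->", new)
```

Keeping the renaming as a pure mapping (old name to new name) makes it easy to review the planned names before any file is copied.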

C. Jody creates a directory in the Metadata Librarian's home area under etd_deposits, named for the date of deposit (as above: yyyymmdd). She copies the original, unzipped deposit to an ORIGINAL subdirectory, and copies the CONTENT and PRQ subdirectories and the xmlList.txt file into the same dated directory. She then notifies the Metadata Librarian responsible for the next step.

D. The Metadata Librarian (at this time, Shawn Averkamp) works with the deposited content to create valid MODS files meeting our local profile, which include the assigned identifier and PURL, and are named for the assigned identifier with a ".mods.xml" extension.

E. The Metadata Librarian also creates valid MARC files for upload into our OPAC system. These reference the assigned identifier and PURL, and are named for the assigned filename with a ".mrc.xml" extension.

F. She places the finished MODS files in a MODS directory, and the finished MARC files in a MARC directory, both next to the PRQ and CONTENT directories.

G. The Metadata Librarian notifies Jody that the records are ready.

H. Jody copies the deposits to the Deposit subdirectory on the storage server, and runs a script which will move the files into the correct subdirectories for long-term storage, linking them into the LOCKSS manifests.

I. Jody extracts the filenames and embargo codes from the xmlList.txt, adds the date each embargo starts, and calculates the date each embargo ends. She then adds this information to a list or database entry that is checked by the periodic refreshing script.
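
A minimal sketch of the end-date calculation in this step is shown below. The mapping from embargo code to embargo length is an assumption made for illustration (this page does not define the codes), and the function names are hypothetical.

```python
import calendar
from datetime import date

# ASSUMPTION: embargo code -> length in months. The real code values and
# their meanings are not specified in this workflow description.
EMBARGO_MONTHS = {"0": 0, "1": 6, "2": 12, "3": 24}


def add_months(start, months):
    """Return the date `months` calendar months after `start`.

    If the target month is shorter than the start day (for example,
    31 August plus 6 months), the result is clamped to the last day
    of the target month.
    """
    month_index = start.month - 1 + months
    year = start.year + month_index // 12
    month = month_index % 12 + 1
    last_day = calendar.monthrange(year, month)[1]
    return date(year, month, min(start.day, last_day))


def embargo_end(code, start):
    """Calculate the embargo end date from an embargo code and a start date."""
    return add_months(start, EMBARGO_MONTHS[code])


if __name__ == "__main__":
    print(embargo_end("1", date(2009, 9, 4)))
```

An unknown code raises a KeyError here rather than silently defaulting to "no embargo", which seems the safer failure mode for restricted content.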

J. The periodic refreshing script crawls through the storage directory, picks up new and modified files, checks for embargo dates not yet past, and if this raises no flags, copies the content and MODS to the web directories for online delivery.
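
The refresh logic described here might look something like the sketch below. It is not the actual script: the function names, the lookup of embargo end dates by assigned identifier, the use of file modification times to detect new or changed files, and the choice to treat the end date itself as still embargoed are all assumptions.

```python
import shutil
from datetime import date
from pathlib import Path


def embargo_active(end_date, today):
    """True while an embargo end date exists and has not yet passed."""
    return end_date is not None and today <= end_date


def refresh(storage_dir, web_dir, embargo_ends, last_run, today):
    """Copy new or modified content and MODS files that are not embargoed.

    embargo_ends maps an assigned identifier (e.g. "u0015_0000001_0000023")
    to its embargo end date; last_run is a POSIX timestamp.
    """
    copied = []
    for path in sorted(Path(storage_dir).rglob("*")):
        if not path.is_file() or path.stat().st_mtime <= last_run:
            continue  # unchanged since the previous run
        if path.name.endswith((".prq.xml", ".mrc.xml")):
            continue  # only content and MODS go to the web directories
        # The identifier is the first three underscore-separated parts
        # of the filename, before any extension.
        item_id = "_".join(path.name.split(".")[0].split("_")[:3])
        if embargo_active(embargo_ends.get(item_id), today):
            continue  # still under embargo: do not publish
        target = Path(web_dir) / path.relative_to(storage_dir)
        target.parent.mkdir(parents=True, exist_ok=True)
        shutil.copy2(path, target)
        copied.append(target)
    return copied
```

A call such as `refresh("/storage", "/web", ends, last_run_timestamp, date.today())` (paths hypothetical) returns the list of files copied, which could be logged for review.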

K. The Metadata Librarian submits a batch upload of the MARC records into our catalog system.

L. The Metadata Librarian checks the final display and access via both the OPAC and the digital library system to verify that no problems exist. If any problems are encountered, she contacts Jody and we work out how to fix them. :-)
