Difference between revisions of "For ETDs"

From UA Libraries Digital Services Planning and Documentation

Revision as of 16:22, 10 July 2018

For DSpace

New Set Of Content:

  1. new content will be in /srv/deposits/bornDigital/u0015_0000001/. You MUST collect copies AFTER the metadata librarians are done with them and BEFORE the end-of-month archiving (when the files will be dispersed to the preservation archive). New content comes in 3 times a year, and whenever there are corrections. We depend on the metadata librarians to let us know of new content.
  2. log into the InfoTrack database and query the bornDigital table for files that are still under embargo, for example: `select id_2009, dateAvailable from bornDigital where datestamp > "date of batch" and dateAvailable > "today's date"`

This returns a list of ETD items that become available after today's date (yyyy-mm-dd). Make a list of the dates when the content will be available, with the last 4 digits of each identifier (which will be the DSpace ID assigned). This list will need to be provided to the DSpace admin for assigning embargoes there.
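
As a rough sketch of the embargo check above, the query can be filled in with concrete dates and its result reduced to the list for the DSpace admin. The function names, the sample identifier, and the whitespace-separated row format are assumptions for illustration; only the table and column names come from the query above.

```shell
#!/bin/sh
# Sketch only: build_embargo_query and embargo_list are hypothetical helpers,
# not existing scripts.

# Print the embargo query with real dates filled in.
# $1 = batch date (yyyy-mm-dd); $2 = cutoff date, defaults to today.
build_embargo_query() {
    batch_date=$1
    today=${2:-$(date +%Y-%m-%d)}
    printf 'select id_2009, dateAvailable from bornDigital where datestamp > "%s" and dateAvailable > "%s";\n' "$batch_date" "$today"
}

# Reduce "identifier date" rows on stdin to "last-4-digits date".
# Assumes every identifier ends in the digits that become the DSpace ID.
embargo_list() {
    awk 'NF >= 2 { print substr($1, length($1) - 3), $2 }'
}
```

For example, `build_embargo_query 2018-07-01` prints a query that can be pasted into the database client, and piping the saved result rows through `embargo_list` gives one "ID date" pair per line.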

  1. create directory by datestamp in S:\Digital Projects\Other\IncomingDigital\ETD_supp and put supplemental files there from the content directory.
  2. REVIEW the supplemental files (supplemental files that have pages) to see whether additional file types need to be added to DSpace (https://ir.ua.edu/admin/format-registry)
  3. create directory by datestamp in S:\Public\DigitalServices\DSpace\embargoLift\MODS\ and put MODS there
  4. create datestamped directories in
    1. /cifs-mount12/Public/DigitalServices/DSpace/embargoLift/DC/ and
    2. /cifs-mount12/Public/DigitalServices/DSpace/embargoLift/ETD/
  5. go process MODS per MODS_Transform_For_DSpace
  6. come back to libcontent, go to /srv/scripts/etds/toDspace, and rename "all" to all_datestamp; then recreate "all". Do the same with the uploads directory.
  7. run pullAndRename_new (located in /srv/scripts/etds/toDspace/)
    1. cd into /all and run this command to remove duplicate PDFs (it is not clear why the duplicates appear, but this is a workaround):
find . -maxdepth 2 -name 'file_2.pdf' -delete

then run:
find . -maxdepth 2 -type f -name "*contents*" -exec sed -i 's/file_2\.pdf//g' {} +
this will remove the file_2.pdf references from the contents manifest

then run:
find . -maxdepth 2 -type f -name "*contents*" -exec sed -i '/^$/d' {} +
this will remove the empty line that the previous sed command leaves behind
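
The three cleanup commands above could be wrapped in one function so they always run in the same order. This is only a sketch: `remove_duplicate_pdfs` is a hypothetical name, and `sed -i`, `-maxdepth`, and `-delete` assume the GNU versions of sed and find.

```shell
#!/bin/sh
# Sketch only: remove_duplicate_pdfs is a hypothetical helper, not an
# existing script in /srv/scripts/etds/toDspace.

# $1 = directory holding the item folders (the "all" directory).
remove_duplicate_pdfs() {
    (
        # Work in a subshell so the caller's working directory is unchanged.
        cd "$1" || exit 1
        # Delete the duplicate PDFs themselves.
        find . -maxdepth 2 -name 'file_2.pdf' -delete
        # Blank out their entries in each contents manifest.
        find . -maxdepth 2 -type f -name '*contents*' -exec sed -i 's/file_2\.pdf//g' {} +
        # Drop the empty lines the previous command leaves behind.
        find . -maxdepth 2 -type f -name '*contents*' -exec sed -i '/^$/d' {} +
    )
}
```

It would be called as `remove_duplicate_pdfs /srv/scripts/etds/toDspace/all`. Trying it on a copy of the directory first is a sensible precaution, since the deletions cannot be undone.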

  1. cd into /all and run this command: `zip batchname item*/*`
  2. (optional) move batchname.zip to the uploads directory, for example: `mv batchname.zip ../uploads/.`
  3. transfer the zipped file to S:\Public\DigitalServices\DSpace\uploads
  4. Contact Andrew Parker to log in to DSpace and upload the file to the correct collection. Give him the embargo list, as he will have to set the embargoes by hand. He will return the map file to you.
  5. copy the map file that results to a datestamped file in /srv/scripts/etds/toDspace/mapfiles
  6. open that file in vi and edit it to make one entry per line, using this search and replace command: `:%s/ item/\ritem/g`
  7. run correctPurls (in /srv/scripts/etds/toDspace) on each mapfile to correct purl pointers to new DSpace location
  8. check results
  9. copy the completed match file to the archive: cp u0015_0000001.dspace.match.txt /srv/archive/u0015/0000001/Documentation/.
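
The vi substitution in the map file step can also be done without opening an editor. The sketch below uses awk as a stand-in for the vi command; `split_mapfile` and the file names in the usage note are hypothetical.

```shell
#!/bin/sh
# Sketch only: split_mapfile is a hypothetical helper, not one of the
# existing scripts.

# Reads a map file on stdin and writes it to stdout with one entry per line.
# Every " item" becomes a newline followed by "item", which mirrors the
# vi command :%s/ item/\ritem/g
split_mapfile() {
    awk '{ gsub(/ item/, "\nitem"); print }'
}
```

A possible use is `split_mapfile < mapfile_from_dspace > mapfiles/mapfile_datestamp`, after which the result can be checked by eye before running correctPurls.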

Other scripts:

  1. to pick up new ETDs and move them into the correct directory for processing: File:MoveContentBD.txt
  2. to preprocess ETDs, assigning persistent identifiers, as described on Electronic_Theses_and_Dissertations: File:ProcessEtds.txt
  3. to check for embargoes about to be raised: File:CheckEmbargo.txt
  4. to lift embargoes: File:LiftEmbargo.txt
  5. to archive the ETDs: File:RelocatingBD.txt