For ETDs

From UA Libraries Digital Services Planning and Documentation
Latest revision as of 15:40, 10 July 2018



For DSpace

New Set Of Content:

  1. new content will be in /srv/deposits/bornDigital/u0015_0000001/. You MUST collect copies AFTER the metadata librarians are done with them and BEFORE the end-of-month archiving (when they will be dispersed to the preservation archive). New content comes in 3 times a year, and whenever there are corrections. We depend on the metadata librarians to let us know of new content.
  2. log into the InfoTrack database and query the bornDigital table to find which files are still under embargo, for example: select id_2009, dateAvailable from bornDigital where datestamp > "date of batch" and dateAvailable > "today's date"

This returns a list of ETD items that become available after today's date (dates are in yyyy-mm-dd format). Make a list of the dates when the content will be available, with the last 4 digits of each identifier (which will be the DSpace ID assigned). This list must be provided to the DSpace admin for assigning embargoes there.
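
The query and the embargo list described above can be sketched in shell. This is only an illustration, not one of the existing scripts: the function names `build_embargo_query` and `embargo_list` are hypothetical, the sample identifier is invented, and the sketch assumes the query results have been exported as tab-separated rows of identifier and dateAvailable.

```shell
# Hypothetical helper: fill the two dates (yyyy-mm-dd) into the example query.
# $1 = date of the batch, $2 = today's date
build_embargo_query() {
  printf 'select id_2009, dateAvailable from bornDigital where datestamp > "%s" and dateAvailable > "%s"\n' "$1" "$2"
}

# Hypothetical helper: read tab-separated "identifier<TAB>dateAvailable" rows
# on stdin and print the last 4 digits of each identifier with its date.
embargo_list() {
  awk -F '\t' '{ print substr($1, length($1) - 3) "\t" $2 }'
}

# Example with an invented identifier and dates:
build_embargo_query 2016-09-01 2018-07-10
printf 'u0015_0000001_0001234\t2019-05-01\n' | embargo_list
```

The first call prints the query text to paste into the database client; the second shows how one result row is reduced to the 4-digit ID and its availability date for the DSpace admin.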

  1. create a datestamped directory in S:\Digital Projects\Other\IncomingDigital\ETD_supp and put the supplemental files from the content directory there.
  2. REVIEW supplemental files, supplemental files that have pages, to see whether additional file types need to be added to DSpace (https://ir.ua.edu/admin/format-registry)
  3. create a datestamped directory in S:\Public\DigitalServices\DSpace\embargoLift\MODS\ and put the MODS there
  4. create datestamped directories in
    1. /cifs-mount12/Public/DigitalServices/DSpace/embargoLift/DC/ and
    2. /cifs-mount12/Public/DigitalServices/DSpace/embargoLift/ETD/
  5. process the MODS per MODS_Transform_For_DSpace
  6. come back to libcontent /srv/scripts/etds/toDspace and rename "all" to all_datestamp, then recreate "all". Do the same with the uploads directory.
  7. run pullAndRename_new (its path is /srv/scripts/etds/toDspace/)
    1. cd into the all directory and run this command to remove duplicate PDFs (it is not clear why the duplicates appear, so this is a workaround): find . -maxdepth 2 -name 'file_2.pdf' -delete

then run: find . -maxdepth 2 -type f -name "*contents*" -exec sed -i 's/file_2\.pdf//g' {} + This removes the file_2.pdf references from the contents manifest. (Adding \n to the pattern will not remove the linefeed in the same move, because sed reads one line at a time without its trailing newline; deleting the whole line with sed -i '/file_2\.pdf/d' is an untested alternative that would do both steps at once.)

then run: find . -maxdepth 2 -type f -name "*contents*" -exec sed -i '/^$/d' {} + This removes the blank lines the previous sed command leaves behind.
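
Before running the cleanup on a real batch, it may help to try it on a scratch copy. The sketch below wraps the three commands above in a hypothetical function, `clean_duplicates`, and runs it against a throwaway directory. The item folder name `item_0001` and the manifest contents are invented for the demonstration, and `sed -i` is assumed to be GNU sed.

```shell
# Hypothetical wrapper around the three cleanup commands.
# $1 = directory that holds the item* folders
clean_duplicates() {
  dir="$1"
  # delete the duplicate PDFs
  find "$dir" -maxdepth 2 -name 'file_2.pdf' -delete
  # drop the file_2.pdf references from each contents manifest
  find "$dir" -maxdepth 2 -type f -name '*contents*' -exec sed -i 's/file_2\.pdf//g' {} +
  # remove the blank lines left behind by the previous step
  find "$dir" -maxdepth 2 -type f -name '*contents*' -exec sed -i '/^$/d' {} +
}

# Demonstration on a scratch directory with one invented item folder.
scratch="$(mktemp -d)"
mkdir -p "$scratch/item_0001"
touch "$scratch/item_0001/file_1.pdf" "$scratch/item_0001/file_2.pdf"
printf 'file_1.pdf\nfile_2.pdf\n' > "$scratch/item_0001/contents"

clean_duplicates "$scratch"
ls "$scratch/item_0001"
cat "$scratch/item_0001/contents"
```

After the run, the listing should no longer include file_2.pdf and the manifest should name only file_1.pdf. Remove the scratch directory once the behavior has been checked.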

  1. cd into the all directory and run this command: `zip batchname item*/*`
  2. optionally, move batchname.zip to the uploads directory, for example: `mv batchname.zip ../uploads/.`
  3. transfer the zipped file to S:\Public\DigitalServices\DSpace\uploads
  4. contact Andrew Parker to log in to DSpace and upload the file to the correct collection. Give him the embargo list, since he has to set the embargoes by hand. He will return the map file to you.
  5. copy the resulting map file to a datestamped file in /srv/scripts/etds/toDspace/mapfiles
  6. open that file in vi and put one entry per line with this search-and-replace command: :%s/ item/\ritem/g
  7. run correctPurls (in /srv/scripts/etds/toDspace) on each mapfile to correct the purl pointers to the new DSpace location
  8. check results
  9. copy the completed match file to the archive: cp u0015_0000001.dspace.match.txt /srv/archive/u0015/0000001/Documentation/.
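
The vi substitution in step 6 can also be done non-interactively. The sketch below is an assumption-laden illustration, not part of the existing scripts: the function name `split_mapfile` is hypothetical, and the sample entries (item folder names followed by handles) are invented to show the shape of a map file whose entries arrived on a single line.

```shell
# Hypothetical helper: put each map file entry on its own line by turning
# every " item" into a newline followed by "item" (same effect as the vi
# command :%s/ item/\ritem/g).
split_mapfile() {
  awk '{ gsub(/ item/, "\nitem"); print }'
}

# Example with invented entries:
printf 'item_0001 123456789/1 item_0002 123456789/2\n' | split_mapfile
```

To apply it to a real file, redirect the datestamped map file into the function and write the result to a new file, then compare the two before replacing the original.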



Other scripts:
  1. to pick up new ETDs and move them into the correct directory for processing: File:MoveContentBD.txt
  2. to preprocess ETDs, assigning persistent identifiers, as described on Electronic_Theses_and_Dissertations: File:ProcessEtds.txt
  3. to check for embargoes about to be raised: File:CheckEmbargo.txt
  4. to lift embargoes: File:LiftEmbargo.txt
  5. to archive the ETDs: File:RelocatingBD.txt