BATCH Metadata

From UA Libraries Digital Services Planning and Documentation
Revision as of 15:32, 12 November 2012 by Jjcolonnaromano (talk | contribs)

for a Word document of this information with images please see here.

Make list of item identifiers to include in the batch

  1. Use a DOS prompt to run “FOR” loops to generate reports of all .tif files you want to include in the batch. The “FOR” loop is written like this: For /r %i in (*.tif) do echo %~nxi >> output.txt
  2. Open the resulting output files in excel and remove any extensions, or, sub-item level naming with a “LEFT” text formula The “LEFT” formula it written like this: =LEFT(Text,Num_chars) And in our case we use 21 for the number of characters as that includes only up to item level identification.
  3. Drag this formula down the length of the list to populate all the cells with the row adjusted formula
  4. Then select the whole column, copy, and “paste special - values” back over the formulas
  5. While the column is still selected run “remove duplicates” from the “data” tab to reduce the list down to just unique item level identifiers.

Use list of item ids to parse out rows from the descriptive Metadata

  1. Copy the list of unique item level ids from the output .xlsx and temporarily paste them into a blank column in the descriptive metadata for the collection from which you are making the batch.
  2. Select the list of unique ids and the “filename” column on the main sheet of the descriptive metadata, and apply “conditional formatting” “duplicate values”
  3. The item level identifiers in the filename column will now be highlighted if there is a match with an id collected from the items in the scans folder being processed for upload

Make notation of batch in collection descriptive metadata and save

  1. Mark the “staff notes” or “batch column” in the highlighted rows with the number of the batch being processed, then delete the temporary list of unique item level ids and save the collection descriptive metadata.
  2. Sort the whole sheet by the values in the batch column then copy and paste those rows of metadata into a new spreadsheet.
  3. Copy and paste the header row from the descriptive metadata to the batch spreadsheet
  4. Close the collection descriptive metadata without saving the sort.

Save copy of parsed out rows to a separate batch xlsx document

  1. Save the batch metadata as an .xlsx excel spreadsheet file
  2. This file is also ready to be exported from excel as a tab delimited text file for use in making MODS.xml files and upload to the server.

Export tab delimited text document in utf-8 for use in creating MODS.xml

  1. From Excel, Save as “Text (Tab delimited) (*.txt)” and open the resulting file in Notepad++
  2. In Notepad++ select all the text in the file and from the ”Search” drop down open the “replace” window.
  3. First replace “quote tab” with tab then replace “tab quote” with tab. Doing these two replacements removes quotes placed around fields by excel The “tab” character is written as \t so “tab quote” is written like \t”
  4. Encode the batch text file in “UTF-8 without BOM”
  5. Save the. txt file as it is now ready to be used for making MODS.xml files for upload