Tracking Data

From UA Libraries Digital Services Planning and Documentation

Revision as of 07:54, 25 April 2013

This page refers to the current procedure for recording administrative metadata during the capture process. For older processes of recording tracking data, see TrackingFiles.


Procedure

Integrate the tracking data columns before capture begins

  1. Open the collection's metadata spreadsheet
  2. Open trackingcolumns_template.xlsx (found in S:\Digital Projects\Organization\Digital_Program_Logs\TrackingFiles\TrackingFiles_database_files)
  3. COPY tracking columns to the end of the metadata spreadsheet or enter these column headers manually:
    Number of Captures
    Captured with
    Captured by
    Date
    OCR? (1=yes or 0=no)
    DS Notes
  4. Save the metadata spreadsheet (and close trackingcolumns_template.xlsx)
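
As a rough illustration of step 3, the header handling can be sketched in Python. This is only a sketch and not part of the documented procedure: it assumes the spreadsheet's header row has already been read into a list of strings, and the helper name add_tracking_columns is hypothetical. The column names are the ones listed in step 3; "Title" in the usage lines is a stand-in for a collection's metadata headers.

```python
# Tracking column headers, as listed in step 3 of the procedure.
TRACKING_COLUMNS = [
    "Number of Captures",
    "Captured with",
    "Captured by",
    "Date",
    "OCR? (1=yes or 0=no)",
    "DS Notes",
]


def add_tracking_columns(header_row):
    """Return a copy of header_row with any missing tracking columns appended.

    Hypothetical helper: existing metadata headers keep their order, and
    tracking headers that are already present are not duplicated.
    """
    result = list(header_row)
    for name in TRACKING_COLUMNS:
        if name not in result:
            result.append(name)
    return result


if __name__ == "__main__":
    # "Title" is a placeholder metadata header used only for illustration.
    print(add_tracking_columns(["Filename", "Title", "Format"]))
```

Running the helper a second time on its own output leaves the header row unchanged, so it is safe to apply to a sheet that already has the tracking columns.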

Use the tracking data columns during capture

  • Enter data only for items that you actually capture, not for items you skip (see Skipped Items for how to track those)
  • Check your Number of Captures against the Format column in the metadata part of the spreadsheet; the metadata should be updated to match the number of captures you make (if there's a big discrepancy, see Procedural Anomalies)
  • Be sure to fill in the OCR? column so that Tesseract can do its job
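
The OCR? reminder above lends itself to a quick automated check. The following Python sketch is an assumption-laden illustration, not an existing tool: it assumes the sheet has been loaded as a list of dictionaries keyed by column header, and it treats a row as captured when its Number of Captures cell is non-empty. The names rows_missing_ocr and _cell are hypothetical.

```python
OCR_COLUMN = "OCR? (1=yes or 0=no)"
CAPTURES_COLUMN = "Number of Captures"


def _cell(row, column):
    """Return a cell as a stripped string, treating a missing cell as empty."""
    value = row.get(column)
    return "" if value is None else str(value).strip()


def rows_missing_ocr(rows):
    """Return 1-based positions of captured rows whose OCR? value is not 0 or 1.

    Assumption for this sketch: a row counts as captured when its
    Number of Captures cell is non-empty. Skipped items are ignored.
    """
    problems = []
    for position, row in enumerate(rows, start=1):
        captured = _cell(row, CAPTURES_COLUMN) != ""
        if captured and _cell(row, OCR_COLUMN) not in ("0", "1"):
            problems.append(position)
    return problems
```

Positions are counted from the first data row, so the header row is not included in the numbering.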

Remove and save the tracking data columns

  • Do this after the spreadsheet has been batched (if batching is necessary)
  • COPY the Filename column from the metadata spreadsheet to a new sheet
  • MOVE the Tracking Data columns to the new sheet
  • Make sure you also copy the column headers
    • If you don't, you'll get an error on upload: the script assumes the first row is the header, so it will report that the first item in Scans hasn't been entered into the log
  • Save the new sheet as a tab-delimited .txt file [collNum.batchNum.log.txt] in the collection's Admin folder
    • Example: u0001_2007001.25.log.txt
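
The steps above can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the script used on upload: it assumes the sheet is held in memory as a header list plus rectangular rows (every row as long as the header), and the function names split_tracking, to_tab_delimited, and log_filename are hypothetical. The header row is always carried into the log, since the upload script expects the first row to be the header.

```python
import csv
import io

# Tracking column headers from the "before capture" procedure.
TRACKING_COLUMNS = [
    "Number of Captures",
    "Captured with",
    "Captured by",
    "Date",
    "OCR? (1=yes or 0=no)",
    "DS Notes",
]


def log_filename(coll_num, batch_num):
    """Build the log file name in the collNum.batchNum.log.txt pattern."""
    return f"{coll_num}.{batch_num}.log.txt"


def split_tracking(header, rows):
    """Split a sheet into (metadata, log) tables, each with its header row first.

    Filename is COPIED into the log; tracking columns are MOVED out of the
    metadata. Assumes every row has as many cells as the header.
    """
    if "Filename" not in header:
        raise ValueError("the metadata sheet has no Filename column")
    moved = [i for i, name in enumerate(header) if name in TRACKING_COLUMNS]
    log_columns = [header.index("Filename")] + moved
    kept = [i for i in range(len(header)) if i not in moved]
    table = [header] + rows
    metadata = [[row[i] for i in kept] for row in table]
    log = [[row[i] for i in log_columns] for row in table]
    return metadata, log


def to_tab_delimited(table):
    """Serialize a table as tab-delimited text, one line per row."""
    buffer = io.StringIO()
    csv.writer(buffer, delimiter="\t", lineterminator="\n").writerows(table)
    return buffer.getvalue()
```

For example, log_filename("u0001_2007001", 25) produces u0001_2007001.25.log.txt, matching the example name above. The text returned by to_tab_delimited would then be written to that file in the collection's Admin folder.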

Rationale for Change

Problem

In December 2012, it was proposed that we rethink the tracking model then in use, for two reasons:

  1. Numerous fields in the TrackingFiles document were not being used, and they were overburdened with validation rules
  2. TrackingFiles documents were separate files, housed in a totally separate location from the rest of the collection's documentation

Advantages of New System

We then proposed the simplified and integrated tracking data procedure outlined above. This process has many advantages for our workflow:

  • There is no longer a need to keep two spreadsheets open during capture, which improves our data collection
  • With everything in one spreadsheet, it is easier to
    • check the metadata against the tracking data, which allows for more accurate recordkeeping
    • isolate batch tracking data (as the entire working metadata/tracking spreadsheet is being "batched")
  • The tracking data has been pared down to essential fields but is open to future reconfiguration if necessary

Effects and Potential Effects

The process has the following impacts on workflow:

  • No added step at preparation for capture, just a rewriting of the old one (adding template columns to existing sheet rather than saving template sheet as new document)
  • No added step at preparation for upload, just a reworking of the old (transferring and saving the data rather than simply saving a spreadsheet)
    • admittedly, the copying/removing procedure introduces risk, but it is a risk we are accustomed to taking: we have always done something similar to extract the OCR list from TrackingFiles
    • if the columns are not removed, Archivist Utility will point them out during the MODS-making process, and the problem can be corrected
  • The new model contains most of the same data that was effectively used in the old TrackingFiles documents, and can be named and archived in the same way
  • Done properly, the new model should not impact the Metadata Unit at all; done poorly, it will not break our current system (columns will be mapped to MODS in a way that doesn't trigger a fatal error)
  • Legacy collections will not require manual migration to the new system
    • There was nothing inherently broken about the old system, so it can continue where it needs to
    • The new form of tracking can be introduced as outlined above at any point in an ongoing collection, because we document collections in batches