Tracking Data
From UA Libraries Digital Services Planning and Documentation
Latest revision as of 11:04, 6 January 2014
This page describes the current procedure for recording administrative metadata during the capture process. For older processes of recording tracking data, see TrackingFiles.
Procedure
Integrate the tracking data columns before capture begins
1. Open the collection's metadata spreadsheet
2. Open trackingcolumns_template.xlsx (found in S:\Digital Projects\Organization\Digital_Program_Logs\TrackingFiles\TrackingFiles_database_files)
3. COPY tracking columns to the end of the metadata spreadsheet or enter these column headers manually, just as they are given here:
   - Number of Captures
   - Captured with
   - Captured by
   - Date
   - OCR? (1=yes or 0=no)
   - DS Notes
   - Metadata changed
4. Save the metadata spreadsheet (and close trackingcolumns_template.xlsx)
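For anyone who would rather script step 3 than copy and paste, here is a minimal sketch in Python. It is not part of the documented procedure. It assumes the metadata header row is available as a list of strings, and the helper name `add_tracking_columns` is hypothetical.

```python
# Tracking column headers, spelled as listed in step 3 above.
# Assumption: "Captured by" and "Date" are two separate columns.
TRACKING_HEADERS = [
    "Number of Captures",
    "Captured with",
    "Captured by",
    "Date",
    "OCR? (1=yes or 0=no)",
    "DS Notes",
    "Metadata changed",
]


def add_tracking_columns(header_row):
    """Return a new header row with any missing tracking headers appended.

    Hypothetical helper: headers that are already present stay where they
    are, so running it twice does not create duplicate columns.
    """
    missing = [h for h in TRACKING_HEADERS if h not in header_row]
    return list(header_row) + missing
```

Because the headers have to match exactly, keeping them in one list like this avoids retyping them for each collection.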
Use the tracking data columns during capture
- Include data for items that you actually capture, not things you skip (see Skipped Items for how to track those)
- If you begin but don't finish an item (even if you're planning to finish it soon), note that in the DS Notes column; this is especially important for collections that multiple people are working on, or that you're going to put on the back burner for some reason -- basically, make sure someone else picking up the collection will know exactly where you are, even if you're in the middle of an item
- Check your Number of Captures against the Format column in the metadata part of the spreadsheet -- the metadata should be made to match the number of captures you make (if there's a big discrepancy, see Procedural Anomalies)
- Be sure to fill in the OCR? column so that Tesseract can do its job
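The OCR? check in the last point can be sketched as a quick script. This is an illustration only, not an existing tool: it assumes the spreadsheet rows have been exported as dictionaries keyed by column header (for example via `csv.DictReader`), that a row counts as captured when Number of Captures is filled in, and that values are exported as plain text. The function name `find_bad_ocr_rows` is hypothetical.

```python
OCR_HEADER = "OCR? (1=yes or 0=no)"
CAPTURES_HEADER = "Number of Captures"


def _text(value):
    """Normalize a cell value to stripped text; None becomes an empty string."""
    return "" if value is None else str(value).strip()


def find_bad_ocr_rows(rows):
    """Return the Filename of each captured row whose OCR? value is not 1 or 0.

    Assumption: rows with a blank Number of Captures were not captured
    (skipped items are tracked elsewhere), so they are ignored here.
    """
    bad = []
    for row in rows:
        if not _text(row.get(CAPTURES_HEADER)):
            continue  # not captured, so no tracking data is expected
        if _text(row.get(OCR_HEADER)) not in ("0", "1"):
            bad.append(_text(row.get("Filename")))
    return bad
```

An empty result means every captured item has a usable OCR? value; any filenames returned are the rows to fix before batching.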
Remove and save the tracking data columns
- This happens after the spreadsheet has been batched, if necessary
- COPY Filename column from the metadata spreadsheet to a new sheet
- MOVE Tracking Data columns to new sheet
  - Make sure you also copy the column headers
  - If you don't, you'll get an error on upload: because the script assumes the first row is the header, it will report that your first item in Scans hasn't been entered into the log
- Save new sheet as tab delimited .txt file [collNum.batchNum.log.txt] in the collection's Admin folder
  - Example: u0001_2007001.25.log.txt
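The copy, move, and save steps above can be sketched in Python as follows. This is a hedged illustration, not the script used in production: it assumes the batched spreadsheet has been read into a list of dictionaries keyed by column header, that "Captured by" and "Date" are separate columns, and all function names are hypothetical.

```python
import csv
import io

TRACKING_HEADERS = [
    "Number of Captures",
    "Captured with",
    "Captured by",
    "Date",
    "OCR? (1=yes or 0=no)",
    "DS Notes",
    "Metadata changed",
]
# The log keeps Filename first, followed by the tracking columns.
LOG_HEADERS = ["Filename"] + TRACKING_HEADERS


def log_filename(coll_num, batch_num):
    """Build the log name in the form collNum.batchNum.log.txt."""
    return f"{coll_num}.{batch_num}.log.txt"


def split_tracking(rows):
    """Split rows into (metadata_rows, log_rows).

    Filename is COPIED into the log rows; the tracking columns are MOVED,
    so they no longer appear in the metadata rows.
    """
    metadata_rows, log_rows = [], []
    for row in rows:
        log_rows.append({h: row.get(h, "") for h in LOG_HEADERS})
        metadata_rows.append(
            {k: v for k, v in row.items() if k not in TRACKING_HEADERS}
        )
    return metadata_rows, log_rows


def log_text(log_rows):
    """Return the log as tab-delimited text with the header row first."""
    out = io.StringIO()
    writer = csv.DictWriter(
        out, fieldnames=LOG_HEADERS, delimiter="\t", lineterminator="\n"
    )
    writer.writeheader()  # the upload script expects headers in row one
    writer.writerows(log_rows)
    return out.getvalue()
```

The text returned by `log_text` would then be saved in the collection's Admin folder under the name returned by `log_filename`, for example `log_filename("u0001_2007001", 25)` for the example above. Writing the header row in code removes the risk of forgetting to copy it by hand.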
Rationale for Change
Problem
In December 2012, it was proposed that we rethink the tracking model then in use, for two reasons:
- Numerous fields in the TrackingFiles document were not being used, and they were overburdened with validation rules
- TrackingFiles documents were separate files, housed in a totally separate location from the rest of the collection's documentation
Advantages of New System
We then proposed the simplified and integrated tracking data procedure outlined above. This process has many advantages for our workflow:
- There is no longer a need to keep two spreadsheets open during capture, which improves our data collection
- With everything in one spreadsheet, it is easier to:
  - check the metadata against the tracking data, which allows for more accurate recordkeeping
  - isolate batch tracking data (as the entire working metadata/tracking spreadsheet is being "batched")
- The tracking data has been pared down to essential fields but is open to future reconfiguration if necessary
Effects and Potential Effects
The process has the following impacts on workflow:
- No added step at preparation for capture, just a rewriting of the old one (adding template columns to existing sheet rather than saving template sheet as new document)
- No added step at preparation for upload, just a reworking of the old (transferring and saving the data rather than simply saving a spreadsheet)
  - Admittedly, the copying/removing procedure introduces risk, but it is a risk we are accustomed to taking: we have always done something similar to extract the OCR list from TrackingFiles
  - If the columns are not removed, Archivist Utility will point them out during the MODS-making process, and the problem can be corrected
- The new model contains most of the same data that was effectively used in the old TrackingFiles documents, and can be named and archived in the same way
- Done properly, the new model should not impact the Metadata Unit at all; done poorly, it will not break our current system (columns will be mapped to MODS in a way that doesn't trigger a fatal error)
- Legacy collections will not require manual migration to the new system
  - There was nothing inherently broken about the old system, so it can continue where it needs to
- The new form of tracking can be instituted as outlined above at any point in ongoing collections because we document them in batches