For quality control measures to take during the capture process, see Checking Work By Object. This page refers to the more formal quality control process that happens in preparation for upload.
The Quality Control (QC) process is iterative and involves at least two staff members.
The person whose digitization is being checked performs the first round of QC: run the QC script, make repairs, and rerun the script until it reports no errors; THEN do a capture-by-capture review (images: a visual check; audio: a listening check). For both types of material, check metadata item numbers against the items themselves.
QC is time-consuming and very detail-oriented. Do *not* wait until you have hours of QC to do, because after the first hour you will make mistakes.
After the first round of QC, when you think you have it perfect, pass it to a coworker, who will go through the same process but will give YOU the feedback so that you make all the repairs. Continue this cycle until the results are perfect in the view of both team members.
See Scans_folder for instructions on relabeling the Scans folder to show its progress through QC.
First QC Check
Throughout this process, please follow the folder renaming process outlined in Scans_folder.
- The script we use in QC can be found on the Share drive here: S:\Digital Projects\Administrative\scripts\qc\filenamesAndDupes.pl
- This program checks that the filenames in a Scans folder and its corresponding Transcripts folder (if there is one) are correct and sequential for the collection number, and it checks for duplicate filenames. It writes a report to the “output” folder.
- If the report shows errors, please rename the files accordingly, and also check the metadata and tracking data to make sure the filenames in those spreadsheets match. When you are done with the report file this script created, please delete it.
- After you have corrected the errors, RUN THE SCRIPT AGAIN. If the script still finds errors, correct those and run it again. Keep running the script until it does not show any more errors.
- You should be able to run this program by double-clicking on it. You might need to associate the file type with Perl the first time you use it.
- Note: Mass digitized content and Scrapbook content have separate QC scripts. See Quality Control Scripts for additional information.
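To make the checks above concrete, here is a minimal sketch of the same kind of test: malformed names, wrong collection number, gaps in the sequence, and duplicates. It is not the filenamesAndDupes.pl script (which is written in Perl), and its rules may differ. The filename pattern, the function name `check_filenames`, and the example collection number are all hypothetical and chosen only for illustration.

```python
import re
from collections import Counter

# HYPOTHETICAL filename pattern: <collection number>_<sequence number>.<extension>,
# for example "u0003_0000001.tif". The real naming convention may differ.
NAME_PATTERN = re.compile(r"^(?P<coll>[A-Za-z0-9]+)_(?P<seq>\d+)\.(?P<ext>\w+)$")


def check_filenames(filenames, collection):
    """Return a list of human-readable problems found in the filenames.

    `filenames` is a flat list of names gathered from every object folder,
    so the same name can appear more than once.
    """
    problems = []

    # Duplicates: compare case-insensitively, as Windows does.
    counts = Counter(name.lower() for name in filenames)
    for name, n in sorted(counts.items()):
        if n > 1:
            problems.append(f"duplicate filename: {name} ({n} copies)")

    # Correct names: must match the pattern and carry the right collection number.
    sequence_numbers = set()
    for name in sorted(set(filenames)):
        match = NAME_PATTERN.match(name)
        if match is None:
            problems.append(f"malformed filename: {name}")
        elif match.group("coll") != collection:
            problems.append(f"wrong collection number: {name}")
        else:
            sequence_numbers.add(int(match.group("seq")))

    # Sequential names: every number from 1 to the highest one should be present.
    if sequence_numbers:
        expected = set(range(1, max(sequence_numbers) + 1))
        for missing in sorted(expected - sequence_numbers):
            problems.append(f"gap in sequence: no file numbered {missing}")

    return problems
```

An empty list corresponds to a clean report; anything else means fix the files and run the check again.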
Visual Spot Check
Once you've run the script and fixed any problems, please check the images.
- Open every object folder with Adobe Bridge.
- Look for the things listed in this Quality Control Checklist
Prep for upload
Follow the preparation for upload processes as outlined in Preparing_Collections_on_the_S_Drive_for_Online_Delivery_and_Storage. The person performing the recheck for your quality control work is to ALSO verify that these folders are set up properly, and that all necessary files are there and named correctly, before upload scripts are run.
Pass the collection to a coworker so he or she can go through the process again (as the Second QC Check).
Please follow the folder renaming process outlined in Scans_folder.
Second QC Check
The Recheck takes place in the Completed folder.
- Content should have been prepped for upload, according to the instructions in Preparing_Collections_on_the_S_Drive_for_Online_Delivery_and_Storage.
- Please follow the folder renaming process outlined in Scans_folder.
- For a reference guide to materials common to our pipeline and their optimal exposure values, see Material_exposure_reference_guide_for_Second_QC.
QC Script and Visual Spot Check
The second person reviews only every 20th image (for items with many images) or one image per item. Otherwise, the second person goes through exactly the same process as in the First QC Check.
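As a rough illustration of the sampling rule, the sketch below picks the images to review for a single item. It assumes the count starts at the first image (1st, 21st, 41st, and so on), so a short item automatically yields exactly one image; the page does not say which images to pick, and the function name `sample_for_recheck` is hypothetical.

```python
def sample_for_recheck(images, step=20):
    """Pick the images a second reviewer looks at for one item.

    ASSUMPTION: sampling starts at the first image in filename order, so an
    item with fewer than `step` images still gets one image reviewed.
    """
    # Slicing with a step keeps the 1st, (step+1)th, (2*step+1)th ... image.
    return sorted(images)[::step]
```

For an item with 45 images this selects three of them; for an item with two images it selects only the first.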
Checking Upload Preparation
The person doing second QC also checks to see that the collection has been properly prepared for upload.
- Verify that all necessary documentation is there, named correctly, and that the folder structure is correct as well.
- Check all .xml, .txt, and .xlsx files for proper filenames and extensions.
  - The order is always: coll#.batch#.m0#.[log.]extension
  - Log files and collection info xml files do not need m0#s.
  - Old-style metadata will also omit the m0#.
  - For collections without batching, omit the batch number.
- Open files (or preview in Windows Explorer) and look for anomalies, inconsistencies, misspellings, missing data, etc.
- Make sure the box and folder number are included and correct in the spreadsheet, as they are necessary for linking items into the EAD.
- Ideally, no additional fields such as "Notes" are in the Metadata file. "Notes" as such should be deleted or moved to the appropriate row in the log.txt file.
- Make sure the Format column in the Metadata file has not been converted to the Time format. If a tab-delimited metadata file is opened via Excel (especially by right-clicking the file and choosing to open in Excel), values in the Format column such as 3 p., 4 p., etc. will be interpreted as 3:00 PM, 4:00 PM, etc. If the file is then resaved as .txt, times will be saved instead of page counts. To avoid this, start Excel first and choose Open. Select your text file and, when Excel asks how to import the data, set the Format column to "Text".
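The Format column problem is easy to miss by eye, so a small check can help. The following is a minimal sketch, not part of the existing QC scripts: it reads a tab-delimited metadata file and reports rows whose Format value looks like a clock time. The function name `find_time_like_formats` and the "Filename" column in the example are hypothetical; only the "Format" column name comes from this page.

```python
import csv
import io
import re

# Matches values such as "3:00 PM" or "4:00:00 PM", which is what a page count
# like "3 p." turns into when Excel guesses that the column holds times.
TIME_LIKE = re.compile(r"^\d{1,2}:\d{2}(:\d{2})?(\s*[AP]M)?$", re.IGNORECASE)


def find_time_like_formats(metadata_file, column="Format"):
    """Return (row number, value) pairs whose Format value looks like a time."""
    reader = csv.DictReader(metadata_file, delimiter="\t")
    suspects = []
    # Row 1 is the header, so data rows are numbered from 2.
    for row_number, row in enumerate(reader, start=2):
        value = (row.get(column) or "").strip()
        if TIME_LIKE.match(value):
            suspects.append((row_number, value))
    return suspects


# Example with an in-memory file; the "Filename" column is made up.
sample = io.StringIO("Filename\tFormat\nitem1\t3 p.\nitem2\t4:00 PM\n")
print(find_time_like_formats(sample))  # [(3, '4:00 PM')]
```

For a real file, pass an open file object instead, for example `open(path, newline="", encoding="utf-8")`; the encoding is an assumption and may need adjusting. Any row reported here means the page count has to be restored from the source spreadsheet.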
If errors are found, see below. If everything is hunky-dory, please congratulate the person who performed the work on doing a GREAT job (and tell the supervisor! We love to hear this, and it will be reflected in their annual review!). Then rename the folder Scans_Store_# (with the appropriate number in place of the #) and tell the person responsible to go ahead with uploads.
Remember: Results of scripts must be clear, and all content completely approved, before you can relabel the directory as Scans_Store_# (appropriate number in place of the #) and clear the content for upload.
Errors found in Second QC check?
Obviously, if errors are found *after* text exports and MODS files are made, then the Excel file needs to be corrected and the text and MODS files remade.
In addition, if errors are found, the person performing the second QC check notifies the supervisor, as this indicates a need for retraining. The supervisor will ensure that the entire set of content is reviewed again, as in the initial QC check described above.
If the content is being passed to a supervisor, relabel the directory as Scans_Review_# (with the appropriate number in place of the #).
Commercial QC Software
A quick guide to settings and preferences for Adobe Bridge is available.