Quality Control

From UA Libraries Digital Services Planning and Documentation
Revision as of 10:46, 11 February 2013 by Kgmatheny (Talk | contribs)

For quality control measures to take during the capture process, see Checking Work By Object. This page refers to the more formal quality control process that happens in preparation for upload.

The Quality Control (QC) process is iterative, and involves at least 2 staff members.

The person whose digitization is being checked performs the first round of QC. First, run the QC script and make repairs, rerunning it until it reports no errors. Only THEN do a capture-by-capture review (images: a visual check; audio: a listening check). For both types of material, check the item numbers in the metadata against the items themselves.

Do *not* wait until you have hours of QC to do. QC is time-consuming and very detail-oriented, and after the first hour you will start making mistakes.

After the first round of QC, when you think you have it perfect, pass it to a coworker, who will go through the same process and report any problems back to YOU so that you make all the repairs. Continue this cycle until the results are perfect in the view of both team members.

See Scans_folder for instructions on relabeling the Scans folder to show its progress through QC.


QC Check

Throughout this process, please follow the folder renaming process outlined in Scans_folder.

QC Script

  • The script we use in QC is on the Share drive: S:\Digital Projects\Administrative\scripts\qc\filenamesAndDupes.pl
  • This program checks that filenames in a Scans folder and its corresponding Transcripts folder (if there is one) are correct and sequential for the collection number, and it also checks for duplicate filenames. It writes a report to the “output” folder.
  • If the report shows errors, rename the files accordingly, and also check the metadata and trackingFiles spreadsheets to make sure the filenames are correct there. When you are done with the report file this script created, please delete it.
  • After you have corrected the errors, RUN THE SCRIPT AGAIN. If the script still finds errors, correct those and run it again. Keep running the script until it does not show any more errors.
  • You should be able to run this program by double-clicking on it. You might need to associate the .pl file type with Perl the first time you use it.
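
As a rough illustration of the kinds of checks described above, the Python sketch below flags malformed names, duplicate names, and gaps in a numeric sequence. It is not filenamesAndDupes.pl itself, which is a Perl script whose internals are not documented on this page. The naming pattern (prefix, underscore, four-digit sequence, extension) and the function name check_filenames are assumptions made only for this example.

```python
import re
from collections import Counter

# Hypothetical naming pattern: <prefix>_<4-digit sequence>.<extension>
# The real convention enforced by the QC script is not described on this page.
NAME_PATTERN = re.compile(r"^(?P<prefix>.+)_(?P<seq>\d{4})\.(?P<ext>[A-Za-z0-9]+)$")


def check_filenames(filenames):
    """Return (malformed, duplicates, gaps) for names gathered from one or more folders."""
    # Compare case-insensitively so that A.tif and a.tif count as the same name.
    counts = Counter(name.lower() for name in filenames)
    duplicates = sorted(name for name, n in counts.items() if n > 1)

    malformed = []
    seqs_by_prefix = {}
    for name in filenames:
        match = NAME_PATTERN.match(name)
        if match is None:
            malformed.append(name)
            continue
        prefix = match.group("prefix").lower()
        seqs_by_prefix.setdefault(prefix, set()).add(int(match.group("seq")))

    # A gap is any sequence number between 1 and the highest one seen that is absent.
    gaps = {}
    for prefix, seqs in seqs_by_prefix.items():
        missing = sorted(set(range(1, max(seqs) + 1)) - seqs)
        if missing:
            gaps[prefix] = missing
    return malformed, duplicates, gaps
```

As with the real script, a check like this would be rerun after each round of repairs until all three results come back empty.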

Visual Spot Check

Once you've run the script and fixed any problems it reported, please check the images themselves by eye, capture by capture.

Prep for upload

Follow the preparation for upload processes as outlined in Preparing_Collections_on_the_S_Drive_for_Online_Delivery_and_Storage. The person performing the recheck for your quality control work is to ALSO verify that these folders are set up properly, and that all necessary files are there and named correctly, before upload scripts are run.


Pass the collection to a coworker so he or she can go through the process again (as the Second QC Check).

Please follow the folder renaming process outlined in Scans_folder.

Second QC Check

The Recheck takes place in the Completed folder, and the content should have been prepped for upload, according to the instructions in Preparing_Collections_on_the_S_Drive_for_Online_Delivery_and_Storage.

The second person reviews only every 20th image (for items with many images) or one image per item. Otherwise, the second person goes through the exact same steps listed above. If errors are found, the person performing the second QC check notifies the supervisor, as this indicates a need for retraining. The supervisor will ensure that the entire set of content is reviewed again, as in the initial QC check described above. Also verify that all necessary files are there and named correctly, and that the folder structure is correct.
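
The sampling rule above can be sketched in Python as follows. This is only an illustration: the function name images_to_review, the cutoff at which an item counts as "many-image" (20 here), and the choice of the first image for short items are all assumptions, since the page does not define them.

```python
def images_to_review(images, step=20, many_threshold=20):
    """Pick the images a second checker looks at for one item.

    images: the item's image file names, in capture order.
    step: review every step-th image of a many-image item.
    many_threshold: assumed minimum image count for a "many-image" item.
    """
    if not images:
        return []
    if len(images) < many_threshold:
        # Short item: one image per item (the first one, by assumption).
        return [images[0]]
    # Many-image item: the 20th, 40th, 60th, ... image.
    return images[step - 1::step]
```

For example, an item with 45 images would yield its 20th and 40th images, while an item with 5 images would yield only its first.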

Spot check all .xml, .txt, and .xlsx files

  • Check all such files for proper filenames and extensions.
  • Open all such files and look for anomalies, inconsistencies, misspellings, missing data, etc.
    • Ideally, no additional fields such as "Notes" are in the Metadata file. "Notes" as such should be deleted or moved to the appropriate row in the log.txt file.
      • Make sure the Format column in the Metadata file has not been converted to the Time format. If a tab-delimited metadata file is opened directly in Excel (especially by right-clicking the file and choosing to open it in Excel), Format values such as 3 p., 4 p., etc. will be interpreted as 3:00 PM, 4:00 PM, etc. If the file is then resaved as .txt, times will have been saved instead of page counts. The way around this is to start Excel first and choose Open. Select your text file, and when Excel asks how to import it, set the Format column to "Text".
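
The Format-column problem described above can also be caught by scanning the tab-delimited text export for values that look like times. The Python sketch below is one possible way to do that, not an existing project script. It assumes the file has a header row containing a column named Format, and the function name find_time_values is made up for the example.

```python
import csv
import io
import re

# Matches values such as "3:00 PM" or "4:00:00 AM", which is what a page count
# like "3 p." turns into after Excel has reinterpreted it as a time of day.
TIME_LIKE = re.compile(r"^\s*\d{1,2}:\d{2}(:\d{2})?\s*([AaPp][Mm])?\s*$")


def find_time_values(tsv_text, column="Format"):
    """Return (row_number, value) pairs where the column looks like a time."""
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    hits = []
    # Row 1 is the header, so the first data row is row 2.
    for row_number, row in enumerate(reader, start=2):
        value = row.get(column) or ""
        if TIME_LIKE.match(value):
            hits.append((row_number, value))
    return hits
```

Any rows it reports would need their Format values restored in the Excel file before the text export is remade.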

Please follow the folder renaming process outlined in Scans_folder. If the content is being passed to a supervisor, the directory would be relabeled as Scans_Review_#.

Remember: results of scripts must be clear, and all content completely approved, before you can relabel the directory as Scans_Store_# (appropriate number in place of the #) and clear the content for upload.

Obviously, if errors are found *after* text exports and MODS files are made, then the Excel file needs to be corrected and the text and MODS files remade.

If everything is hunky-dory, please congratulate the person who performed the work on doing a GREAT job (and tell the supervisor! We love to hear this, and it will be reflected in their annual review!). Then rename the folder Scans_Store_# (appropriate number in place of the #), and tell the person responsible to go ahead with uploads.

Additional Info

Commercial QC Software

See the quick guide to settings/preferences for Adobe Bridge.

Other QC Documents

File:Box and folder check procedure.docx
