Quality Control

From UA Libraries Digital Services Planning and Documentation
Revision as of 16:04, 7 November 2016 by Cjchatnik (talk | contribs)

For quality control measures to take during the capture process, see Checking Work By Object. This page refers to the more formal quality control process that happens in preparation for upload.


The Quality Control (QC) process is iterative and involves at least two staff members.

The person whose digitization is being checked performs the first round of QC: run the QC script, make repairs, and rerun it until there are no errors; THEN do a capture-by-capture review (images: an eyeball check; audio: an auditory check). For both types of material, check metadata item numbers against the items themselves.

Do *not* wait until you have hours of QC to do. QC is time-consuming and very detail-oriented, and after the first hour you will make mistakes.

After the first round of QC, when you think you have it perfect, pass it to a coworker, who will go through the same process but who will give YOU the feedback to make all the repairs. Continue this cycle until the results are perfect in the view of both team members.

See Scans_folder for instructions on relabeling the Scans folder to show its progress through QC.

First QC Check

Throughout this process, please follow the folder renaming process outlined in Scans_folder.

QC Script

  • The script we use in QC can be found on the Share drive here: S:\Digital Projects\Administrative\scripts\qc\filenamesAndDupes.pl
  • This program will check for correct and sequential filenames per the collection number in a Scans folder and its corresponding Transcripts folder (if there is one) as well as check for duplicate filenames. It will print out a report in the “output” folder.
  • If it shows errors, please rename the files accordingly and also check the metadata and tracking data to make sure the filename is OK in those spreadsheets. When you are done with the report file this script created, please delete it.
  • After you have corrected the errors, RUN THE SCRIPT AGAIN. If the script still finds errors, correct those and run it again. Keep running the script until it does not show any more errors.
  • The script also asks you if you want to make FITS -- leave this to second QC or after.
  • You should be able to run this program by double clicking on it. You might need to associate the file type with Perl the first time you use it.
  • Note: Mass digitized content and Scrapbook content have separate QC scripts. See Quality Control Scripts for additional information.
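
To make concrete what "correct and sequential filenames" and "duplicate filenames" mean, here is a minimal Python sketch of that kind of check. It is not the filenamesAndDupes.pl script and does not reproduce its rules. The naming pattern (collection number, underscore, zero-padded sequence number, extension), the function name check_filenames, and the wording of the messages are all assumptions made for illustration.

```python
import re
from collections import Counter

# Hypothetical naming pattern: <collection>_<zero-padded sequence>.<ext>
# The convention enforced by the real QC script may differ.
NAME_RE = re.compile(r"^(?P<coll>.+)_(?P<seq>\d+)\.(?P<ext>[A-Za-z0-9]+)$")


def check_filenames(names, collection):
    """Return a list of human-readable problems found in `names`."""
    problems = []

    # Duplicate check, case-insensitive because Windows treats
    # A.tif and a.tif as the same file.
    counts = Counter(name.lower() for name in names)
    for name, count in sorted(counts.items()):
        if count > 1:
            problems.append(f"duplicate: {name} appears {count} times")

    # Name check: every file must match the pattern and the collection.
    seqs = []
    for name in names:
        match = NAME_RE.match(name)
        if match is None or match.group("coll") != collection:
            problems.append(f"bad name: {name}")
        else:
            seqs.append(int(match.group("seq")))

    # Sequence check: numbers should run from 1 to the highest with no gaps.
    present = set(seqs)
    if present:
        expected = set(range(1, max(present) + 1))
        for missing in sorted(expected - present):
            problems.append(f"gap: sequence number {missing} is missing")

    return problems
```

An empty list corresponds to a clean report. Anything else means renaming files, checking the metadata and tracking spreadsheets, and running the check again.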

Visual Spot Check

Once you've run the script and fixed any problems, please do the capture-by-capture visual check of the images.

Prep for upload

Follow the preparation for upload processes as outlined in Preparing_Collections_on_the_S_Drive_for_Online_Delivery_and_Storage. The person performing the recheck for your quality control work is to ALSO verify that these folders are set up properly, and that all necessary files are there and named correctly, before upload scripts are run.


Pass the collection to a coworker so he or she can go through the process again (as the Second QC Check).

Please follow the folder renaming process outlined in Scans_folder.

Second QC Check

The Recheck takes place in the Completed folder.

QC Script and Visual Spot Check

The second person runs the QC script exactly as in the first check, but for the visual check reviews only every 20th image (for many-image items) or one image per item.
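
The sampling rule for the visual recheck can be sketched in a few lines of Python. This is an illustration only: the function name recheck_sample is hypothetical, and picking the first image for items with fewer than 20 images is an assumption, since any single image would satisfy "one image per item".

```python
def recheck_sample(images, step=20):
    """Pick the images a second reviewer looks at for one item."""
    if not images:
        return []
    # The 20th, 40th, 60th, ... image of a many-image item.
    picked = images[step - 1::step]
    # Items with fewer than `step` images still get one image reviewed
    # (the first one here, which is an arbitrary choice).
    return picked if picked else [images[0]]
```

For an item with 45 images this returns the 20th and 40th; for a 3-image item it returns only the first.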

Checking Upload Preparation

The person doing second QC also checks to see that the collection has been properly prepared for upload.

  • Verify that all necessary documentation is there and named correctly, and that the folder structure is correct.
  • Check all .xml, .txt, and .xlsx files for proper filenames and extensions.
    • order is always like this: coll#.batch#.m0#.[log.]extension
    • log files and collection info xml files do not need m0#s
    • old-style metadata will also omit the m0#
    • for collections without batching, omit the batch number
  • Open files (or preview in Windows Explorer) and look for anomalies and inconsistencies, misspellings, and missing data, etc.
    • Metadata
      • Make sure the box and folder number are included and correct in the spreadsheet, as they are necessary for linking items into the EAD.
      • Ideally, no additional fields such as "Notes" are in the Metadata file. "Notes" as such should be deleted or moved to the appropriate row in the log.txt file.
      • Make sure the Format column in the Metadata file has not been converted to Excel's Time format. If a tab-delimited metadata file is opened directly in Excel (especially by right-clicking the file and choosing to open it in Excel), Format values such as 3 p., 4 p., etc. will be interpreted as 3:00 PM, 4:00 PM, etc. If the file is then resaved as .txt, times will be saved instead of page counts. The way around this is to start Excel first and choose Open; then open your text file and, when Excel asks how to import it, set the Format column to "Text".
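
A damaged Format column can also be spotted without opening the file in Excel at all. The Python sketch below scans tab-delimited text for Format values that look like clock times. It assumes the column header is literally "Format" and that a damaged value looks like "3:00 PM" or, in 24-hour form, "15:00:00"; both the function name find_time_values and those assumptions are illustrative, not part of the existing QC scripts.

```python
import csv
import io
import re

# Matches values such as "3:00 PM" or "15:00:00". A page count such as
# "3 p." does not match because it has no colon.
TIME_LIKE = re.compile(r"^\s*\d{1,2}:\d{2}(:\d{2})?\s*([AaPp]\.?[Mm]\.?)?\s*$")


def find_time_values(tsv_text, column="Format"):
    """Return (row_number, value) pairs whose `column` looks like a clock time."""
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    hits = []
    # Row 1 is the header, so the first data row is row 2.
    for row_number, row in enumerate(reader, start=2):
        value = row.get(column) or ""
        if TIME_LIKE.match(value):
            hits.append((row_number, value))
    return hits
```

To use it, read the exported .txt file into a string and pass it in; the file's encoding is not stated on this page, so check it before relying on the result. Any row it reports should have its page count restored from the original spreadsheet.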

If errors are found, see below. If everything is hunky-dory, please congratulate the person who performed the work on doing a GREAT job (and tell the supervisor! We love to hear this, and it will be reflected on their annual reviews!) and rename the folder Scans_Store_# (appropriate number in place of the #), and tell the person responsible to go ahead with uploads.

Remember: Results of scripts must be clear, and all content completely approved, before you can relabel the directory as Scans_Store_# (appropriate number in place of the #) and clear the content for upload.

Errors found in Second QC check?

Obviously, if errors are found *after* text exports and MODS files are made, then the Excel file needs to be corrected and the text and MODS files remade.

In addition, if errors are found, the person performing the second QC check notifies the supervisor, as this indicates a need for retraining. The supervisor will ensure the entire set of content is reviewed as if in an initial QC check described above.

If the content is being passed to a supervisor, relabel the directory as Scans_Review_#.

Using Bridge to fix tifs

Adobe Bridge is good at making and applying adjustment profiles to large numbers of files; however, saving your changes onto the file without versioning it is less than straightforward. Here is the simplified approach: Apply bridge adjustments to tif files


  • Check that the content of the audio matches the Title.
  • Check that the time codes in the metadata match the starting and stopping times of the audio file.
  • Also check that the duration of the file noted in the Format column matches the duration given by the time codes. (You'll have to do some math, but that's okay; it's good for your brain.)
  • Listen for general quality of the audio file. A lot of extra noise or distortion is from the physical condition of the reel so there might not be much you can do about it.
  • That's it!
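
The duration arithmetic in the audio checks can be sketched as follows. The time code format is an assumption: this page does not say how time codes are written, so the example takes them to be "H:MM:SS" or "MM:SS". The function names and the one-second tolerance are likewise hypothetical.

```python
def to_seconds(timecode):
    """Convert "H:MM:SS" or "MM:SS" to a number of seconds."""
    seconds = 0
    for part in timecode.strip().split(":"):
        seconds = seconds * 60 + int(part)
    return seconds


def duration_matches(start, stop, stated_duration, tolerance=1):
    """True if stop - start agrees with the stated duration within `tolerance` seconds."""
    elapsed = to_seconds(stop) - to_seconds(start)
    return abs(elapsed - to_seconds(stated_duration)) <= tolerance
```

For example, a recording that starts at 00:01:30 and stops at 00:47:15 runs 2745 seconds, which agrees with a stated duration of 45:45.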

Additional Info

Commercial QC Software

Click here to access a quick guide to settings/preferences for Adobe Bridge.

Other QC Documents

File:Box and folder check procedure.docx