For Quality Control

From UA Libraries Digital Services Planning and Documentation

Revision as of 15:22, 20 October 2010

Quality Control checks happen in multiple parts of the work flow pipeline.

On Windows Share Drive, During Digitization

File:Filenames.txt is a Windows Perl script for locating badly formed and mislocated file names. This version of the script is also known as 'filenamesAndTranscripts' because it also checks transcript directories, if they exist.

File:Numfiles.txt is a Windows Perl script that looks through the scans directories in a selected collection directory, counts all files with the chosen extension, and outputs a tab-delimited text file listing each item number followed by its number of files.
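The counting step described above can be sketched as follows. This is a minimal Python illustration, not the Perl script itself. It assumes that each immediate subdirectory of the collection directory is one item and that the subdirectory name serves as the item number. The function names `count_files_by_item` and `write_report` are hypothetical.

```python
import os


def count_files_by_item(collection_dir, extension):
    """Return {item directory name: number of files with the given extension}.

    Assumption: every immediate subdirectory of collection_dir is one item.
    Files are counted recursively and the extension match ignores case.
    """
    ext = extension.lower()
    if not ext.startswith("."):
        ext = "." + ext
    counts = {}
    for entry in sorted(os.listdir(collection_dir)):
        item_dir = os.path.join(collection_dir, entry)
        if not os.path.isdir(item_dir):
            continue  # skip loose files sitting at the collection level
        total = 0
        for _root, _dirs, files in os.walk(item_dir):
            total += sum(1 for name in files if name.lower().endswith(ext))
        counts[entry] = total
    return counts


def write_report(counts, report_path):
    """Write one 'item<TAB>count' line per item, sorted by item name."""
    with open(report_path, "w", encoding="utf-8") as out:
        for item, number in sorted(counts.items()):
            out.write(f"{item}\t{number}\n")
```

Calling `write_report(count_files_by_item(some_dir, "tif"), "counts.txt")` would produce the tab-delimited listing, which can then be compared against the expected page counts for each item.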

File:BoxFolderCheck.txt is a Windows or Macintosh Perl script for digital content in which the item number section of the identifier includes the box and folder number where the item is to be linked into the EAD finding aid, and in which the digital files are located in directories named for the box, with subdirectories named for the folder. It verifies that files are named appropriately and are located in directories that reflect their file names.
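The box-and-folder check can be illustrated with a short Python sketch. The identifier layout used here is an assumption made only for the example: `<collection>_<BBBB><FFF>_<SSSS>.<ext>`, with a four-digit box and a three-digit folder, stored under directories such as `Box_0102/Folder_003/`. The real identifiers and directory names may differ, and `check_path` is a hypothetical name.

```python
import re
from pathlib import PurePosixPath

# Hypothetical layout: <collection>_<BBBB><FFF>_<SSSS>.<ext>
FILENAME_RE = re.compile(
    r"^(?P<collection>\w+?)_(?P<box>\d{4})(?P<folder>\d{3})_(?P<seq>\d{4})"
    r"\.(?P<ext>[A-Za-z0-9]+)$"
)


def check_path(path):
    """Return a list of problems for one file path (an empty list means OK)."""
    p = PurePosixPath(path)
    match = FILENAME_RE.match(p.name)
    if match is None:
        return [f"badly formed file name: {p.name}"]
    if len(p.parts) < 3:
        return [f"no box/folder directories above: {p.name}"]
    problems = []
    # The grandparent directory should carry the box number and the
    # parent directory the folder number; both are compared as integers.
    for label, dir_name in (("box", p.parent.parent.name),
                            ("folder", p.parent.name)):
        digits = re.search(r"\d+", dir_name)
        if digits is None or int(digits.group()) != int(match.group(label)):
            problems.append(
                f"{label} directory '{dir_name}' does not match "
                f"{label} {match.group(label)} in {p.name}"
            )
    return problems
```

Under these assumptions, `check_path("Box_0102/Folder_003/u0003_0102003_0001.tif")` returns an empty list, while the same file placed under `Folder_004` yields one folder mismatch.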


On Linux Server, for Web Delivery

Once content is uploaded to the Linux server for archival storage, we run File:TestDeposits.txt to verify that all the archival filenames are correct and in the right directory, and that no sequences are missing. For the Cabaniss content, the equivalent script is File:TestNums.txt.
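The "no sequences are missing" part of that check can be sketched in Python as below. This is not the Perl script itself. It assumes file names of the form `<item>_<sequence>.<ext>` and that each item's sequence numbers should run from 1 up to the highest number present. Both the naming pattern and the function name `find_missing_sequences` are hypothetical.

```python
import re
from collections import defaultdict

# Assumed pattern: everything before the last underscore is the item,
# the digits after it are the sequence number.
SEQ_RE = re.compile(r"^(?P<item>.+)_(?P<seq>\d+)\.[A-Za-z0-9]+$")


def find_missing_sequences(filenames):
    """Return {item: sorted list of missing sequence numbers}.

    Items with no gaps are left out of the result. Names that do not
    match the assumed pattern are ignored here; a fuller check would
    report them as badly formed.
    """
    found = defaultdict(set)
    for name in filenames:
        match = SEQ_RE.match(name)
        if match:
            found[match.group("item")].add(int(match.group("seq")))
    missing = {}
    for item, numbers in found.items():
        gaps = sorted(set(range(1, max(numbers) + 1)) - numbers)
        if gaps:
            missing[item] = gaps
    return missing
```

A gap at the end of a sequence cannot be detected this way, because the highest number present is taken as the expected total; catching that case needs an independent page count.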

To test content online in Acumen, and locate items that have no derivatives or no MODS record, use this Linux Perl script: File:FindMissing.txt

For Cabaniss, this script checks the content in the web directory (in Acumen) against what is in the storage archive, to make sure nothing is missing: File:FindMissingFile.txt

To create OCR files for items listed in the *ocrList.txt files located in the /srv/deposits/ocrMe directory, and place those OCR files in the correct web location, use File:OcrSelected.txt

On Linux Server, the Storage Archive

To check the MD5 checksums of content stored prior to each full-tape backup, use File:Dirs.txt (as described in Watching Our Backs).
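A checksum comparison of this kind can be sketched in Python as follows. The sketch does not reproduce the Perl script or its manifest format. It assumes the expected checksums are already available as a mapping from relative path to MD5 hex digest. The names `md5_of_file` and `verify_manifest` are hypothetical.

```python
import hashlib
import os


def md5_of_file(path, chunk_size=1024 * 1024):
    """Compute a file's MD5 hex digest, reading in chunks to limit memory use."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_manifest(root, manifest):
    """Compare stored files against expected checksums.

    manifest maps a path relative to root to its expected MD5 hex digest.
    Returns {relative path: "missing" or "changed"} for every mismatch;
    an empty dict means every listed file is present and unchanged.
    """
    problems = {}
    for relative_path, expected in manifest.items():
        full_path = os.path.join(root, relative_path)
        if not os.path.isfile(full_path):
            problems[relative_path] = "missing"
        elif md5_of_file(full_path) != expected.lower():
            problems[relative_path] = "changed"
    return problems
```

Running such a check before a backup means corruption is noticed while an earlier good copy may still exist, rather than after the damaged file has been written to tape.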
