OCR List

From UA Libraries Digital Services Planning and Documentation

Revision as of 10:14, 23 October 2012

How to create an OCR List

  1. Open the collection's TrackingFiles log file
  2. For the files you are uploading, copy those rows to a new spreadsheet
  3. In the new spreadsheet, delete all columns EXCEPT: Filename and OCR?
    • This results in a two-column list with only one filename per line, which is what the script is looking for
    • OCR? should be filled in with 1 or 0 for each item (1 = yes to OCR, meaning more than half of the item is typewritten; 0 = no to OCR)
    • If the OCR? column isn't filled in, take the time to look over the items in Bridge and fill it in
  4. Save as a tab-delimited file called [collection number].ocrList.txt
    • Example: u0003_0001577.ocrList.txt
  5. Put the file in the collection's Admin folder (in the Digital_Coll_Completed directory, of course!)
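
For anyone who would rather script steps 2 through 4 than do them by hand in a spreadsheet, the following is a minimal Python sketch, not the procedure used by Digital Services. It assumes the relevant TrackingFiles rows are already available as dictionaries keyed by column header (for example, read from a CSV export), that the columns are named exactly Filename and OCR?, and that the script reading the list does not want a header row; none of that is confirmed by the steps above. The function names and the item filenames in the sample are hypothetical.

```python
import csv
import io


def build_ocr_list(rows, filename_col="Filename", ocr_col="OCR?"):
    """Return tab-delimited text: one 'filename<TAB>flag' line per row.

    All other columns are dropped. No header row is written; that is an
    assumption about what the downstream script expects.
    """
    out = io.StringIO()
    writer = csv.writer(out, delimiter="\t", lineterminator="\n")
    for row in rows:
        name = row[filename_col].strip()
        flag = row[ocr_col].strip()
        # OCR? must be filled in with 1 (yes) or 0 (no) for every item.
        if flag not in ("0", "1"):
            raise ValueError(f"{name}: OCR? must be 1 or 0, got {flag!r}")
        writer.writerow([name, flag])
    return out.getvalue()


def ocr_list_name(collection_number):
    """Build the file name described in step 4."""
    return f"{collection_number}.ocrList.txt"


# Hypothetical sample rows; real rows come from the TrackingFiles log.
sample_rows = [
    {"Filename": "u0003_0001577_0000001.tif", "Box": "12", "OCR?": "1"},
    {"Filename": "u0003_0001577_0000002.tif", "Box": "12", "OCR?": "0"},
]

if __name__ == "__main__":
    target = ocr_list_name("u0003_0001577")
    with open(target, "w", newline="", encoding="utf-8") as handle:
        handle.write(build_ocr_list(sample_rows))
```

Running the file writes u0003_0001577.ocrList.txt to the current directory; it still has to be moved into the collection's Admin folder as in step 5. A row with a blank or unexpected OCR? value raises an error rather than being written, so gaps are caught before the list reaches the script.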