The following policies have been adopted to deal with procedural anomalies: irregularities encountered during the digitization process, either with the physical state of the materials or their intellectual organization and metadata.
In general, we are operating under these assumptions:
- Digitized items are surrogates for analog items, so the experience of viewing an item online should approximate, as closely as possible, the experience of viewing the object in person.
- Digitization can be a means of conserving archival items, but only if the digitization process is not itself destructive.
- 1 Condition of items
- 1.1 Metal paperclips, pins, and other removable fasteners
- 1.2 Staples, brads, and other hard-to-remove fasteners
- 1.3 Pages stuck together
- 1.4 Pages creased
- 1.5 Things taped or glued to an item
- 1.6 Manuscript item in plastic sleeve
- 1.7 Interleaved material found in bound item
- 1.8 Very large item
- 1.9 Very fragile paper or binding
- 1.10 More than one right side up
- 1.11 Translucent materials
- 2 Discrepancies with Metadata
- 2.1 Item not in box/folder
- 2.2 Item not in metadata: Manuscripts/auto-numbered by us
- 2.3 Item not in metadata: Photos/pre-numbered
- 2.4 Item has fewer/more pages than metadata indicates
- 2.5 The item has blank pages you didn't scan
- 2.6 Multiple items seem to constitute one object
- 2.7 Letter out of order: logical flow of letter does not match physical sequence of pages
Condition of items
Metal paperclips, pins, and other removable fasteners
- If the fastener can be safely removed, do so carefully, then capture
- If it appears that something was clipped/pinned into place to obscure the original text, capture both the clipped/pinned piece and the text underneath. Think about it: if you were holding the object in your hands, you'd be able to remove the clip/pin and read that obscured text if you wanted to
- When returning to box/folder, throw away old metal fasteners and replace with plastic clips
Staples, brads, and other hard-to-remove fasteners
- DO NOT REMOVE ANYTHING THAT MIGHT DAMAGE THE ITEM: when in doubt, consult an archivist
- Capture the item carefully, with fasteners in place
- It should be possible to work around a corner or side fastener; laying open or folding back the paper is okay, just DON'T CREASE it
Pages stuck together
- DO NOT SEPARATE THEM
Pages creased
- Opening a folded section of paper is okay
- Opening a creased section of paper is not okay
- What makes something creased? The severity of the folded edge or the presence of multiple compounded folds, coupled with the fragility of old paper. If pulling back the fold seems like it will damage the paper, it is creased, in our vernacular.
- DO NOT UN-CREASE ANYTHING. Capture the page anyway, unless you feel the area obscured by the crease is so great or important that the scan would be worthless. In that case, consult Jeremiah or Jody about whether to proceed.
- For example, if it's as bad as this... [example image not available] ...don't bother with capture.
Things taped or glued to an item
- DO NOT REMOVE ANYTHING THAT MIGHT DAMAGE THE ITEM
- Scan the item as-is
- If the taped/glued object is hanging (not completely stuck down), it might be possible to push it back so that it blocks less of or less important parts of the item
- If glue/tape works as a binding (side, corner), treat it as a hard-to-remove fastener (see above)
Manuscript item in plastic sleeve
- If a good image can be captured through the sleeve, leave it on and shoot through it
- If a good image can't be captured through the sleeve...
- and the sleeve can't be removed or can't be removed without damaging the item, consult Jeremiah or a coworker about whether a capture is worth taking/keeping
- and the sleeve can be safely removed, it's okay to do so, but ONLY if you can't get a good capture through the sleeve: the archivists put that sleeve on for a reason, after all
Interleaved material found in bound item
- If material is found between pages or clipped to pages
- scan it by itself against the background and
- number it in sequence.
- In the metadata for the bound item, include this in the Description field: "Interleaved materials were found in this item and scanned in place."
- For example, if a check stub is found after page 0057, it should be numbered 0058, and the facing page will be 0059.
- If this material is itself a bound item (such as a pamphlet or booklet)
- it should be given its own item number and
- added as a new line in the metadata; in the Description field, write: "This item was found within another item, Parent Item #, Title of Parent Item."
- For example, if an opera program is found inserted between pages 0042 and 0043 of item 0000034, it should be scanned separately and given the number 0000035, paginated starting at 0001, like any other item.
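The numbering rule for loose interleaved material can be sketched in Python. This is a minimal illustration, not project tooling: the function name `page_stems`, the list of capture labels, and the item number are hypothetical, and the item_page filename pattern is assumed from the 0000123_0001 example used elsewhere in these policies.

```python
def page_stems(item_number, captures):
    """Number every capture in physical order, interleaved material included.

    `captures` lists what was shot, in the order it sits in the bound item.
    Returns (filename stem, capture label) pairs.
    """
    return [
        (f"{item_number:07d}_{page:04d}", label)
        for page, label in enumerate(captures, start=1)
    ]


# Hypothetical bound item 34: a check stub sits after page 0057.
captures = [f"page {n}" for n in range(1, 58)] + ["check stub", "facing page"]
stems = page_stems(34, captures)
print(stems[57])  # ('0000034_0058', 'check stub')
print(stems[58])  # ('0000034_0059', 'facing page')
```

Because the stub is simply one more entry in the physical sequence, every later page shifts up by one without any special handling.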
Very large item
- Take test images to evaluate lighting and focus at edges of items; if okay, capture
- If too large for capture bed, consult with Jeremiah about how best to divide item over multiple images
Very fragile paper or binding
- If item can be scanned with extreme care, do it
- Handle with gloves
- Don't put under glass or otherwise compress
- Don't use flatbed scanner
- When in doubt, get a second opinion from Jeremiah or consult an archivist
- Keep these guidelines for handling material in mind
More than one right side up
- Use the most common orientation found in an item for all captures taken from a single item.
- If a book has plates that are rotated sideways, do not rotate those few pages to make them read right side up. Leave them sideways so they match the orientation of the rest of the pages in the book.
- If a letter has an address written on the back page that is turned sideways, leave this capture sideways and do not reorient.
- If a letter has a separate envelope, please orient the envelope so its text is right side up
- We want to maintain the consistency of an item's overall presentation when all of its pages are viewed side by side.
Translucent materials
- Place a sheet of tan card or folder stock paper underneath the translucent material being scanned.
- White paper can also be used if the material being digitized is a neutral gray
- In order to read text on a sheet of vellum that is bound into a book, you must mask the text on any subsequent pages
- Black tends to be a bad choice because it lowers the contrast between the text and the paper of the page
- In general, white or some other color of paper is distracting because the hue of the vellum does not blend well with it.
Discrepancies with Metadata
Item not in box/folder
Make a note in the TrackingFiles. Obviously, you can't capture what's not there.
Item not in metadata: Manuscripts/auto-numbered by us
- First, do a "find" search in the Title column of the spreadsheet, looking for dates or names that appear in the item; this is to make sure that it's not already in the metadata and simply physically filed in the wrong place
- If you don't find any metadata for that item, capture it anyway, assigning it a number at the end of the file naming sequence for the collection
- check the metadata to confirm what that number would be
- check TrackingFiles as well as the web directory to make sure it hasn't been used
- Create a line of metadata for that item at the end of the spreadsheet
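The search-then-assign steps above can be sketched as follows. This is only an illustration under stated assumptions: the metadata spreadsheet is modelled as a list of dicts with hypothetical "Number" and "Title" keys, the TrackingFiles and web directory are reduced to plain lists of numbers already in use, and both function names and all sample data are invented for the example.

```python
def find_in_titles(rows, terms):
    """Return metadata rows whose Title contains any search term (case-insensitive)."""
    lowered = [term.lower() for term in terms]
    return [row for row in rows if any(t in row["Title"].lower() for t in lowered)]


def next_item_number(metadata_numbers, tracking_numbers, web_numbers):
    """Return the first number after every number already used in any source."""
    used = set(metadata_numbers) | set(tracking_numbers) | set(web_numbers)
    return max(used, default=0) + 1


# Hypothetical metadata rows; the unlisted item mentions "Smith" and "1862".
rows = [
    {"Number": 1, "Title": "Letter to A. Jones, 1861"},
    {"Number": 2, "Title": "Receipt, 1863"},
]
if not find_in_titles(rows, ["Smith", "1862"]):
    # Number 3 appears only in the tracking list, so the next free number is 4.
    number = next_item_number([r["Number"] for r in rows], [1, 2, 3], [1, 2])
    rows.append({"Number": number, "Title": "Letter to B. Smith, 1862"})
    print(f"{number:07d}")  # 0000004
```

Taking the maximum across all three sources, rather than the spreadsheet alone, mirrors the instruction to confirm the number has not already been used elsewhere.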
Item not in metadata: Photos/pre-numbered
- This item has been culled from the spreadsheet by April or someone else before it got to us. Check to see if we're digitizing skipped items or not.
- If you do capture the item, insert the missing number into the metadata spreadsheet and do your best to extrapolate metadata from nearby lines.
Item has fewer/more pages than metadata indicates
- Fewer pages than metadata
- Check to see if anything is loose in the folder.
- If you can't determine where the missing page might be, scan what you do have, amend the page count in the Format column, and add a note to the Description column: "Item had pages missing."
- More pages than metadata
- Check object against metadata to make sure the page actually belongs to that item.
- If it belongs, correct the total in the spreadsheet's Format column.
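Once the physical checks above are done, the metadata correction itself is mechanical. The sketch below assumes a metadata row can be modelled as a dict with hypothetical "Pages" (the count recorded in the Format column) and "Description" keys; the function name is likewise invented for illustration.

```python
def reconcile_page_count(row, pages_scanned):
    """Return a copy of a metadata row with its page count corrected.

    A note is added only when fewer pages were scanned than the metadata
    recorded; extra pages that belong to the item just raise the count.
    """
    updated = dict(row)
    if pages_scanned < row["Pages"]:
        note = "Item had pages missing."
        updated["Description"] = f'{row["Description"]} {note}'.strip()
    updated["Pages"] = pages_scanned
    return updated


# Hypothetical rows: one item is short a page, the other has an extra page.
print(reconcile_page_count({"Pages": 4, "Description": ""}, 3))
print(reconcile_page_count({"Pages": 2, "Description": "Signed."}, 3))
```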
The item has blank pages you didn't scan
Include a note in the metadata spreadsheet's Description field: "Item had blank pages that were not scanned."
Multiple items seem to constitute one object
- With manuscripts, trust the instinct of the archivist and assume there's a reason these items are not combined in the metadata and clipped together.
- With photos, this happens because they are always numbered individually during processing, and our filenames are based on those image numbers. For example, if a letter is found in a photo collection, each page is likely numbered as a separate item, which from our perspective is a mistake. If this happens:
- assign the item number of the first page to the whole object, then amend the metadata line for the first page to reflect that this is a whole, not part of one -- in the Title, Description, and Format columns; and
- in TrackingFiles, explain why the remaining pre-assigned filenames have not been captured.
- Example: Photos 0000123, 0000124, and 0000125 are pictures of a single document, The Bradley Contract, and would not make sense unless seen in context together. Assign 0000123 as the item number and treat these images like pages in a multi-page document: place a folder for that item in the Scans folder, with pages 0000123_0001, 0000123_0002, and 0000123_0003 inside it. In the metadata, relabel metadata line one (Bradley Contract page 1) to The Bradley Contract. In the TrackingFiles beside 0000124 and 0000125, note that the images were part of a single document and scanned as 0000123.
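The Bradley Contract example can be expressed as a short sketch. The function name, the return shape, and the exact wording of the tracking note are assumptions made for illustration; only the numbering pattern comes from the example above.

```python
def merge_photo_numbers(photo_numbers, title):
    """Fold several pre-numbered photos into one multi-page item.

    Returns the item number (also the folder name), the page filename stems,
    and a note for each pre-assigned number that will not be captured
    on its own.
    """
    first, *rest = sorted(photo_numbers)
    item = f"{first:07d}"
    pages = [f"{item}_{page:04d}" for page in range(1, len(photo_numbers) + 1)]
    note = f"Part of a single document ({title}); scanned as {item}."
    tracking_notes = {f"{number:07d}": note for number in rest}
    return item, pages, tracking_notes


item, pages, tracking_notes = merge_photo_numbers([123, 124, 125], "The Bradley Contract")
print(item)                    # 0000123
print(pages)                   # ['0000123_0001', '0000123_0002', '0000123_0003']
print(sorted(tracking_notes))  # ['0000124', '0000125']
```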
Letter out of order: logical flow of letter does not match physical sequence of pages
- Normally, when a letter is written on folded paper, we're easily able to reproduce the text in order by following it around the physical object. For example, some old letters that were written on folded paper (picture something like a greeting card) when unfolded begin on the right side of the page, move to the back, then wrap around to the left side of the front page: 4 1 | 2 3. The digital item pagination will reflect this order: 0004 0001 | 0002 0003.
- If the flow of a letter's text does not match up in any straightforward way to its physical layout on the page, this is the rule of thumb: representing the object is more important than interpreting its content. If possible, begin with the first page, then scan the other pages in order. The image online, then, will approximate the researcher's experience of looking at the pages and trying to figure out how to follow the text.
- A letter might start on front right, jump to bottom of back right, continue on front left then back left, and end on top of back right: 3 1 | 4 5/2.
- This would actually be scanned just like the simple example above: 0004 0001 | 0002 0003. Note that this results in 4 scans for 5 text parts. It's confusing and space-wasting to render the back right page (5/2 or 0003) twice. The researcher should be able to sort this out.
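The rule of thumb can be illustrated with a small sketch: the scan order follows the physical object, starting from the side that holds the first part of the text, regardless of how the text jumps around afterwards. The side names, the fixed physical cycle, and the function name are assumptions inferred from the two examples above, not an established procedure.

```python
# Assumed physical sequence of a single folded sheet, as in the examples above.
PHYSICAL_CYCLE = ["front right", "back left", "back right", "front left"]


def scan_numbers(text_parts_by_side):
    """Map each physical side to its scan number.

    Scanning starts on the side holding text part 1 and then simply follows
    the physical cycle; later jumps in the text are ignored.
    """
    start = next(side for side in PHYSICAL_CYCLE if 1 in text_parts_by_side[side])
    i = PHYSICAL_CYCLE.index(start)
    ordered = PHYSICAL_CYCLE[i:] + PHYSICAL_CYCLE[:i]
    return {side: f"{n:04d}" for n, side in enumerate(ordered, start=1)}


# The out-of-order letter: 3 1 | 4 5/2
layout = {
    "front left": [3],
    "front right": [1],
    "back left": [4],
    "back right": [5, 2],
}
numbers = scan_numbers(layout)
print(numbers["front left"], numbers["front right"], "|",
      numbers["back left"], numbers["back right"])  # 0004 0001 | 0002 0003
```

Each side is scanned exactly once, which is why five text parts still yield four scans.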