Difference between revisions of "Descriptive metadata"

From UA Libraries Digital Services Planning and Documentation

Revision as of 13:00, 10 February 2016

MODS, the Metadata Object Description Schema, provides fields or elements that describe a resource for the purpose of resource discovery and identification. For example, if I asked you to tell me about a Siberian tiger, you might say that "it has orange and black stripes, has 4 legs and a tail, and lots of teeth." These details would be placed into a physical format element. If this were going into a site about all types of animals, I might want to say that a tiger's type is carnivore. Subject terms provide analysis of the item, the tiger. I would want to use specific terms, so I would say "Siberian tiger."

To facilitate discovery, controlled vocabulary lists are used. Controlled vocabulary lists support consistency for types and subjects by eliminating synonyms and alternative spellings. Authority lists for personal and corporate names support consistency by establishing the proper forms of these names to be used in the MODS records. These names are established by using a national standard, RDA. Their established forms, and any alternative forms that exist, are held in a local MADS (Metadata Authority Description Schema) database. Title and description columns are "free text."

Read more about our item-level Current Descriptive Metadata and our Descriptive Metadata Workflows.

Collection-level descriptive metadata is captured in Collection_Information and EADs.


Creating collection xml file

For Item-Level Content:

  1. Ensure the spreadsheet input meets our input guidelines
  2. Use the getNames script to make sure each name (not in title or description) has a database number. (Note: no subject codes are needed unless the name is in a subject.)
  3. Use the getSubjects script to make sure all subjects are tagged
  4. Run the Excel Converter on the spreadsheet.
  5. Use the makeMods script to generate MODS (see Generating MODS). Then, BEFORE YOU MOVE THEM:
  6. Use XMLSpy and the manifest file (generated by the last script; you’ll find it next to your spreadsheet) to test and validate the MODS (see Validating MODS).

MetadataGeneration Dec2015.png

General Guidelines

  • Expand abbreviations of personal names (for example: Wm. would be entered as William)
  • Do not use periods at the end of fields unless the field ends in an accepted abbreviation or in a complete sentence in the Description column. The ‘Staff Notes’ field is also exempt, because that information is administrative metadata only and is not used in the final metadata.
  • Be consistent with your terminology in descriptions and titles; for instance, if you start using “railroad track” don’t switch over to “train track”
  • Don’t be afraid to take a second and look up information online, such as hunting down the county appropriate to a city or confirming the spelling of a proper noun (like the name of a well-known person or a place name)
  • Metadata for all collections is entered using one of three Excel spreadsheets or templates. m01 is used for manuscripts, photographs, and receipts, so it is used for most archival collections. m02 is used for notated music or sound recordings. m03 is used for materials that are continuous or part of a continuous publication. Each of these spreadsheets accommodates a large number of types of materials, so you may not need all of the fields for one collection. You may freeze the top row to make a very large spreadsheet easier to work with, or you may hide columns or rows to facilitate data entry. By hiding fields, the template can be tailored to one type of object (manuscripts, photographs, audio recordings, etc.). This is especially helpful when a collection has only one type of object. To determine this, you may want to consult the guide to the collection (finding aid or EAD record) or the collection-level record in Acumen. For collections that contain a variety of types, it is best to use the template without any hidden fields. It is best practice to unhide all columns and rows for proofing before creating MODS.
  • Each element has an obligation level: required, required if applicable, strongly recommended, recommended, or optional. Required fields must always be entered. "Required if applicable" is conditional: if the information is present, then it must be entered in the spreadsheet. For example, "Subtitle" is "required if applicable." The Digital Collection location is "required," while other types of locations like "Box Numbers" and "Folder Number" are "required if applicable." It is best practice to enter all fields that have the following levels of obligation: Required, Required if applicable, Recommended. Subjects are very important to resource discovery, yet their obligation is "optional."
    • Required fields are:
      • Title
      • Date (Use one date column to enter a single date or a span of dates. Multiple date columns may not be used.)
      • Format
      • Type
      • Genre
      • Language
      • Digital Collection
      • Filename (identifier)
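
The required-field list above lends itself to an automated check before running the conversion scripts. The following is a minimal sketch only, not part of the actual workflow scripts: the column names in REQUIRED_COLUMNS and the function name missing_required are assumptions for illustration and would need to match the headers in your template.

```python
# Hypothetical column names; adjust them to the headers in your template.
REQUIRED_COLUMNS = [
    "Title", "Date", "Format", "Type", "Genre",
    "Language", "Digital Collection", "Filename",
]


def missing_required(row):
    """Return the required columns that are absent or blank in one row.

    `row` is a dict mapping column names to cell values, for example
    one row as produced by csv.DictReader.
    """
    missing = []
    for column in REQUIRED_COLUMNS:
        value = row.get(column)
        # Treat a missing key, None, or whitespace-only text as blank.
        if value is None or not str(value).strip():
            missing.append(column)
    return missing
```

Calling missing_required on each row and reporting any non-empty result gives a quick list of rows to fix before generating MODS.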

  • Titles (Links to older materials that should be reviewed)
    • Remove punctuation at the end of name, title, subtitle, and partNumber fields (unless the field ends in a question mark or an abbreviation)
    • For published items, use the published title as found on the title page. Punctuation and capitalization do not have to be entered as found.
    • Supply a title for unpublished items.
    • Every entry has to have a title (not “Untitled”)
  • Names: see MADS and For_Subjects_and_Names
    • The same name may be entered into as many different "role" columns as apply to that object. Sheet music may have the same composer and lyricist, so that name would be entered under each column.
    • Don't enter the name under Creator when a better term is available on the spreadsheet.
    • There are times when you may want to enter names in the "creator" column for built objects. For example, if John whittled a wooden elephant, you would enter "John" as the creator.
    • If creator is unknown, leave this area blank.
    • There are many MarcRelator roles that are not listed on the spreadsheets. These may be used by adding the role after each creator name, prefaced by “#4” so the script will be able to pull it out. For example: Glover, F. L. #4 Compiler
  • Subjects: see For_Subjects_and_Names#getSubjects.py
    • In the "Subject(s) LCSH" column you may enter appropriate Library of Congress Subject Headings; in the "Subject(s) TGM" column you may enter appropriate Thesaurus for Graphic Materials subject headings (primarily for graphics and photos); and in the "Subject Location" column you may add Getty Thesaurus of Geographic Names (TGN) terms if the content is primarily ABOUT a particular location.
    • Use the 80/20 rule: create subjects for what's covered by 80% of the content of large items. You may try to analyze letters by paragraphs. Subjects should be "about" the content of text materials. Assigning subjects to graphic materials is telling users "what it is." For text or graphics, if you have more than 3 headings that fit under a broader heading, use the broader heading. So for "cattle," "dog," and "cat," you would use "mammals" (TGM). Apply this rule of three to geographical headings as well. For instrumental music, assign a musical form and instrumentation from existing LCSH headings, or create headings following LC subject heading rules.
    • Copy sender and receiver names to the LCSH column, followed by "--Correspondence", for a minimal level of subject headings. Names that are nicknames (Mom, Sister) and initials (T.) should be included in the subject field. (January 12, 2016)
    • If a name is not significant, do not include it in subject headings.
    • If the material is not significant, do not spend more than 5 minutes assigning subjects.
    • If you are unfamiliar with the subject area or the definition of the subject heading, it is better not to assign a subject heading than to lead a user astray.
    • See the flow chart below to help you determine which subject authority to use.

Subject authority flow chart.PNG
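
The “#4” role convention described under Names (for example, Glover, F. L. #4 Compiler) could be parsed along the following lines. This is a minimal sketch and not the actual script: the function name split_name_and_role is hypothetical, and it assumes at most one “#4” marker per name entry.

```python
def split_name_and_role(entry):
    """Split 'Name #4 Role' into (name, role).

    Returns (name, None) when the entry carries no "#4" marker or
    when nothing follows the marker, so the caller can fall back to
    the role implied by the spreadsheet column.
    """
    name, marker, role = entry.partition("#4")
    name = name.strip()
    role = role.strip()
    if not marker or not role:
        return name, None
    return name, role
```

Returning None for the role, rather than guessing one, leaves the decision about the default role to whatever code knows which column the name came from.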

  • Genre
    • Genre terms designate a category characterizing a particular style, form, or content, such as an artistic, musical, or literary composition.
    • Select a term from the Art and Architecture Thesaurus (AAT).
    • Separate multiple entries with a semicolon.
    • For guidance on how to distinguish genre terms from subject headings, see Subject vs. Genre
  • Descriptions
  • Dates: dates associated with the publication or creation of the material. Do not use square brackets such as [1856].
    • Make sure that the date column is formatted to "general" and not a "date" format.
    • If no date can be found, determine the closest date and use circa or its abbreviation ca. or use a span of dates.
    • All dates should follow W3CDTF format
      • yyyy
      • yyyy-mm
      • yyyy-mm-dd
      • A span may represent a decade, a century, a span of months, or an inferred date such as “not before 1852” by using a beginning and end date (for example, a decade is represented by entering 1900-1909). A hyphen or a semicolon may be used with a date span.
    • Qualifiers may be used with single dates.
      • An entry for an approximate date may use "ca." (the abbreviation for circa)
      • A questionable date is followed by a question mark.
  • Types: This is a controlled vocabulary established by the MODS standard and translates to the typeOfResource field in the MODS record. The only values that can be in that field are in this case-sensitive list:
    • text
    • cartographic
    • notated music
    • sound recording-musical
    • sound recording-nonmusical
    • sound recording
    • still image
    • moving image
    • three dimensional object
    • software, multimedia
    • mixed material
  • Born Digital or Reformatted? If the content is incoming digital (we did not digitize it from analog), please put “digital” in the Staff notes field, as this gets a different digitalOrigin value (“born digital” instead of “reformatted digital”)
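
The date and type rules above can be checked mechanically before generating MODS. The following is a minimal sketch under stated assumptions: the function names is_w3cdtf and is_valid_type are hypothetical, and the date check covers only a single plain date (yyyy, yyyy-mm, or yyyy-mm-dd). Spans and qualified dates ("ca.", a trailing question mark) are allowed by the guidelines but are deliberately outside what this sketch handles.

```python
import re
from datetime import date

# The typeOfResource values listed above; matching is case-sensitive.
MODS_TYPES = {
    "text", "cartographic", "notated music",
    "sound recording-musical", "sound recording-nonmusical",
    "sound recording", "still image", "moving image",
    "three dimensional object", "software, multimedia", "mixed material",
}

# yyyy, yyyy-mm, or yyyy-mm-dd, with nothing before or after.
_DATE = re.compile(r"(\d{4})(?:-(\d{2})(?:-(\d{2}))?)?")


def is_w3cdtf(value):
    """Return True if value is a single yyyy, yyyy-mm, or yyyy-mm-dd date."""
    match = _DATE.fullmatch(value)
    if not match:
        return False
    year, month, day = match.groups()
    try:
        # Missing parts default to 1 so only the supplied parts are checked.
        date(int(year), int(month or 1), int(day or 1))
    except ValueError:
        return False
    return True


def is_valid_type(value):
    """Return True if value is exactly one of the allowed type strings."""
    return value in MODS_TYPES
```

Because the type list is case-sensitive, is_valid_type("Text") is rejected while is_valid_type("text") is accepted; note that "software, multimedia" is a single value, not two.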
  1. MODS Data Dictionary
  2. Input Guidelines -- check for more recent versions under the Metadata Creation heading. Here you will also find specific information for various types of materials, such as legal materials, receipts, railroad pamphlets, etc...

Older information: