Schema.org provides a vocabulary for stating, within a web page, what each piece of information means, so that web crawlers can index the content more effectively for search and retrieval. Without this kind of markup, crawlers have to "guess" which information is important and which is not, and cannot index effectively for specific queries.
Our metadata librarians have modified the mods.xsl stylesheet to add Schema.org CreativeWork properties to all of our item-level descriptions (MODS metadata).
We hope to add Schema.org encodings to the ead.xsl files soon as well, for our collection-level records (EAD metadata).
The Schema.org properties encoded in the web display are:
- name (for title)
- author (for creator)
- contentLocation (for sender location and also for recipient location)
- mentions (for recipient name)
- dateCreated (for creation date)
- about (for subjects)
- additionalType (for type and genre)
- inLanguage (for language of content)
This enables Google and other search engines to index each field according to what that field means to them. Thus, the title of the document will be indexed with other titles (known as "name" to Google), and the subjects of the document will be indexed with other subjects (known as "about" to Google).
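As a rough illustration of the property list above, the following Python sketch renders a record as HTML microdata inside a CreativeWork scope. It is not our mods.xsl stylesheet: the field names in FIELD_TO_PROPERTY, the function render_creative_work, and the use of plain span elements are all assumptions made for the example, and values are emitted as plain text for simplicity.

```python
from html import escape

# Hypothetical display-field names (assumed for this sketch) mapped to the
# Schema.org properties listed above.
FIELD_TO_PROPERTY = {
    "title": "name",
    "creator": "author",
    "sender_location": "contentLocation",
    "recipient_location": "contentLocation",
    "recipient": "mentions",
    "date": "dateCreated",
    "subject": "about",
    "type": "additionalType",
    "genre": "additionalType",
    "language": "inLanguage",
}


def render_creative_work(record):
    """Render a record (dict of field name -> list of strings) as microdata.

    Fields with no entry in FIELD_TO_PROPERTY are still displayed, but
    without an itemprop attribute, so crawlers get no hint about them.
    """
    lines = ['<div itemscope itemtype="https://schema.org/CreativeWork">']
    for field, values in record.items():
        prop = FIELD_TO_PROPERTY.get(field)
        for value in values:
            text = escape(value)  # escape &, <, > and quotes for HTML output
            if prop is None:
                lines.append(f"  <span>{text}</span>")
            else:
                lines.append(f'  <span itemprop="{prop}">{text}</span>')
    lines.append("</div>")
    return "\n".join(lines)


if __name__ == "__main__":
    # Invented sample record, for illustration only.
    sample = {
        "title": ["Letter to a friend"],
        "subject": ["Family life", "Travel & weather"],
        "language": ["English"],
    }
    print(render_creative_work(sample))
```

Each itemprop attribute tells a crawler which Schema.org property the enclosed text represents, which is what lets a title be indexed as "name" and a subject as "about".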
To see what has been encoded, visit the Structured Data Testing Tool site and enter a URL for an item, but include "?_escaped_fragment_=", such as: