Metadata & DS 7/26/16
1. EAD discrepancies found while doing item-level metadata description
- How do we communicate information to archivists?
- Spreadsheet template: S:\Digital Projects\Administrative\Templates\EAD_changeRequests.xlsx
- Possible methods: email, or drop here: S:\Special Collections\Digital_Program_files\EAD\Feedback
- What kinds of things might we run across that are different from the EAD?
- Add to the scope and content
- Correcting folder titles/ missing folders
- Transcripts script
Metadata & DS 7/19/16
1. Proposed Workshop Wednesday on using MARC for published materials
2. Cross training
- How should we bring up that we don't know something?
- Corinne is available for training on the following:
- Audio digitization
- Upload and other scripts
- Jeremiah is available for training on the following:
- Digitization issues
- Technical difficulties
- Alissa is available for training on the following:
- Setting up collection directories and XML files
- April is available for training on the following:
- item-level processing
- EAD things
- Vanessa is available for training on the following:
- Mary is available for training on the following:
3. How we should name our spreadsheets
- As of 2011, there are 3 types of spreadsheets: m01 (Manuscript), m02 (Audio), m03 (Published)
- As of today, we are keeping this naming scheme.
Metadata & DS 7/12/16
1) Geographic Names
- We will disambiguate rivers with the word "river".
2) Subject remediation
- If a subject is really awful, a replacement subject may be added to the Corrected column to replace it. Beware that all subjects that resemble it will also be replaced.
3) & vs. &amp;
- &amp;: Jody is not sure if all scripts are handling ampersands correctly.
- "and" is standard for LCSH. Please use subjects that use "and" instead of "&".
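A minimal sketch of the two ampersand concerns above, in Python. It is not taken from any existing DS script: the function names prefer_and and xml_safe are hypothetical, and it assumes subjects arrive as plain, unescaped strings.

```python
from xml.sax.saxutils import escape


def prefer_and(subject):
    """Replace a free-standing ampersand with the word "and"."""
    # Only an ampersand with a space on each side is replaced, so
    # entities such as "&amp;" and names such as "AT&T" are left alone.
    return subject.replace(" & ", " and ")


def xml_safe(text):
    """Escape &, < and > so the value can be written into XML."""
    return escape(text)


print(prefer_and("Science & technology"))
print(xml_safe("Smith & Sons"))
```

Note that xml_safe should only be applied to text that has not been escaped already; running it on a value that already contains "&amp;" would escape the ampersand a second time.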
4) Genre in subject field
- Avoid redundancy when using the genre, and use common sense.
Metadata & DS 6/29/16
1) Highlights from Round Robin
- Jeremiah is developing equipment request for fall. It would take ~ 24k to bring the CaptureBack up to speed, so once again we are not planning to upgrade it.
- Alissa (with Claire's help) drafted changes to the DS internship and justification of each module; please review under S:\Digital Projects\Administrative\staffing this week.
- Mary & Vanessa are working on mapping/transforms for ETDs for DSpace; Jody has developed software to enable us to group by year/type and upload the content.
2) We reviewed the Shift Reporter and made several decisions:
- Remove Documentation at the category level and use "Other Tasks" then work type "Documentation"
- Meetings and Communication that are focused on Training and Learning or Production Support should be categorized there instead.
- Software development and troubleshooting goes under Production Support unless it's for Archiving & Preservation. Only use Research & Development for true R&D.
- Remove service as a task under Meetings & Communication, Material Exchange, and Training and Learning.
- Now that we are not using Shift Reporter to develop our monthly count, please DO INCLUDE estimate counts for optimization and QC.
3) We reviewed and approved the proposed goals for July.
4) The new Script Toolbox developed by DS staff is located in S:\Digital Projects\Administrative\scripts\toolbox.py -- as demo'd, it provides a single interface to numerous scripts, with links to more information about what each script does. This should reduce confusion and make it simple to find everything you need in one place. If you want other scripts added, notify the DS folks. Great work, folks, and terrific idea, Corinne!!
5) The archivists are being consulted as to who should take over 2 portions of April's work while she's out: item-level processing and EAD updates. Stay tuned... if it's folks on our end, we'll need to train before April is gone.
6) We're off on Monday for July 4th; the 5th is our celebration party. Bring snacks and games!
Metadata & DS 6/14/16
1) Wikifying Decisions
- The wiki will be updated with decisions made during departmental meetings, under Documentation
- On a monthly basis, these decisions will be incorporated into the correct location on the wiki:
- Mary for Metadata and
- Jeremiah for DS
- Decisions made in person and via email must be added to wiki and communicated, so we're all on the same page
2) Restricted content
- Cannot currently be handled in Acumen. We'll need to locate the content in a different directory, and link to it from the metadata record
- We'll use PURLs, as the location will likely change
- Mary will determine where in the MODS the link should go
- We'll have to modify the XSL to make the links clickable
- This will require a special upload process, instead of "relocate"
3) Library fair coming up
- Corinne, Claire and Alissa will think about how we can be more engaging and creative
- Jeremiah suggested we take pictures of people and put them in Acumen (Jody asked for rights agreement!)
- Some discussion about whether we should combine efforts with Digital Humanities, Special Collections or Will Jones.
- Jody will ask if we have a separate table, and whether we'll be combining with Week of Welcome.
4) Mockup for front of Acumen
- Corinne showed us 3 mockups; we preferred the one with 6 collection images.
- Corinne will share this with Will and ask to meet and discuss.
- There was some hope of retaining the existing graph, and creating an image map so clicking on parts of it would take the user to representative collections; Jeremiah will look into that and will ask Will if he could support that. If so, the graph could continue to display, but perhaps under the collection images.
- There was hope of getting rid of the "Beta" which may drive people away. Corinne will ask for this change (and will request inclusion in the Acumen-L list)
5) Spreadsheet check script review
- Corinne shared (again) a script that checks your exported spreadsheet to see if there are problems that could keep makeMods from generating the MODS without error. It's called spreadsheetCheck.pl and is located in Digital Projects\Administrative\scripts\Metadata.
- Right now it outputs errors to the command line. She'll modify it to write to an output file and email everyone when that is ready.
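As a rough illustration of the planned change (writing problems to a file instead of the command line), here is a sketch in Python. It does not reproduce spreadsheetCheck.pl, whose actual checks are not listed in these notes. The "Title" column name, the tab-delimited export format, the two checks shown, and the function names are all assumptions.

```python
import csv


def check_rows(rows, required=("Title",)):
    """Return a list of problem messages for the given spreadsheet rows."""
    problems = []
    seen_titles = {}
    # Data rows start at 2 because row 1 of the export is the header.
    for number, row in enumerate(rows, start=2):
        for column in required:
            if not (row.get(column) or "").strip():
                problems.append(f"row {number}: missing {column}")
        title = (row.get("Title") or "").strip()
        if title:
            if title in seen_titles:
                first = seen_titles[title]
                problems.append(
                    f"row {number}: duplicate title (first seen in row {first})"
                )
            else:
                seen_titles[title] = number
    return problems


def check_export(export_path, report_path):
    """Check a tab-delimited export and write the problems to report_path."""
    with open(export_path, newline="", encoding="utf-8") as handle:
        rows = list(csv.DictReader(handle, delimiter="\t"))
    problems = check_rows(rows)
    with open(report_path, "w", encoding="utf-8") as report:
        for problem in problems:
            report.write(problem + "\n")
    return len(problems)
```

Keeping the checks in check_rows, separate from the file handling, means the same checks could still print to the command line if that remains useful.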
6) TGM is not required for topical subjects (reversal of earlier methodology mentioned in last week's meeting).
- That practice came out of cataloging and is not relevant for us.
- TGM is heavy on ampersands.
- To search TGM, use the TGM site (use "Search this Collection", not "Search all")
7) Since Jody and Mary will be out next Tuesday, the group voted to skip meeting next week. Our next meeting will be on the 28th. Reminder: on July 5th, bring snacks and games to share. We'll be celebrating our 1-year anniversary as a department, all our accomplishments of this past year, and Alissa and Claire completing their 6-month reviews.
Metadata & DS 6/8/16
1) Overlapping collections in Acumen.
- Use Uniform titles so that metadata will enable online search-and-retrieval of related content in different collections.
- Emphasis (u0008_0000001) recordings were found in UA Reel-to-Reel Collection (u0008_2012038). Audio comes to us unprocessed, so the archivists may be unaware of this. Corinne will check with Donnelly and Marina to find out if these recordings are in the wrong collection. If so, Claire can add to the Emphasis spreadsheet she’s remediating.
2) Spreadsheet work feedback
- Title metadata coming straight from finding aids should not be changed (Item-level entries in image collection finding aids are extracted to spreadsheets to reduce duplication of effort).
- It’s okay to use the caption of an image as the title; if you do, write “Title from caption” in the description field.
- Ensure that titles are unique.
- Unpublished items don’t need the exact punctuation used on the analog title (exception: Roland Harper)
- Avoid redundancy in subjects, such as the location in multiple places
- Use organization names that coincide with the date of the item (names change over time)
- Avoid use of subjects for topics that are minimally covered by the item
- Use TGM (Thesaurus for Graphic Materials) for topics alone, not LCSH.
- When using names in subjects, however, use LCSH.
3) Subject Location
- Subject location tagging information is on the wiki under HierarchicalGeographic
- If Jody’s script that tags entries for this is of interest, someone can modify it to give it a GUI for use while working on spreadsheets
- To support subject locations that are NOT in TGN (Thesaurus of Geographic Names), we will add a column for local versions of subject locations. Mary will modify the spreadsheet templates, and Jody will modify makeMods to support this once Mary sends her the new spreadsheet headers for these columns.
4) Creator/Name Columns
In a previous meeting, we’d proposed removing all name columns except Creator(s) and then tagging names with #4 and the correct MARC Relator term. (The script refers to the list located in Administrative/scripts/Metadata/makeMods/ on the share drive. If we use other terms, they should be added.) This would reduce the number of columns in the spreadsheet. There does not seem to be a consensus yet on whether this is the way to go. Jody said the makeMods script supports either approach, and asked the group for their preferences. We’ll think about this and discuss again next week.
5) Subject Repair
Vanessa proposed last week that multiple volunteers try to repair 5 subjects a day, so we can make progress on this project. She reviewed the approach, to which we have added 2 columns.
- SearchOn (the first column) contains the subject in its current form
- Corrected (2nd column) contains a corrected form of the subject (if needed)
- TaggedValue (3rd column) contains the corrected form with appropriate subject tags.
- Authority (4th column) indicates the authority, usually LCSH or TGM.
- SecondaryAuthority (5th column) is in case the term is in multiple authorities, such as the term “dog”
- Collection (new 6th column) indicates what collection this is found in
- Items (new 7th column) indicates the comma-separated list of items, if this term is only used in a couple of files
These last 2 columns will make it possible for our scripts to update Acumen without having to go through every single MODS in Acumen to find the items for correction. We reviewed the lists (NOTE: PLEASE USE THE ONES in S:\Public\DigitalServices\ContentAnalysis\Subjects\WorkOnThese\!).
- Mary and Celeste are doing the Names_Subjects as they are remediating names.
- Vanessa is doing the Geography ones, and will pull the ProQuest ones out of the Subjects list.
- Alissa will take US & War
- Claire will work on UA
- Corinne will do A-C of the Subjects lists.
Please touch base with Vanessa weekly, and we’ll revisit this in a month or so to see how we’re doing.
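To show why the Collection and Items columns help, here is a sketch in Python that groups corrected subjects by collection, so an update script would only need to visit the listed collections and items. It is not one of our existing scripts. It assumes the list is exported as tab-delimited text with the seven columns in the order given above and no header row; the function name and the output layout are hypothetical.

```python
import csv
from collections import defaultdict

# Column order as described in the notes above.
COLUMNS = ["SearchOn", "Corrected", "TaggedValue", "Authority",
           "SecondaryAuthority", "Collection", "Items"]


def corrections_by_collection(lines):
    """Group corrected subjects by collection from tab-delimited lines."""
    grouped = defaultdict(list)
    reader = csv.DictReader(lines, fieldnames=COLUMNS, delimiter="\t")
    for row in reader:
        corrected = (row["Corrected"] or "").strip()
        if not corrected:
            continue  # no correction recorded for this subject
        # Items is a comma-separated list; it may be empty.
        items = [i.strip() for i in (row["Items"] or "").split(",") if i.strip()]
        collection = (row["Collection"] or "").strip()
        grouped[collection].append({
            "search_on": row["SearchOn"],
            "tagged_value": row["TaggedValue"],
            "items": items,
        })
    return dict(grouped)
```

An empty items list here would mean the subject has to be searched for across the whole collection rather than in a few named files.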
6) Volunteer: Cyndi Woolsey, our intern from spring, is coming back to volunteer 1 day a week until she lands a job. She’ll meet with Corinne on Thursday to do paperwork and arrange her schedule.
7) As of July 1, Metadata & Digital Services will be 1 year old. We’ve come a long way: completely revamped the metadata workflow, built databases and new software, revised everyone’s work duties, and built a new team. Let’s celebrate our accomplishments! Our first meeting in July (on the 5th) will be a celebration. Please bring snacks and favorite games to share. We’ll invite Hoole folks too.
Metadata & DS 5/31/16
1) Round Robin highlights:
- Jeremiah has located a Python software module that may enable us to modify current scripts to avoid (most? All?) exports. It will require a different version of Python to be installed on our desktops. Stay tuned!
- Claire completed her 6 month review!!
- Crimson White:
- PDFs are being generated from our old content for Student Media
- They’re giving us copies of their PDFs 2004-present
- Jody’s developing Requests for Information to outsource microfilm to see if that’s viable
- Student Media proposes to disbind the volumes we still need to digitize and provide us with students for capture
- Jody will keep us updated
- Spreadsheet/database workflow project:
- This will do away with the Selection spreadsheet and Tracking Filenames, interacting instead with our database.
- Jody, Alissa and Claire are working on the back end
- Jeremiah will start building an interface for the archivists
- Tagging names in creator field:
- Corinne will test the tagging of names in the creator column. If it works out, we can simplify spreadsheets going forward.
- Mary will check the relator terms list for new audio spreadsheet terms.
- Be aware that if you use a relator term that we’ve not used before, the relator terms file in Administrative/scripts/Metadata/makeMods/ will need to be updated appropriately. Check with Mary or Vanessa or Jody.
- Subject location:
- Geographic locations (not part of a larger subject) that are currently placed in LCSH should be extracted & reformatted into TGN for subject location, and then tagged for HierarchicalGeographic.
- Celeste is reviewing a test of a script to automate most of the tagging; if it looks helpful, perhaps someone will modify it to make it easily usable for other spreadsheets.
- We may have to figure out how to support local subject locations as well as TGN. Suggestions on how to implement?
- Check with Mary or Vanessa if you have questions about what to put in this field, or what format it should be.
- Corinne is working with Kate and Will to develop a way for us to highlight different collections each month on the front page of Acumen
- Input Guidelines recommend the nickname be placed in parentheses following the first name. There was discussion about removing them from the MODS and only keeping them as synonyms in Acumen, but then when we change systems, it will not be easy to reconnect the nicknames with the original names. We decided to keep them in the MODS.
- Synonyms to be added to Acumen should be gathered and submitted no more often than every 3 months, due to the cost of time for implementation by the web programmer.
3) One-on-One metadata training:
- Mary would like to provide one-on-one metadata reviews and training for anyone creating or modifying metadata. After discussion, we decided every-other-week meetings might be best at first. We agreed to try this and report back on how helpful this is, and if there are suggestions that will improve this approach. Mary will set up the meetings.
4) 5 subjects a day:
- Remediation of the old subjects has fallen by the wayside; only Vanessa has been actively involved (apart from the names as subjects? That went to Mary). Vanessa suggested we try repairing 5 a day and Corinne and Claire volunteered.
5) Perkins Problem:
- Finding aid entries for Perkins Photos are problematic from 249-308, and potentially we are missing items 309-409. An illustrated letter is part of the problem. Jody asked for a volunteer to help sort out the issues, and finding none, is drafting Alissa to assist (since she was absent).
6) Wiki findability:
- Corinne pointed out that the search box is not terribly helpful. Jody suggested we build index pages for the links we use most, and that we coordinate. Either we have our own resource pages under Employee Resources with the links we use personally, or we can group them by type, such as metadata or digitization. Please coordinate…
7) Project Management Software:
- Nothing is cast in stone. We’re still exploring the extent to which project management software is useful. Jody hopes it will keep us from forgetting projects, and help track who’s doing what. It’s up to us at this point how much we need to use it, though at some point there will be some requirements for project reporting. Please review Asana and see what you think, if you have time.
8) Management style and team approach:
- Everyone brings different strengths and weaknesses to the table, different perspectives and ideas. We can accomplish more together than anyone can alone. Everyone is deserving of consideration and respect. Please share your expertise and your ideas for how we can improve, and seek to learn whatever will help us move forward as a team.
Metadata & DS 5/24/16
1) Names with nicknames: should be added to the MADS database like so, if not in VIAF: lastName, firstName (nickname)
2) XSL work: We need XSL modified to display hierarchicalGeographic tags and incorporate them into OAI. Corinne volunteered to help Vanessa, and Jody will support as needed.
3) What’s a batch? At this point, a batch reflects 600-800 scans, a size limited by the time spent in quality control during digitization. It is no longer constrained by metadata. Those remediating metadata do NOT need to separate spreadsheets into batches that match previous batches; but they DO need to name their spreadsheet exports with batch numbers that FOLLOW all existing batch numbers, to avoid overwriting what’s in the archive.
4) Columns vs. Tagging: After discussion, we agreed that if separating something out (subject parts, name roles, etc.) requires more than 2-4 columns in the spreadsheet, we’d rather tag the entries and simplify the spreadsheet.
5) Tagging names: (since we have multiple columns for names now, by role)
- Vanessa will update the relator terms list in Administrative/scripts/makeMods to include all name columns besides creator
- Tagging instructions will be updated in the wiki
- Jody will modify makeMods to support tagging of names, so they need only be entered in the Creator(s) column.
- Everyone will test
Potentially we can then simplify our spreadsheets moving forward (fewer columns)
6) Subject Locations, geographic subjects, and Roland Harper Photos
- Those creating/modifying metadata will start entering geographic subjects in the Subject Location(s) column in TGN format instead of in the LCSH Subjects column – and will tag them for HierarchicalGeographic, so they can be used in faceting.
- Jody will send Vanessa her current understanding of the difference between area and region, and she will research and share final decisions with the group, so we can be consistent in tagging
- Jody will modify a script used to test automation of geographic subjects into hierarchicalGeographic to generate modified subject Location entries for the Roland Harper Photos
- Claire, Celeste, and possibly others will review and correct these (to avoid having to do them all by hand)
- Claire has volunteered to help Celeste with the Roland Harper Photos (other volunteers welcome!!)
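For reference, here is a sketch in Python of the kind of MODS output the HierarchicalGeographic tagging is aiming for. It is not what makeMods does: the spreadsheet tag syntax lives on the wiki and is not reproduced here, so the input is assumed to be an already-parsed dictionary. The function name is hypothetical, the place values are only illustrative, and only a subset of the sub-elements MODS defines is shown.

```python
import xml.etree.ElementTree as ET

MODS_NS = "http://www.loc.gov/mods/v3"
ET.register_namespace("mods", MODS_NS)

# Broad-to-narrow order; a subset of the sub-elements MODS allows.
LEVELS = ["continent", "country", "state", "county", "city"]


def hierarchical_geographic(place):
    """Build a MODS subject element holding a hierarchicalGeographic."""
    subject = ET.Element(f"{{{MODS_NS}}}subject", {"authority": "tgn"})
    hier = ET.SubElement(subject, f"{{{MODS_NS}}}hierarchicalGeographic")
    for level in LEVELS:
        # Levels missing from the input are simply skipped.
        if place.get(level):
            ET.SubElement(hier, f"{{{MODS_NS}}}{level}").text = place[level]
    return subject


element = hierarchical_geographic(
    {"country": "United States", "state": "Alabama", "city": "Tuscaloosa"}
)
print(ET.tostring(element, encoding="unicode"))
```

The area and region levels under discussion are left out of LEVELS on purpose, pending the final decision on how we distinguish them.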
7) Logs: If the metadata is already completed and you are digitizing a large collection or batch, please add the log information to the metadata spreadsheet, then export when done. For small ones, just use the log template located in the Templates directory (S:\Digital Projects\Administrative\Templates)
8) Metadata creators setup:
- Alissa and Claire will add a collection ID column to the Queue page of the Selection spreadsheet.
- Metadata creators moving entry from “in Process” to the Queue page will add a collection ID.
- Vanessa will start a wiki page on the process of collection setup, which Corinne will review (I hope) so we can have clear, agreed-upon instructions.
- Claire and Alissa will work on getting at least part of the database/setup script ready for use
- Claire will add instructions for use to the wiki and share with people
- Corinne will teach people about collection XML and collection setup. If you’re creating metadata and don’t know this: connect with Corinne.
9) Based on our discussion today, here are my proposed goals for us for June:
- Complete Wade Hall Red Carpet Request
- Continue donor letter digitization Red Carpet Request
- Complete Wiggins Red Carpet Request
- Finalize workflow/database script in Perl
- Complete 6-month reviews for new staff