Meeting Notes

From UA Libraries Digital Services Planning and Documentation
(Metadata & DS 6/14/16)
*TGM is not required for topical subjects
**Use this site for TGM (use "Search this Collection", not "search all"): http://www.loc.gov/pictures/collection/tgm/
Revision as of 13:41, 14 June 2016


Metadata & DS 6/14/16

  • The wiki will be updated on a monthly basis based on decisions made at the meeting and documented here
    • Mary for Metadata and Jeremiah for DS

Metadata & DS 6/8/16

1) Overlapping collections in Acumen.

  • Use uniform titles so that the metadata enables online search and retrieval of related content across different collections.
  • Emphasis (u0008_0000001) recordings were found in the UA Reel-to-Reel Collection (u0008_2012038). Audio comes to us unprocessed, so the archivists may be unaware of this. Corinne will check with Donnelly and Marina to find out whether these recordings are in the wrong collection. If so, Claire can add them to the Emphasis spreadsheet she’s remediating.

2) Spreadsheet work feedback

  • Title metadata coming straight from finding aids should not be changed (Item-level entries in image collection finding aids are extracted to spreadsheets to reduce duplication of effort).
  • It’s okay to use the caption of an image as the title; if you do, write “Title from caption” in the description field.
  • Ensure that titles are unique.
  • Unpublished items don’t need the exact punctuation used on the analog title (exception: Roland Harper).
  • Avoid redundancy in subjects, such as repeating the location in multiple places.
  • Use organization names that coincide with the date of the item (names change over time).
  • Avoid assigning subjects for topics that the item covers only minimally.
  • Use TGM (Thesaurus for Graphic Materials, http://www.loc.gov/pictures/collection/tgm/) for topics alone, not LCSH (http://id.loc.gov/authorities/subjects.html).
  • When using names in subjects, however, use LCSH.

3) Subject Location

  • Subject location tagging information is on the wiki under HierarchicalGeographic
  • If Jody’s script that tags entries for this is of interest, someone can modify it to give it a GUI for use while working on spreadsheets
  • To support subject locations that are NOT in TGN (Thesaurus of Geographic Names, http://www.getty.edu/research/tools/vocabularies/tgn/), we will add a column for local versions of subject locations. Mary will modify the spreadsheet templates, and Jody will modify makeMods to support this once Mary sends her the new spreadsheet headers for these columns.
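
For readers unfamiliar with the target of this tagging, here is a minimal sketch of the MODS structure that a tagged subject location presumably ends up as. It is not the makeMods implementation and does not reproduce the team's tagging syntax, which lives on the HierarchicalGeographic wiki page. The function name, the `LEVELS` ordering, the sample place, and the use of an `authority` attribute to separate TGN entries from local ones are all assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

MODS_NS = "http://www.loc.gov/mods/v3"

# Assumed ordering for readability: broadest place level first.
LEVELS = ("continent", "country", "region", "state",
          "county", "city", "citySection", "area")


def hierarchical_geographic(parts, authority=None):
    """Build a MODS <subject> holding one <hierarchicalGeographic>.

    `parts` maps a place level (e.g. "state") to its name. `authority` is a
    hypothetical way to mark TGN versus local subject locations.
    """
    unknown = set(parts) - set(LEVELS)
    if unknown:
        raise ValueError(f"unsupported place levels: {sorted(unknown)}")
    subject = ET.Element(f"{{{MODS_NS}}}subject")
    if authority:
        subject.set("authority", authority)
    geo = ET.SubElement(subject, f"{{{MODS_NS}}}hierarchicalGeographic")
    for level in LEVELS:
        if parts.get(level):
            ET.SubElement(geo, f"{{{MODS_NS}}}{level}").text = parts[level]
    return subject


ET.register_namespace("mods", MODS_NS)
# Illustrative place only; not taken from any spreadsheet.
example = hierarchical_geographic(
    {"country": "United States", "state": "Alabama",
     "county": "Tuscaloosa", "city": "Tuscaloosa"},
    authority="tgn",
)
print(ET.tostring(example, encoding="unicode"))
```

A local subject location could go through the same function with a different `authority` value, which is one possible way to carry the new local column through to the MODS.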

4) Creator/Name Columns

In a previous meeting, we’d proposed removing all name columns except Creator(s) and then tagging names with #4 and the correct MARC Relator term (https://www.loc.gov/marc/relators/relaterm.html). (The script refers to the list located in Administrative/scripts/Metadata/makeMods/ on the share drive. If we use other terms, they should be added to that list.) This would reduce the number of columns in the spreadsheet. There is no consensus yet on whether this is the way to go. Jody said the makeMods script supports either approach, and asked the group for their preferences. We’ll think about this and discuss it again next week.

5) Subject Repair

Vanessa proposed last week that multiple volunteers try to repair 5 subjects a day, so we can make progress on this project. She reviewed the approach, to which we have added 2 columns.

  • SearchOn (the first column) contains the subject in its current form
  • Corrected (2nd column) contains a corrected form of the subject (if needed)
  • TaggedValue (3rd column) contains the corrected form with appropriate subject tags.
  • Authority (4th column) indicates the authority, usually LCSH or TGM.
  • SecondaryAuthority (5th column) is in case the term is in multiple authorities, such as the term “dog”
  • Collection (new 6th column) indicates what collection this is found in
  • Items (new 7th column) indicates the comma-separated list of items, if this term is only used in a couple of files

These last 2 columns will make it possible for our scripts to update Acumen without having to go through every single MODS record in Acumen to find the items for correction. We reviewed the lists and who will work on each (NOTE: PLEASE USE THE ONES in S:\Public\DigitalServices\ContentAnalysis\Subjects\WorkOnThese\!).

  • Mary and Celeste are doing the Names_Subjects as they are remediating names.
  • Vanessa is doing the Geography ones, and will pull the ProQuest ones out of the Subjects list.
  • Alissa will take US & War
  • Claire will work on UA
  • Corinne will do A-C of the Subjects lists.

Please touch base with Vanessa weekly, and we’ll revisit this in a month or so to see how we’re doing.
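
To show why the Collection and Items columns help, here is a rough sketch of how a script might read a subject-repair list and group the finished corrections by collection, so that only the affected MODS files need to be opened. This is not one of our existing scripts. The tab-delimited layout, the sample values, the identifiers, and the rule that an empty Items cell means "every item in the collection that uses the subject" are assumptions made for illustration.

```python
import csv
import io
from collections import defaultdict

COLUMNS = ["SearchOn", "Corrected", "TaggedValue", "Authority",
           "SecondaryAuthority", "Collection", "Items"]

# Hypothetical sample: one repaired subject and one not yet repaired.
SAMPLE = (
    "\t".join(COLUMNS) + "\n"
    "dog\tDogs\tDogs\ttgm\tlcsh\tu0000_0000000\t"
    "u0000_0000000_0000001, u0000_0000000_0000002\n"
    "Foot ball\t\t\t\t\tu0000_0000000\t\n"
)


def plan_updates(rows):
    """Map each collection to the corrections that apply to it.

    Each correction is (current subject, tagged replacement, item list).
    An empty item list is assumed to mean the whole collection.
    """
    plan = defaultdict(list)
    for row in rows:
        if not row["TaggedValue"].strip():
            continue  # nothing to apply until the subject has been repaired
        items = [i.strip() for i in row["Items"].split(",") if i.strip()]
        plan[row["Collection"].strip()].append(
            (row["SearchOn"], row["TaggedValue"], items)
        )
    return dict(plan)


reader = csv.DictReader(io.StringIO(SAMPLE), delimiter="\t", restval="")
plan = plan_updates(reader)
for collection, corrections in plan.items():
    print(collection, corrections)
```

With a plan like this, an update script would visit only the listed items (or the one collection) instead of scanning all of Acumen.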

6) Volunteer: Cyndi Woolsey, our intern from spring, is coming back to volunteer 1 day a week until she lands a job. She’ll meet with Corinne on Thursday to do paperwork and arrange her schedule.

7) As of July 1, Metadata & Digital Services will be 1 year old. We’ve come a long way: completely revamped the metadata workflow, built databases and new software, revised everyone’s work duties, and built a new team. Let’s celebrate our accomplishments! Our first meeting in July (on the 5th) will be a celebration. Please bring snacks and favorite games to share. We’ll invite Hoole folks too.


Metadata & DS 5/31/16

  • Round Robin highlights:
    • Jeremiah has located a Python software module that may enable us to modify current scripts to avoid (most? All?) exports. It will require a different version of Python to be installed on our desktops. Stay tuned!
    • Claire completed her 6-month review!!
  • Updates:
    • Crimson White:
      • PDFs are being generated from our old content for Student Media
      • They’re giving us copies of their PDFs 2004-present
      • Jody’s developing Requests for Information to outsource microfilm to see if that’s viable
      • Student Media proposes to disbind the volumes we still need to digitize and provide us with students for capture
      • Jody will keep us updated
    • Spreadsheet/database workflow project:
      • This will do away with the Selection spreadsheet and Tracking Filenames, interacting instead with our database.
      • Jody, Alissa and Claire are working on the back end
      • Jeremiah will start building an interface for the archivists
    • Tagging names in creator field:
      • Corinne will test the tagging of names in the creator column. If it works out, we can simplify spreadsheets going forward.
      • Mary will check the relator terms list for new audio spreadsheet terms.
      • Be aware that if you use a relator term that we’ve not used before, the relator terms file in Administrative/scripts/Metadata/makeMods/ will need to be updated appropriately. Check with Mary or Vanessa or Jody.
    • Subject location:
      • Geographic locations (not part of a larger subject) that are currently placed in LCSH should be extracted & reformatted into TGN for subject location, and then tagged for HierarchicalGeographic.
      • Celeste is reviewing a test of a script to automate most of the tagging; if it looks helpful, perhaps someone will modify it to make it easily usable for other spreadsheets.
      • We may have to figure out how to support local subject locations as well as TGN. Suggestions on how to implement?
      • Check with Mary or Vanessa if you have questions about what to put in this field, or what format it should be.
    • Corinne is working with Kate and Will to develop a way for us to highlight different collections each month on the front page of Acumen
  • Nicknames:
    • Input Guidelines recommend the nickname be placed in parentheses following the first name. There was discussion about removing them from the MODS and only keeping them as synonyms in Acumen, but then when we change systems, it will not be easy to reconnect the nicknames with the original names. We decided to keep them in the MODS.
    • Synonyms to be added to Acumen should be gathered and submitted no more often than every 3 months, due to the cost of time for implementation by the web programmer.
  • One-on-One metadata training:
    • Mary would like to provide one-on-one metadata reviews and training for anyone creating or modifying metadata. After discussion, we decided every-other-week meetings might be best at first. We agreed to try this and report back on how helpful this is, and if there are suggestions that will improve this approach. Mary will set up the meetings.
  • 5 subjects a day:
    • Remediation of the old subjects has fallen by the wayside; only Vanessa has been actively involved (apart from the names as subjects? That went to Mary). Vanessa suggested we try repairing 5 a day and Corinne and Claire volunteered.
  • Perkins Problem:
    • Finding aid entries for Perkins Photos are problematic from 249-308, and potentially we are missing items 309-409. An illustrated letter is part of the problem. Jody asked for a volunteer to help sort out the issues, and finding none, is drafting Alissa to assist (since she was absent).
  • Wiki findability:
    • Corinne pointed out that the search box is not terribly helpful. Jody suggested we build index pages for the links we use most, and that we coordinate. Either we have our own resource pages under Employee Resources with the links we use personally, or we can group them by type, such as metadata or digitization. Please coordinate…
  • Project Management Software:
    • Nothing is cast in stone. We’re still exploring the extent to which project management software is useful. Jody hopes it will keep us from forgetting projects, and help track who’s doing what. It’s up to us at this point how much we need to use it, though at some point there will be some requirements for project reporting. Please review Asana and see what you think, if you have time.
  • Management style and team approach:
    • Everyone brings different strengths and weaknesses to the table, different perspectives and ideas. We can accomplish more together than anyone can alone. Everyone is deserving of consideration and respect. Please share your expertise and your ideas for how we can improve, and seek to learn whatever will help us move forward as a team.

Metadata & DS 5/24/16

1) Names with nicknames: should be added to the MADS database like so, if not in VIAF: lastName, firstName (nickname)

2) XSL work: We need XSL modified to display hierarchicalGeographic tags and incorporate them into OAI. Corinne volunteered to help Vanessa, and Jody will support as needed.

3) What’s a batch? At this point, a batch reflects 600-800 scans; the size is limited by the time spent in quality control during digitization. It is no longer constrained by metadata. Those remediating metadata do NOT need to separate spreadsheets into batches that match previous batches, but they DO need to name their spreadsheet exports with batch numbers that FOLLOW all existing batch numbers, to avoid overwriting what’s in the archive.

4) Columns vs. Tagging: After discussion, we agreed that if separating something out (subject parts, name roles, etc.) requires more than 2-4 columns in the spreadsheet, we’d rather tag the entries and simplify the spreadsheet.

5) Tagging names: (since we have multiple columns for names now, by role)

  • Vanessa will update the relator terms list in Administrative/scripts/makeMods to include all name columns besides creator
  • Tagging instructions will be updated in the wiki
  • Jody will modify makeMods to support tagging of names, so they need only be entered in the Creator(s) column.
  • Everyone will test
  • Potentially we can then simplify our spreadsheets moving forward (fewer columns)

6) Subject Locations, geographic subjects, and Roland Harper Photos

  • Those creating/modifying metadata will start entering geographic subjects in the Subject Location(s) column in TGN format instead of in the LCSH Subjects column, and will tag them for HierarchicalGeographic, so they can be used in faceting.
  • Jody will send Vanessa her current understanding of the difference between area and region, and Vanessa will research and share final decisions with the group, so we can be consistent in tagging
  • Jody will modify a script used to test automation of geographic subjects into hierarchicalGeographic to generate modified Subject Location entries for the Roland Harper Photos
  • Claire, Celeste, and possibly others will review and correct these (to avoid having to do them all by hand)
  • Claire has volunteered to help Celeste with the Roland Harper Photos (other volunteers welcome!!)

7) Logs: If the metadata is already completed: if digitizing a large collection or batch, please add the log information onto the metadata spreadsheet, then export when done. For small ones, just use the log template located in the Templates directory (S:\\Digital Projects\Administrative\Templates)

8) Metadata creators setup:

  • Alissa and Claire will add a collection ID column to the Queue page of the Selection spreadsheet.
  • Metadata creators moving an entry from “in Process” to the Queue page will add a collection ID.
  • Vanessa will start a wiki page on the process of collection setup, which Corinne will review (I hope) so we can have clear, agreed-upon instructions.
  • Claire and Alissa will work on getting at least part of the database/setup script ready for use
  • Claire will add instructions for use to the wiki and share with people
  • Corinne will teach people about collection XML and collection setup. If you’re creating metadata and don’t know this: connect with Corinne.
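
The batch-numbering rule in item 3 (new exports must use numbers that follow every existing batch number, so nothing in the archive is overwritten) can be sketched as below. This is an illustration only, not part of our export tooling. The function names and the `export_batch8.txt` file name are made up, and how existing batch numbers are looked up is left out because it depends on the archive layout.

```python
import os
import tempfile


def next_batch_number(existing):
    """Return the number that follows every batch number already used."""
    return max(existing, default=0) + 1


def write_export(path, text):
    """Write an export, failing rather than replacing an existing file."""
    # Mode "x" raises FileExistsError if the path is already present.
    with open(path, "x", encoding="utf-8") as handle:
        handle.write(text)


with tempfile.TemporaryDirectory() as folder:
    number = next_batch_number([1, 2, 7])
    # Hypothetical file name; real exports follow our own naming rules.
    target = os.path.join(folder, f"export_batch{number}.txt")
    write_export(target, "first export\n")
    try:
        write_export(target, "second export\n")
        overwritten = True
    except FileExistsError:
        overwritten = False

print(number, overwritten)  # prints: 8 False
```

Opening the file in exclusive-create mode is a second line of defense: even if a batch number is reused by mistake, the existing export is left untouched.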

Based on our discussion today, here are my proposed goals for us for June:

  • Complete Wade Hall Red Carpet Request
  • Continue donor letter digitization Red Carpet Request
  • Complete Wiggins Red Carpet Request
  • Finalize workflow/database script in Perl
  • Complete 6-month reviews for new staff
