Making MODS

From UA Libraries Digital Services Planning and Documentation

Latest revision as of 12:23, 16 January 2014

Introduction

This page demonstrates how to create MODS files from an Excel metadata spreadsheet using Archivist Utility.

For information on how to upload MODS, see Uploading MODS.


What You Need

  • Microsoft Excel
  • A solid text editor, such as Notepad++ Portable (a freely available Windows text editor)
  • Archivist Utility.


Steps

  • Open metadata (.xlsx) in Excel and choose SAVE AS.

MODS 1.png

  • Choose the "Text (Tab delimited)" option as shown if the metadata has no diacritics ("foreign" characters) in it, or if you are not sure whether it does.
    • If you know there are diacritics in the metadata, go directly to Diacritics before proceeding.
  • Save the .txt file in a prepared "Metadata" folder in the Digital_Coll_Complete directory.
  • Open the .txt file with your text editor.
    • Do a find and replace to remove all the quotation marks that appeared during the export.
    • Check for diacritics (unless you repaired them beforehand) and follow any applicable instructions on the Diacritics page before continuing.
    • Make sure the file is encoded as UTF-8 rather than ANSI or anything else. (In Notepad++, the encoding is displayed in the bottom right corner of the screen. To change it, go to the Encoding menu and select Encode in UTF-8.)
    • Save the file.
  • Open Archivist Utility.
  • Load the correct MODS template by choosing "Load XML Template".
    • For "old" metadata spreadsheets, this template is MODS-template_7.ds.xml.
    • For m01 metadata spreadsheets, this template is AU_template.m01_5.xml.
    • Both of these templates can be found in S:\Digital Projects\Administrative\forSoftware\ArchivistUtility

MODS 2.png

  • Choose the "Load Data" tab and select your .txt file.
  • Choose the "Update Output" tab to check for errors.
  • Choose the "Log" tab to check for an error report (see below for explanations).
  • If there are no errors, choose "Export All Rows" and place the MODS output in the prepared MODS folder in the Digital_Coll_Complete directory.
  • Check the MODS folder, looking in the 'Size' column to confirm that the MODS files are not empty (0 KB).
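
The text-file cleanup and the final size check above can also be done with a short script. The following is a minimal Python sketch, not part of the documented workflow. The function names are hypothetical, and the default source encoding (cp1252, the usual "ANSI" code page on English-language Windows) is an assumption about how Excel saved the file.

```python
from pathlib import Path


def strip_quotes(text: str) -> str:
    """Remove every double quotation mark, like the find-and-replace step."""
    return text.replace('"', "")


def prepare_export(path: Path, source_encoding: str = "cp1252") -> None:
    """Rewrite an exported .txt file in place: no quotation marks, UTF-8 bytes.

    source_encoding is an assumption (the "ANSI" code page of a typical
    English-language Windows machine); change it if your export differs.
    """
    raw = path.read_bytes()
    try:
        text = raw.decode("utf-8")  # file is already valid UTF-8
    except UnicodeDecodeError:
        # Raises again if the bytes are not valid in the assumed code page.
        text = raw.decode(source_encoding)
    # Bytes are written back untouched apart from the two fixes above,
    # so the original line endings are preserved.
    path.write_bytes(strip_quotes(text).encode("utf-8"))


def empty_files(folder: Path) -> list[Path]:
    """Return the files in folder that are 0 bytes long, sorted by name."""
    return sorted(
        p for p in folder.iterdir() if p.is_file() and p.stat().st_size == 0
    )
```

Calling `prepare_export(Path("Metadata/example.txt"))` (a hypothetical file name) before loading the file, and `empty_files(Path("MODS"))` after exporting, mirrors the manual steps. This sketch does not check for or repair diacritics, so the Diacritics page still applies.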

Error and Warning Messages

invalid XML output, could not tidy output

  • This means you have an issue with a nonprinting character, almost always a diacritic.
  • The row number in the message will be one less than the row's position in the spreadsheet (your spreadsheet has a header row, but Archivist Utility starts counting at the row below the header).
    • Example: "Error (u0003_0000581.txt row 147): invalid XML output, could not tidy output" refers to a problem with spreadsheet line 148.
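
The row arithmetic above can be sketched in Python as follows. The message pattern is taken only from the single example shown on this page, so it is an assumption that other log lines follow the same shape. The function name is hypothetical.

```python
import re

# Pattern based on the one example message on this page (an assumption).
ERROR_PATTERN = re.compile(r"Error \((?P<file>.+?) row (?P<row>\d+)\)")


def spreadsheet_row(error_message: str) -> int:
    """Convert the row number in a log message to the spreadsheet row.

    Archivist Utility starts counting below the header row, so the
    spreadsheet row is the reported row plus one.
    """
    match = ERROR_PATTERN.search(error_message)
    if match is None:
        raise ValueError("message does not look like a row error")
    return int(match.group("row")) + 1
```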

Missing Columns and Unused Columns

are different

This means you're probably using the wrong template.

This is what it looks like when you put an old-form spreadsheet into an m0# template:

Old-ss-form in new-ss-template.jpg

This is what it looks like when you put an m0# spreadsheet into an old-form template:

New-ss-form in old-ss-template.jpg

are almost the same

This means there's probably a misspelling in the column header.

Example: if you've got a Missing Column called "Date Created" and an Unused Column called "DateCreated," that's the same problem: the column was labeled incorrectly in the spreadsheet.
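
When the two lists are long, near-matches like the one above can be paired automatically. This is a rough Python sketch using the standard library's difflib. The function name and the 0.85 similarity threshold are arbitrary assumptions, and the column names have to be copied by hand from the Archivist Utility log.

```python
from difflib import SequenceMatcher


def _normalize(name: str) -> str:
    """Lowercase a header and drop spaces and punctuation."""
    return "".join(ch for ch in name.casefold() if ch.isalnum())


def likely_mislabeled(missing, unused, threshold=0.85):
    """Pair each Missing Column with Unused Columns that look almost the same.

    threshold is an arbitrary cut-off (an assumption); raise it to get
    fewer, stricter matches.
    """
    pairs = []
    for m in missing:
        for u in unused:
            ratio = SequenceMatcher(None, _normalize(m), _normalize(u)).ratio()
            if ratio >= threshold:
                pairs.append((m, u))
    return pairs
```

An empty result suggests the lists really are different, which points to the wrong-template case instead.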

Cannot Update Output

This means that there is likely a diacritic that cannot be translated into XML.

Use Notepad++ and view all characters and symbols to find the diacritic, refer to the Diacritics page, replace the character, and attempt the Making MODS process again.
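
As an alternative to scanning by eye, the sketch below lists every character that is either outside plain ASCII or a control character other than a tab. It is a Python illustration only. The function name is hypothetical, and which characters Archivist Utility actually rejects is not documented here, so treat the hits as candidates to inspect.

```python
import unicodedata


def find_suspect_characters(lines):
    """Return (line number, column, character, Unicode name) for each hit.

    A hit is any non-ASCII character or any control character except a
    tab. Line endings are ignored. Line and column numbers start at 1.
    """
    hits = []
    for line_no, line in enumerate(lines, start=1):
        for col, ch in enumerate(line.rstrip("\r\n"), start=1):
            is_control = unicodedata.category(ch) == "Cc" and ch != "\t"
            if is_control or ord(ch) > 127:
                name = unicodedata.name(ch, "UNNAMED")
                hits.append((line_no, col, ch, name))
    return hits
```

To run it on a file, open the file with `open(path, encoding="utf-8", newline="")` and pass the file object as `lines`. Because line 1 of the text file is the header, the line numbers should match spreadsheet rows as long as no cell contains a line break.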
