University of Alabama Libraries Digital Services
Digital content is inherently fragile. It is easily corrupted, damaged, changed, or deleted.
Hence, access to important digital content must be controlled. If what we are protecting is the historical record, change to the original content must be prevented.
Even running a virus checker across content can change it. Opening a file can change it. Moving a file from one medium to another can change it.
How do we protect and preserve our unique, fragile, historical documents?
How do we make them accessible, both today and in the future? The whole point of preservation is to support long-term access.
Incoming digital content adds another layer of issues to these two questions:
- We may not know who or what has touched this content before we receive it, so we may be unable to guarantee its authenticity
- It is likely not yet in archival format, and may not be of archival quality
- It may not be in formats or on media with which we are familiar, or for which we have hardware or software
- It may contain information that needs to be redacted or controlled, due to intellectual property rights, copyright, privacy issues, computer viruses, or other issues
- It may contain information the donor did not intend for us to have
- We may have little or no information about the content.
Read more about our processes for incoming digital content here: Managing Incoming Digital Content
Incoming digital content which is nominally in archival format, but which is not valid and well-formed (as tested via JHOVE or other appropriate software), will not be retained. Valid, well-formed versions of these files will be retained instead as the preservation copy.
Preservation Plan for Digital Materials
The University of Alabama (UA) Libraries preserve selected digital content for long-term access support. Our highest level of attention and support is given to content selected for digitization from UA Libraries Special Collections. Other research materials are assigned preservation strategies at appropriate levels based on file formats and perceived needs of our designated audience, the faculty and students of the University of Alabama.
The University of Alabama (UA) Libraries DigiPres group will determine the need to normalize or migrate files pending loss of access due to obsolescence. Decisions will be made on a cost/benefit basis with consideration for the needs of our stated audience.
Division of Digital Content
- Level I support is for content digitized in formats and with methods supporting the current archival standards, and for which we have digital rights management permissions and documented access permission. This is our most dedicated level of support. It includes collection of technical and administrative metadata, bit-level preservation, and commitment to migrate content as formats change over the years. An example would be a manuscript collection digitized by Digital Services.
- Level II support is for content which may not have been digitized in currently supported archival formats, but for which The University of Alabama Libraries has committed long term access support, and for which we have digital rights permissions and documented access permissions. An example would be Electronic Theses and Dissertations.
- Level III support is for content which needs to undergo regular change, and hence is not appropriate for inclusion in LOCKSS; however, it is to our benefit to offer bit-level preservation for this content until it needs to change. An example of this would be software necessary for either migration or emulation.
- Level IV support is for content which may not have been digitized in currently supported archival formats, but for which The University of Alabama Libraries has committed short term access support, and for which we have digital rights permissions and documented access permissions. An example would be Undergraduate Research Papers.
- Level V support is for content for which The University of Alabama Libraries has not committed access support, but which is currently managed by Digital Services, and for which we have digital rights permissions. An example would be files digitized at patron request.
|Support Level||Example||Committed to sustain access||Migration Support||Emulation Support||Long Term Retention||Bit-Level Preservation||Annual Review||Local Backups|
|Level I||Manuscript collection digitized by us||Yes||Yes||Yes||Yes||Yes||Yes||Yes|
|Level II||Electronic Theses and Dissertations||Possibly||Possibly||Possibly||Yes||Yes||Yes||Yes|
|Level III||Open source software for rendering archival content||No||No||No||No||Yes||Yes||Yes|
|Level IV||Undergraduate Research Papers||No||No||No||No||No||Yes||Yes|
|Level V||Material digitized at patron request||No||No||No||No||No||No||Yes|
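The support matrix above can also be expressed as a small lookup structure, which may be convenient for scripts that need to decide which services apply to a given item. The following is a minimal sketch only: the class name, field names, and helper function are hypothetical and are not part of any UA Libraries system; the values are transcribed from the table.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SupportLevel:
    """One row of the support-level table (field names are illustrative)."""
    example: str
    sustain_access: str          # "Yes", "No", or "Possibly"
    migration: str               # "Yes", "No", or "Possibly"
    emulation: str               # "Yes", "No", or "Possibly"
    long_term_retention: bool
    bit_level_preservation: bool
    annual_review: bool
    local_backups: bool


# Values transcribed from the table; the dictionary itself is hypothetical.
SUPPORT_LEVELS = {
    "I": SupportLevel("Manuscript collection digitized by us",
                      "Yes", "Yes", "Yes", True, True, True, True),
    "II": SupportLevel("Electronic Theses and Dissertations",
                       "Possibly", "Possibly", "Possibly", True, True, True, True),
    "III": SupportLevel("Open source software for rendering archival content",
                        "No", "No", "No", False, True, True, True),
    "IV": SupportLevel("Undergraduate Research Papers",
                       "No", "No", "No", False, False, True, True),
    "V": SupportLevel("Material digitized at patron request",
                      "No", "No", "No", False, False, False, True),
}


def levels_with(service: str) -> list[str]:
    """Return the levels whose boolean field `service` is True."""
    return [name for name, level in SUPPORT_LEVELS.items()
            if getattr(level, service) is True]
```

For example, `levels_with("bit_level_preservation")` lists Levels I through III, matching the Bit-Level Preservation column.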
Committed to sustain access
Every feasible effort will be made to continue access to this content. This may involve migration to new formats, or development and maintenance of emulation methods. This level of institutional commitment can only be made for content created in current archival format standards. Content not created in current archival standards is much less likely to be migratable to new formats. However, if the content continues to be of value, and either migration is feasible and retains the significant properties of the content or emulation support is feasible, then continued access will be supported.
- Formats of archival files and versions of metadata will be recorded in a flat text file stored on the top layer of the file system. This file is exported regularly from the database in which all entries to the storage system are recorded, and which is monitored regularly for format or metadata migration requirements.
- Descriptive, administrative, and provenance metadata will be stored in current schemas and formats in the file system as specified.
- Technical metadata will be extracted from archival files and formatted for storage into appropriate schemas (local profiles are currently under development for drawing from standards such as TextMD for text and AudioMD for audio). See Image Technical Metadata for profile and workflows.
- Open-source software which renders the current archival format, if available, will be stored in the archive. This will enable migration to newer file formats after the current ones become obsolete.
- A copy of an open-source operating system which supports the open-source software, if available and feasible, will be stored in the archive.
- Software and documentation necessary for emulation (recreation of the current user experience of our delivery system) will be stored in the archive.
- File system information which enables emulation of the operating system to support the file system will be stored with the content.
- In addition to the migration support above, open-source software needed for creating derivatives and providing web delivery may be stored in the archive.
- Documentation of current procedures for recreating the current online user experience may be stored in the archive.
- MD5 checksum scripts will run before each tape backup to verify content is not corrupt, and will notify the repository administrator of any errors. Backup copies of current checksums are stored on a separate server, and scripts on a third separate server verify checking scripts run as scheduled and without error.
- We are and will continue to be involved in LOCKSS or a similar preservation network, supporting at least 6 copies of the archival content across a geographically dispersed area. All archival content will be made available to this system.
- Prior to obsolescence, all content will be evaluated for preservation measures, which may involve either migration (reformatting) or emulation. Depending on the decisions reached and on the availability of resources and viable migration/emulation methods, efforts will be made to continue accessibility. All preservation measures taken will be recorded.
- If continued accessibility is deemed unfeasible or advised against, online access will end, and stored content and metadata will be deleted.
- Obsolescence, as used in these statements, is the point at which the approved computer systems and software on the University of Alabama Library computers can no longer provide viable access to the content in the file without emulation services.
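The MD5 fixity check described in the list above (run before each tape backup, with errors reported to the repository administrator) could be sketched roughly as follows. This is an illustration, not the scripts actually in use: the manifest format (one `<md5>  <relative path>` pair per line) and the function names are assumptions, and the notification step is left to the caller.

```python
import hashlib
from pathlib import Path


def md5_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Compute the MD5 hex digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def load_manifest(manifest: Path) -> dict[str, str]:
    """Parse lines of the form '<md5>  <relative path>' (assumed format)."""
    expected = {}
    for line in manifest.read_text(encoding="utf-8").splitlines():
        if line.strip():
            checksum, name = line.split(maxsplit=1)
            expected[name.strip()] = checksum.lower()
    return expected


def verify(root: Path, manifest: Path) -> list[str]:
    """Return a list of problems; an empty list means every file matched."""
    problems = []
    for name, checksum in sorted(load_manifest(manifest).items()):
        target = root / name
        if not target.is_file():
            problems.append(f"MISSING {name}")
        elif md5_of(target) != checksum:
            problems.append(f"MISMATCH {name}")
    return problems
```

A backup job would call `verify()` first and proceed only if the returned list is empty; otherwise the list can be sent to the repository administrator. Keeping a second copy of the manifest on a separate server, as described above, guards against the manifest itself being altered along with the content.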
Long Term Retention
- Digital content will be named according to our file naming scheme and organized according to our file storage scheme on our storage server.
- The storage system is covered by a weekly full backup and daily differential backups. The weekly full backup is duplicated, and a copy is sent offsite, with at least a two-month rolling backup schedule.
- Up to 2 versions of descriptive metadata will be stored, the original and the most recent. Captures will be made quarterly from the delivery software web directories. If the metadata found there is more recent than what is stored, it will be placed in the archive. Version 2 of each metadata file will be overwritten with each new capture.
- MIX and FITS metadata are generated for all TIFF files. If the image is not valid, not well-formed, or not a TIFF, this will be versioned and the item marked for repair, as it makes no sense to archive anything that will not be usable in the future. Some technical and administrative metadata is also logged in a database, such as format, format version, MIME type, format registry, format registry key, whether conflicts were found, test date, and whether a thumbnail is included in the file.
FITS and format-appropriate technical metadata will also be stored for other types of content.
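The two-version rule for descriptive metadata described above (keep the original, and overwrite a single "most recent" copy whenever a quarterly capture finds something newer) can be illustrated with a short sketch. The `.v1`/`.v2` file naming, the function name, and the use of file modification times to decide what is "more recent" are all assumptions made for the example; they do not reflect the actual file naming or storage schemes.

```python
import shutil
from pathlib import Path


def capture_metadata(live: Path, archive_dir: Path) -> str:
    """Apply the two-version rule to one metadata file.

    Returns "stored-original", "updated-latest", or "unchanged".
    """
    archive_dir.mkdir(parents=True, exist_ok=True)
    # Hypothetical naming: item.xml -> item.v1.xml (original), item.v2.xml (latest)
    original = archive_dir / f"{live.stem}.v1{live.suffix}"
    latest = archive_dir / f"{live.stem}.v2{live.suffix}"

    if not original.exists():
        # First capture: this copy becomes the permanent original.
        shutil.copy2(live, original)
        return "stored-original"

    # Compare against the newest copy already held in the archive.
    newest_stored = latest if latest.exists() else original
    if live.stat().st_mtime > newest_stored.stat().st_mtime:
        # Only version 2 is ever overwritten; version 1 is never touched.
        shutil.copy2(live, latest)
        return "updated-latest"
    return "unchanged"
```

`shutil.copy2` preserves the source file's modification time, so a later capture of an unmodified file compares as equal and is reported as "unchanged".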
Our current preservation network is the Alabama Digital Preservation Network.
- Watching Our Backs (Database and other information)
- Organization of completed content for long-term storage
- File Naming and Linking for LOCKSS
10:57, 30 July 2014 (CDT)