4.1 Traditional Preservation and the Importance of Media
The classic paradigm for library preservation features media that will endure for long time spans. Libraries have sought to persuade publishers to print on acid-free papers so that physical books will last for centuries. Library preservation programs have copied materials onto microfilm of a particular physical type in order to create a human readable copy that aging tests suggest will last for more than 100 years. In the face of this paradigm, it is no wonder that the discussion of electronic reformatting has produced concern within the library preservation community.
4.2 Electronic Reformatting of Audio and Video
Electronic reformatting has been an inescapable part of audio and video preservation programs for several decades and, in this arena, there is no such thing as a human-readable copy. Deteriorating or endangered sound and moving image recordings have been copied from one tape to another in order to keep the underlying content alive through time. In the face of the seemingly short life spans of tape media, preservation professionals have examined various optical disk media with interest, only to discover that the longer life span of the media may not be matched by the life span of the hardware and systems required to recover the stored content. The experience of reformatting in this general context has led specialists in the field to seek alternate models for reformatting and preservation copying.
The Library assessed the preservation of video materials throughout the nation and published its findings in A Study of the Current State of American Television and Video Preservation (Library of Congress: Washington, 1997; ISBN: 0-8444-0946-4). In another Library of Congress study, the consultant William D. Storm outlined ways in which the migration of audio and video materials into a computer-data environment could address the problems associated with conventional reformatting of these original formats. Storm's report is titled Unified Strategy for the Preservation of Audio and Video Materials (Preservation Research and Testing Series No. 9806; Aug. 1997; rev. Nov. 1998), available from:
Research and Testing Office
Preservation Directorate
Library of Congress
101 Independence Ave. SE
Washington DC 20540-4560.
4.3 The Preservation of Digital Information
The paradigm for the preservation of computer information has a very low dependency on media. Of course, each computer file must be stored on media of some kind. But the computer specialist's assumptions include the idea of short-lived systems and media and the correlative idea that content must be migrated to new systems and media in a routine and systematic manner. The computer specialist also understands that some complex data forms may not be migratable and, in order to continue their use, the obsolescent or obsolete systems or environments in which they function must be emulated.
The family of thorny issues surrounding the preservation of digital information has received considerable comment in recent years. One statement (which cites other relevant work) is Jeff Rothenberg's Avoiding Technological Quicksand: Finding a Viable Technical Foundation for Digital Preservation (report to the Council on Library and Information Resources, January 1999, ISBN 1-887334-63-7). Information about this and other reports may be found at the Council on Library and Information Resources (CLIR) website.
At this time, the Library of Congress is not seeking digital solutions for all of its preservation work. Interest is high, however, in promising areas like printed matter reformatting and--relevant here--magnetically recorded audio and video. Many tape media and formats traditionally used for audio and video reformatting activities are no longer manufactured. The condition of tape recordings from the 1960s and 1970s has reached a near-crisis state. At the same time, the development of digital technologies for a variety of activities, including broadcasting, suggests that the current environment is conducive to the adoption of computer-based digital reformatting. Thus the Library wishes to explore the computer-data preservation paradigm as it may apply to the preservation of the digital content that emerges in these areas and as may also emerge from projects whose goal is to broaden access to collections. Can this paradigm be refined to a point that inspires confidence in the library community?
It is worth noting that the Library anticipates carrying out a three-pronged approach to audio and video preservation, at least until there is a high level of confidence in computer-style preservation. This triple approach is that (1) the original items will be retained, properly housed, and stored in a suitable environment, (2) conventional "tape-to-tape" copies will continue to be produced, often in analog form, and (3) new computer-digital copies will be made in the manner outlined in this document.
4.4 Preservation and the Prototyping Project
The Library wishes to take advantage of the Prototyping Project to learn more about constructing a computer-data preservation paradigm. The Library understands that the realization of this paradigm in its fullest form will await a multi-year, multi-institution process. The Prototyping Project will explore as many of the following elements as can reasonably be done within the project's compass. The elements include the following:
4.4.1 Persistent Archive Design
The term persistent archive is taken from a talk delivered at the Library of Congress on June 3, 1999, by Reagan Moore from the San Diego Supercomputer Center and the National Partnership for Advanced Computational Infrastructure (NPACI) at the University of California, San Diego. The acronym DICE stands for Data Intensive Computing Environments.
Moore said that his team faced the challenge of finding ways to maintain digital data for hundreds of years in the face of system changes, articulating the challenge in these words: "the technology to instantiate data changes every three years, the technology for data presentation changes every four years, and the technology to archive the collection changes every five years." A set of slides (also in PDF format) that illustrated Moore's talk has been made accessible on the WWW.
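The arithmetic behind Moore's statement can be sketched in a few lines of Python. This is only an illustration: the cycle lengths come from the quotation, while the 100-year horizon and the function name technology_changes are assumptions chosen for the example.

```python
# Change cycles in years, taken from the quotation above.
CYCLE_YEARS = {
    "data instantiation": 3,
    "data presentation": 4,
    "collection archiving": 5,
}


def technology_changes(horizon_years):
    """Count the full change cycles each technology completes within the horizon."""
    return {name: horizon_years // cycle for name, cycle in CYCLE_YEARS.items()}


if __name__ == "__main__":
    # The 100-year horizon is an assumed example value, not a figure from the talk.
    for name, count in technology_changes(100).items():
        print(f"{name}: about {count} changes in 100 years")
```

Even under these rough assumptions, a collection kept for a century would pass through dozens of technology changes, which is why the persistent archive is designed around migration and not around any single system.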
Related home page: http://www.npaci.edu/DICE/
Address: the NPACI postal address is listed at the end of this document.
Related document: Report on Collection Based Persistent Archives
Moore's description of a persistent archive or repository includes the following key features:
4.4.2 Backing Up Repository Content
Most computer backup systems copy what is on disk to tape and sometimes other media. Generally speaking, what is on disk takes the form of files. In the current CNRI repository (see Attachment 2), the system relies on typical backup systems to make copies of the files it contains (both digital objects and repository software). These copies are made from disk to tape media.
Thus the repository saves its "state" on appropriate media. If the repository were catastrophically lost, one could restore from the media and have the state as of the last backup, just like most software systems. This would include both the repository software and the digital objects that are stored in it. The restoration would take place in the same environment as before, using the same repository software. Backup data in this typical scheme is critically important but is not data that can easily be moved or migrated to a different environment or system.
4.4.3 Archiving Digital Content
To archive digital content is to produce a copy that is capable of being migrated to a new system or environment, as well as a copy that is capable of being refreshed, e.g., as one nears the end of the life-span of the media upon which the archival copy is recorded. With large digital stores, the rate of data transfer may be an issue: if an obsolete system is failing, is there enough time to "re-archive" all of its content to a new system? What is needed is an approach to the life-cycle management of digital information.
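The data-transfer question raised above lends itself to a rough calculation. The following Python sketch is illustrative only: the store size, sustained transfer rate, and time window are hypothetical values, and the function names are invented for this example.

```python
SECONDS_PER_DAY = 86_400


def transfer_days(store_terabytes, megabytes_per_second):
    """Days needed to copy an entire store at a sustained transfer rate."""
    # Decimal units are assumed: 1 TB = 1,000,000 MB.
    total_megabytes = store_terabytes * 1_000_000
    return total_megabytes / megabytes_per_second / SECONDS_PER_DAY


def can_rearchive(store_terabytes, megabytes_per_second, days_available):
    """True if the whole store can be copied before the time window closes."""
    return transfer_days(store_terabytes, megabytes_per_second) <= days_available


if __name__ == "__main__":
    # Hypothetical figures: a 50 TB store copied at a sustained 10 MB/s.
    days = transfer_days(50, 10)
    print(f"Full copy takes about {days:.1f} days")
    print("Fits in a 30-day window:", can_rearchive(50, 10, 30))
```

A real plan would also have to allow for verification passes, media handling, and interruptions, so an estimate of this kind is best read as a lower bound.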
There have been a variety of informal statements within the digital library community concerning the archiving of digital content. For example, staff at the University of California, Berkeley, have used archival repository as a contrasting term to access repository. The former is designed to preserve the objects it contains while the latter is structured to facilitate access to or the presentation of the objects it contains. Other specialists have referred to archival digital objects in contradistinction to digital objects, with the same intent as the distinction between archival and access repositories.
The use of the word preservation in the name Universal Preservation Format suggests the UPF group's interest in archiving. In their document titled Universal Preservation Format: Part 1: User Requirements, Thom Shepard and Dave MacCarn offer a very rich description of an archival object:
The Prototyping Project provides the Library with an opportunity to construct an archival object, albeit probably not one as rich as the UPF object. The options include at least the following:
Although media in this context does not have the same vital significance as in the classic preservation paradigm--it need not last for decades--it is still important. One might consider the media used for digital archiving as a "holding" medium that must have a reliable life greater than that of the systems that read it. There can be no firm statement of this duration but one might safely plan in terms of, say, a decade, on the assumption that obsolescence will overtake any computer system in less than a decade.
4.4.3.1 Archival Objects as an Option for Repository Interoperability or Exchange
A well designed archival digital object may also function as an exchange object or as the communications form of the digital object. This idea is noted here not because the Prototyping Project will undertake to exchange objects with other repositories but because the idea offers an additional slant on the potential definition for an archival digital object.
Nationally and internationally, the library community has high interest in the interoperation of repositories, such as those under development within the Digital Library Federation (DLF).
The most interesting form of interoperation is interoperation for access. The ideal expression of this form of interoperation would empower a user to discover and access digital resources held by a variety of institutions. A related activity, however, is the exchange of digital objects, in which one organization makes a copy of an object available to a second organization to deposit (load) in its repository.
4.4.4 Preservation Program Metadata
The metadata associated with the digital object should also record the special information needed by those who manage a library's preservation programs. Examples of these types of data are listed in documentation of metadata captured in the Library's Coolidge-Consumerism Experiment. This text references documentation pertaining to preservation program information available from the Research Libraries Group, Cornell University, and the University of California, Berkeley.
4.4.4.1 Preservation Programs and Metadata Traditionally Captured
Preservation programs in libraries and archives oversee the institution's policies and practices regarding all forms of preservation. The mission statement of the Preservation Directorate at the Library of Congress states that the office will "assure long-term, uninterrupted access to the intellectual content of the Library's collections . . . . directly through the provision of conservation, binding and repair, reformatting, materials testing, and staff and user education; and indirectly through coordinating and overseeing all Library-wide activities relating to the preservation and physical protection of Library material."
The Library of Congress Preservation Directorate participates in the development of national and international standards and guidelines, e.g., for the practices used when microfilming. When materials are treated at the Library, e.g., reformatted by microfilming, the staff ensures that appropriate records are kept and that information about the actions taken is communicated to other libraries and archives. For example, a preservation microfilm must contain information--typically in the form of text pages and targets--that describes the materials represented and offers technical references, like resolution targets. In addition, bibliographic or other descriptive information is updated or created to communicate what has been done. The digital-object metadata described in the preceding section is comparable to the information traditionally recorded in the course of "analog" preservation.
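As a minimal sketch of how such preservation program information might travel with a digital object, the following Python example models a record that describes the source material, lists technical references, and logs the actions taken. All class names, field names, and sample values are hypothetical; they are not drawn from the Library's metadata documentation or from any of the institutions named above.

```python
from dataclasses import dataclass, field
from datetime import date


@dataclass
class PreservationAction:
    """One treatment applied to an item (hypothetical structure)."""
    action: str                  # e.g. "digital reformatting"
    performed_on: date
    guideline: str = ""          # standard or guideline followed, if any


@dataclass
class PreservationRecord:
    """Preservation program metadata kept alongside a digital object."""
    source_description: str                                   # what material is represented
    technical_references: list = field(default_factory=list)  # e.g. test targets, reference tones
    actions: list = field(default_factory=list)

    def log(self, action):
        """Append an action so later staff can see what was done and when."""
        self.actions.append(action)

    def latest_action(self):
        """Return the most recent action by date, or None if nothing is logged."""
        return max(self.actions, key=lambda a: a.performed_on, default=None)


if __name__ == "__main__":
    # Sample values are invented for illustration.
    record = PreservationRecord(
        source_description="Example open-reel audio tape",
        technical_references=["reference tone"],
    )
    record.log(PreservationAction("tape-to-tape copy", date(1999, 5, 1)))
    record.log(PreservationAction("digital reformatting", date(1999, 10, 19)))
    print(record.latest_action().action)
```

The record plays the same role as the text pages and targets on a preservation microfilm: it tells a later reader what the copy represents and how it was made.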
Address for NPACI (see 4.4.1 Persistent Archive Design):
NPACI
UC San Diego, MC 0505
9500 Gilman Drive
La Jolla, CA 92093-0505
619-534-5000 [fax: 619-534-5152]
info@npaci.edu
(10/19/99, rev 4/2/01)