RFQ Attachment 4
Digital Audio-Video Repository System Support
Attachments to an RFQ


This is one of seven attachments to the Library's Request for Quotes in a limited competition to select a vendor to develop a prototype digital repository. The prototype will test potential approaches for the preservation of recorded sound and moving image collections. The attachments provide a sketch of the Library's concepts, circumstances, and proposed actions as of July 1999, and prospective vendors were asked to discuss these ideas in their proposals.
4. PRESERVATION ISSUES


Traditional Preservation and the Importance of Media
Electronic Reformatting of Audio and Video
The Preservation of Digital Information
Preservation and the Prototyping Project

4.1 Traditional Preservation and the Importance of Media

The classic paradigm for library preservation features media that will endure for long time spans. Libraries have sought to persuade publishers to print on acid-free papers so that physical books will last for centuries. Library preservation programs have copied materials onto microfilm of a particular physical type in order to create a human readable copy that aging tests suggest will last for more than 100 years. In the face of this paradigm, it is no wonder that the discussion of electronic reformatting has produced concern within the library preservation community.

4.2 Electronic Reformatting of Audio and Video

Electronic reformatting has been an inescapable part of audio and video preservation programs for several decades and, in this arena, there is no such thing as a human-readable copy. Deteriorating or endangered sound and moving image recordings have been copied from one tape to another in order to keep the underlying content alive through time. In the face of the seemingly short life spans of tape media, preservation professionals have examined various optical disk media with interest, only to discover that the longer life span of the media may not be matched by the life span of the hardware and systems required to recover the stored content. The experience of reformatting in this general context has led specialists in the field to seek alternate models for reformatting and preservation copying.

The Library assessed the preservation of video materials throughout the nation and published its findings in A Study of the Current State of American Television and Video Preservation (Library of Congress: Washington, 1997; ISBN: 0-8444-0946-4). In another Library of Congress study, the consultant William D. Storm outlined ways in which the migration of audio and video materials into a computer-data environment could address the problems associated with conventional reformatting of these original formats. Storm's report is titled Unified Strategy for the Preservation of Audio and Video Materials (Preservation Research and Testing Series No. 9806; Aug. 1997; rev. Nov. 1998), available from:

Research and Testing Office
Preservation Directorate
Library of Congress
101 Independence Ave. SE
Washington DC 20540-4560.

4.3 The Preservation of Digital Information

The paradigm for the preservation of computer information has a very low dependency on media. Of course, each computer file must be stored on media of some kind. But the computer specialist's assumptions include the idea of short-lived systems and media and the correlative idea that content must be migrated to new systems and media in a routine and systematic manner. The computer specialist also understands that some complex data forms may not be migratable and, in order to continue their use, the obsolescent or obsolete systems or environments in which they function must be emulated.

The family of thorny issues surrounding the preservation of digital information has received considerable comment in recent years. One statement (which cites other relevant work) is Jeff Rothenberg's Avoiding Technological Quicksand: Finding a Viable Technical Foundation for Digital Preservation (report to the Council on Library and Information Resources, January 1999, ISBN 1-887334-63-7). Information about this and other reports may be found at the Council on Library and Information Resources (CLIR) website.

At this time, the Library of Congress is not seeking digital solutions for all of its preservation work. Interest is high, however, in promising areas like printed matter reformatting and--relevant here--magnetically recorded audio and video. Many tape media and formats traditionally used for audio and video reformatting activities are no longer manufactured. The condition of tape recordings from the 1960s and 1970s has reached a near-crisis state. At the same time, the development of digital technologies for a variety of activities, including broadcasting, suggests that the current environment is conducive to the adoption of computer-based digital reformatting. Thus the Library wishes to explore the computer-data preservation paradigm as it may apply to the preservation of the digital content that emerges in these areas and that may also emerge from projects whose goal is to broaden access to collections. Can this paradigm be refined to a point that inspires confidence in the library community?

It is worth noting that the Library anticipates carrying out a three-pronged approach to audio and video preservation, at least until there is a high level of confidence in computer-style preservation. This triple approach is that (1) the original items will be retained, properly housed, and stored in a suitable environment, (2) conventional "tape-to-tape" copies will continue to be produced, often in analog form, and (3) new computer-digital copies will be made in the manner outlined in this document.

4.4 Preservation and the Prototyping Project

The Library wishes to take advantage of the Prototyping Project to learn more about constructing a computer-data preservation paradigm. The Library understands that the realization of this paradigm in its fullest form will await a multi-year, multi-institution process. The Prototyping Project will explore as many of the following elements as can reasonably be done within the project's compass:

4.4.1 Persistent Archive Design

The term persistent archive is taken from a talk delivered at the Library of Congress on June 3, 1999, by Reagan Moore from the San Diego Supercomputer Center and the National Partnership for Advanced Computational Infrastructure (NPACI) at the University of California, San Diego. The acronym DICE, which appears in the address of the related home page listed below, stands for Data Intensive Computing Environments.

Moore said that his team faced the challenge of finding ways to maintain digital data for hundreds of years in the face of system changes, articulating the challenge in these words: "the technology to instantiate data changes every three years, the technology for data presentation changes every four years, and the technology to archive the collection changes every five years." A set of slides (also in PDF format) that illustrated Moore's talk has been made accessible on the WWW.
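To give a sense of scale for the challenge Moore describes, the following is a rough arithmetic sketch in Python. It treats the three cycle lengths in the quotation as fixed periods and assumes a 100-year retention horizon; that horizon is an illustrative assumption, not a figure from Moore's talk, and the function and variable names are hypothetical.

```python
# Rough sketch: how many technology changes a collection must survive.
# The 100-year default horizon is an assumed planning figure, not from the talk.

# Cycle lengths in years, taken from the quotation above.
CYCLES = {
    "instantiate data": 3,
    "present data": 4,
    "archive the collection": 5,
}


def generations(horizon_years: int, cycle_years: int) -> int:
    """Number of complete technology changes within the horizon."""
    return horizon_years // cycle_years


def summarize(horizon_years: int = 100) -> dict:
    """Map each technology layer to the number of changes it undergoes."""
    return {
        name: generations(horizon_years, years)
        for name, years in CYCLES.items()
    }


if __name__ == "__main__":
    for name, count in summarize().items():
        print(f"technology to {name}: about {count} changes in 100 years")
```

Under these assumptions a collection kept for a century would pass through roughly 33, 25, and 20 changes in the three layers, which is why migration has to be routine rather than exceptional.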

Related home page: http://www.npaci.edu/DICE/

Address:
NPACI
UC San Diego, MC 0505
9500 Gilman Drive
La Jolla, CA 92093-0505
619-534-5000 [fax: 619-534-5152]
info@npaci.edu

Related document: Report on Collection Based Persistent Archives

Moore's description of a persistent archive or repository includes the following key features:

4.4.2 Backing Up Repository Content

Most computer backup systems copy what is on disk to tape and sometimes other media. Generally speaking, what is on disk takes the form of files. In the current CNRI repository (see Attachment 2), the system relies on typical backup systems to make copies of the files it contains (both digital objects and repository software). These copies are made from disk to tape media.

Thus the repository saves its "state" on appropriate media. If the repository were catastrophically lost, one could restore from the media and have the state as of the last backup, just as with most software systems. This would include both the repository software and the digital objects that are stored in it. The restoration would take place in the same environment as before, using the same repository software. Backup data in this typical scheme is critically important but is not data that can easily be moved or migrated to a different environment or system.
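The backup-and-restore cycle described above can be illustrated with a minimal Python sketch. It is not the CNRI repository's actual backup mechanism: the directory layout, file names, and function names are hypothetical, and an ordinary directory stands in for the tape media.

```python
import shutil
import tempfile
from pathlib import Path


def snapshot(root: Path) -> dict:
    """Record every file under root as {relative path: bytes}."""
    return {
        p.relative_to(root).as_posix(): p.read_bytes()
        for p in root.rglob("*")
        if p.is_file()
    }


def backup(repository: Path, media: Path) -> None:
    """Copy the repository's whole state (software and objects) as files."""
    shutil.copytree(repository, media)


def restore(media: Path, repository: Path) -> None:
    """Rebuild the repository as it was at the last backup."""
    shutil.copytree(media, repository)


def demo() -> bool:
    """Back up, lose, and restore a toy repository; True if state matches."""
    with tempfile.TemporaryDirectory() as work:
        repo = Path(work) / "repository"
        tape = Path(work) / "tape"  # a directory standing in for tape media
        (repo / "software").mkdir(parents=True)
        (repo / "objects").mkdir()
        (repo / "software" / "repo.bin").write_bytes(b"repository software")
        (repo / "objects" / "sound-0001.dat").write_bytes(b"digital object")

        state_before = snapshot(repo)
        backup(repo, tape)
        shutil.rmtree(repo)   # simulate a catastrophic loss
        restore(tape, repo)
        return snapshot(repo) == state_before


if __name__ == "__main__":
    print("state restored:", demo())
```

The sketch recovers the files byte for byte, but nothing in the copy describes the objects independently of the repository software, which is the limitation the paragraph above points out.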

4.4.3 Archiving Digital Content

To archive digital content is to produce a copy that is capable of being migrated to a new system or environment, as well as a copy that is capable of being refreshed, e.g., as one nears the end of the life-span of the media upon which the archival copy is recorded. With large digital stores, the rate of data transfer may be an issue: if an obsolete system is failing, is there enough time to "re-archive" all of its content to a new system? What is needed is an approach to the life-cycle management of digital information.
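The data-transfer question raised above can be put in rough numbers. The Python sketch below estimates how long a complete "re-archive" would take at a sustained transfer rate; the store size, rate, and time window used in the example are hypothetical figures chosen only for illustration, and decimal units (1 TB = 1,000,000 MB) are assumed.

```python
SECONDS_PER_DAY = 86_400


def transfer_days(store_terabytes: float, rate_mb_per_second: float) -> float:
    """Days of continuous copying needed to move the whole store."""
    megabytes = store_terabytes * 1_000_000  # decimal units: 1 TB = 10**6 MB
    seconds = megabytes / rate_mb_per_second
    return seconds / SECONDS_PER_DAY


def can_rearchive(store_terabytes: float,
                  rate_mb_per_second: float,
                  days_available: float) -> bool:
    """True if the full copy fits inside the remaining life of the old system."""
    return transfer_days(store_terabytes, rate_mb_per_second) <= days_available


if __name__ == "__main__":
    # Hypothetical example: a 100 TB store copied at a sustained 10 MB/s.
    days = transfer_days(100, 10)
    print(f"about {days:.0f} days of continuous transfer")
    print("fits in a 90-day window:", can_rearchive(100, 10, 90))
```

With these assumed figures the copy takes roughly 116 days, so a 90-day window would not be enough; the estimate also ignores verification passes and downtime, which would lengthen it.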

There have been a variety of informal statements within the digital library community concerning the archiving of digital content. For example, staff at the University of California, Berkeley, have used archival repository as a contrasting term to access repository. The former is designed to preserve the objects it contains while the latter is structured to facilitate access to or the presentation of the objects it contains. Other specialists have referred to archival digital objects in contradistinction to digital objects, with the same intent as the distinction between archival and access repositories.

The use of the word preservation in the name Universal Preservation Format suggests the UPF group's interest in archiving. In their document titled Universal Preservation Format: Part 1: User Requirements, Thom Shepard and Dave MacCarn offer a very rich description of an archival object:

The Prototyping Project provides the Library with an opportunity to construct an archival object, albeit probably not one as rich as the UPF object. The options include at least the following:

Although media in this context does not have the same vital significance as in the classic preservation paradigm--it need not last for decades--it is still important. One might consider the media used for digital archiving as a "holding" medium that must have a reliable life greater than that of the systems that read it. There can be no firm statement of this duration but one might safely plan in terms of, say, a decade, on the assumption that obsolescence will overtake any computer system in less than a decade.

4.4.3.1 Archival Objects as an Option for Repository Interoperability or Exchange

A well designed archival digital object may also function as an exchange object or as the communications form of the digital object. This idea is noted here not because the Prototyping Project will undertake to exchange objects with other repositories but because the idea offers an additional slant on the potential definition for an archival digital object.

Nationally and internationally, the library community has high interest in the interoperation of repositories, such as those under development within the Digital Library Federation (DLF). The most interesting form of interoperation is interoperation for access. The ideal expression of this form of interoperation would empower a user to discover and access digital resources held by a variety of institutions. A related activity, however, is the exchange of digital objects, in which one organization makes a copy of an object available to a second organization to deposit (load) in its repository.

4.4.4 Preservation Program Metadata

The metadata associated with the digital object should also record the special information needed by those who manage a library's preservation programs. Examples of these types of data are listed in documentation of metadata captured in the Library's Coolidge-Consumerism Experiment. This text references documentation pertaining to preservation program information available from the Research Libraries Group, Cornell University, and the University of California, Berkeley.

4.4.4.1 Preservation Programs and Metadata Traditionally Captured

Preservation programs in libraries and archives oversee the institution's policies and practices regarding all forms of preservation. The mission statement of the Preservation Directorate at the Library of Congress states that the office will "assure long-term, uninterrupted access to the intellectual content of the Library's collections . . . . directly through the provision of conservation, binding and repair, reformatting, materials testing, and staff and user education; and indirectly through coordinating and overseeing all Library-wide activities relating to the preservation and physical protection of Library material."

The Library of Congress Preservation Directorate participates in the development of national and international standards and guidelines, e.g., for the practices used when microfilming. When materials are treated at the Library, e.g., reformatted by microfilming, the staff ensures that there is appropriate record keeping and that information about the actions taken is communicated to other libraries and archives. For example, a preservation microfilm must contain information--typically in the form of text pages and targets--that describes the materials represented and offers technical references, like resolution targets. Bibliographic or other descriptive information is also updated or created to communicate what has been done. The digital-object metadata described in the preceding section is comparable to the information traditionally recorded in the course of "analog" preservation.
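As one way to picture this kind of record keeping for a digital object, the Python sketch below models a preservation record that carries a description of the source material, technical references, and a log of actions taken. Every class and field name is hypothetical; none is drawn from the Coolidge-Consumerism documentation or from the schemas of the institutions cited above.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class PreservationAction:
    """One treatment or reformatting event (hypothetical structure)."""
    date: str        # ISO 8601 date, e.g. "1999-07-01"
    action: str      # e.g. "reformatted"
    agent: str       # office or person responsible
    note: str = ""


@dataclass
class PreservationRecord:
    """Preservation program information attached to a digital object."""
    object_id: str
    source_description: str   # what original material is represented
    technical_references: List[str] = field(default_factory=list)
    actions: List[PreservationAction] = field(default_factory=list)

    def log(self, date: str, action: str, agent: str, note: str = "") -> None:
        """Append an action so later staff can see what was done."""
        self.actions.append(PreservationAction(date, action, agent, note))

    def history(self) -> List[str]:
        """Actions in date order, as short human-readable lines."""
        ordered = sorted(self.actions, key=lambda a: a.date)
        return [f"{a.date}: {a.action} by {a.agent}" for a in ordered]


if __name__ == "__main__":
    record = PreservationRecord(
        object_id="example-0001",
        source_description="open-reel audio tape (illustrative)",
        technical_references=["reference tone at head of transfer"],
    )
    record.log("1999-08-15", "digitized", "Recording Laboratory")
    record.log("1999-07-01", "inspected", "Preservation staff")
    print("\n".join(record.history()))
```

The technical references play the role that resolution targets play on a preservation microfilm, and the action log corresponds to the record of treatments communicated to other libraries and archives.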


(10/19/99, rev 4/2/01)
Library of Congress
Comments: AV Prototype Coordinator (cfle@loc.gov)
( August 31, 2010 )