Rescuing the Tangible From the Intangible

They’re the red-headed stepchildren of the digital age. They’re neither retro chic (all things being relative, of course) like the server arrays that support “big data,” nor are they as cute as the thumb drives made to look like your favorite Star Wars character (or more oddly, chicken feet).

Compact Discs in the Geography and Maps Division. Photo Credit: Butch Lazorchak

Of what do I speak? The lowly compact disc, of course. Its demise has been predicted since at least 2007, but it trudges on, becoming less and less the medium of choice for digital audio distribution (though sadly, still one of your only choices for CD-quality digital audio) but hanging on as a small-scale distribution and storage container for digital data of other sorts.

While some of you may have stopped purchasing CDs for your own listening pleasure, the Library of Congress continues to collect them in huge quantities.

For example, according to Rene Sayles, a Library Technician in the Geography and Maps division, over 700 new CDs come in every month from just one source: the National Geospatial-Intelligence Agency (and especially its Controlled Image Base series).

In addition to this cascade of new items, the Library has droves of valuable CD-only historic material already in its possession, with even more material stored on really endangered storage media such as 5¼- or 3½-inch floppy discs, zip discs, mini DVDs, digital audio tape, digital linear tapes and many more. A rough estimate is that the Library has more than 300 terabytes of data stored on these devices, with the potential for it all to be locked up when players for them no longer exist.

Library Technician Rene Sayles operating the Ripstation. Photo Credit: Butch Lazorchak

Sayles is part of a noteworthy project at the Library called the Tangible Media Project. TMP is working to get valuable Library digital collections off rapidly deteriorating physical (“tangible”) media into digital storage environments where they can be managed, backed up and preserved for the long term while also being made potentially more accessible.

On the one hand, TMP is a physical process with a surface simplicity that masks layers of complexity. On the other hand, TMP is a significant experiment in the reexamination of the Library’s entire digital workflow. Despite the high stakes, TMP project manager Moryma Aydelott is cheerfully copacetic about the challenges of leveraging TMP to help implement institutional change.

“The ultimate goal,” she says, “is a generic workflow for the aggregation, bagging, transfer, inventory, access, verification and long-term storage of digital materials that can be used by multiple Library divisions. The shorter term goal,” she adds with a laugh, “is basically triage.”

The TMP concentrates on collaborative efforts with Library division partners in the physical aspects of getting the materials off tangible items and into disc and tape storage, but that’s only one aspect of how their work fits into the still-developing overall Library digital workflow.

A visit with Sayles and Aydelott in the Library’s Geography and Maps division showcases the challenges the project faces in space and time management, attention to detail and the limits of hardware and software technology.

The system is stored on a portable cart (a special request: pneumatic tires!) that Aydelott transports to different Library divisions. Geography and Maps and the European and Asian divisions are the most active at this point, but more are slowly getting involved as the word gets out on the program’s success.

The Ripstation. Photo Credit: Butch Lazorchak

The TMP process starts with a tool that lets a user load multiple CDs and have the contents of each be ripped and stored automatically without having to load the discs one at a time. Sayles demonstrates how they run through the process twice: once to capture the individual files, and once to capture an ISO image of the entire disc. The Library creates two different copies in order to hedge its bets. Will future users prefer accessible individual files or a copy of the entire disc with all the components (in some cases the data files along with specialized software required to read them) stored together in a compact whole? Only time will tell.
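For readers who want to picture the two passes, here is a minimal Python sketch. It is not how the Ripstation software works; it only assumes that a disc is visible both as a mounted folder and as a raw device, and the function name and all paths are hypothetical.

```python
import shutil
from pathlib import Path


def capture_disc(mount_dir: Path, device_path: Path, out_dir: Path) -> tuple[Path, Path]:
    """Make two copies of one disc: its file tree and a raw image of the whole disc.

    mount_dir, device_path and out_dir are hypothetical locations chosen by the caller.
    """
    out_dir.mkdir(parents=True, exist_ok=True)

    # Pass 1: copy the individual files, preserving the directory layout.
    files_copy = out_dir / "files"
    shutil.copytree(mount_dir, files_copy)

    # Pass 2: copy the disc byte for byte into a single image file.
    image_copy = out_dir / "disc.iso"
    with open(device_path, "rb") as src, open(image_copy, "wb") as dst:
        shutil.copyfileobj(src, dst)

    return files_copy, image_copy
```

The first copy is easy to browse file by file; the second keeps everything on the disc together, including any software shipped alongside the data.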

The Library’s goal is to make the newly ripped content as available as the original, tangible item, while also exploring ways to make the content even more available as legal and appropriate. Copies are made available to staff and patrons as permitted by copyright law and relevant agreements. Currently, patrons can access several series of government works; other items are copied to restricted locations and made available only to staff.

Once ripped, the materials are pushed into the Content Transfer System, a Library-wide tool for receiving, moving, verifying, validating, characterizing, inventorying and auditing digital content. The CTS is the first of the Library’s internal repository services to support digital item management across the Library’s lifecycle and is undergoing rapid and continual development. The CTS model cleaves closely to the well-understood concept of file systems, where files are arranged in hierarchical directories. It adds the concept of bags, which are sets of files and directories which move through the digital lifecycle together.
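To make the idea of a bag a little more concrete, here is a rough Python sketch of one way a set of files could travel with a checksum list and be verified on arrival. It is an illustration only, not a description of how the CTS is built: the `data` folder and the `manifest-sha256.txt` file name are assumptions loosely modeled on BagIt-style packaging, and the function names are hypothetical.

```python
import hashlib
from pathlib import Path


def file_digest(path: Path) -> str:
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(1024 * 1024), b""):
            digest.update(chunk)
    return digest.hexdigest()


def write_manifest(bag_dir: Path) -> Path:
    """Record a checksum for every file under bag_dir/data (assumed layout)."""
    manifest = bag_dir / "manifest-sha256.txt"
    payload = sorted(p for p in (bag_dir / "data").rglob("*") if p.is_file())
    lines = [f"{file_digest(p)}  {p.relative_to(bag_dir).as_posix()}" for p in payload]
    manifest.write_text("\n".join(lines) + "\n", encoding="utf-8")
    return manifest


def verify_manifest(bag_dir: Path) -> list[str]:
    """Return the listed paths that are missing or whose checksum has changed."""
    problems = []
    text = (bag_dir / "manifest-sha256.txt").read_text(encoding="utf-8")
    for line in text.splitlines():
        if not line.strip():
            continue  # skip blank lines, e.g. when the bag holds no files
        expected, rel_path = line.split("  ", 1)
        target = bag_dir / rel_path
        if not target.is_file() or file_digest(target) != expected:
            problems.append(rel_path)
    return problems
```

Calling `write_manifest` before a transfer and `verify_manifest` after it would flag any listed file that was lost or altered along the way. This sketch does not notice extra files that were never listed.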

CTS and TMP activities are being driven by, and feeding back into, the Library’s Digital Preservation Criteria Working Group. The DPCWG’s objective is to identify current practices and policies for the preservation of digital material across the entirety of the Library and to design a path forward for workflows, practices and policies for the preservation of digital materials. The group is undertaking a multi-year effort to re-imagine the digital lifecycle at the Library based on its current strategic plan (PDF), annual objectives and performance targets. It also aims to build services that more fully ensure the Library takes advantage of institutional knowledge, collaborative opportunities and economies of scale to preserve the increasingly large amounts of digital information it is acquiring.

With all the activity going on, it’s no wonder that the TMP has become the public face of these initiatives within the Library. In a rapidly transforming world where digital objects are often mysteriously distant and abstract, the TMP provides a physical, tangible gateway to digital management and preservation issues.

You can see more photos of the Tangible Media Project on our Facebook page.

This post was updated on July 5, 2012 to fix a link to the strategic plan.

5 Comments

  1. D.K.
    July 2, 2012 at 5:34 pm

    Loved reading about the Ripstation!

    The link to the strategic plan in the next to last paragraph does not work.

  2. Larry Medina
    July 3, 2012 at 10:26 am

    I’d be interested in knowing a bit more about these aspects of the process “…, verifying, validating, characterizing, inventorying and auditing digital content…” Especially the characterizing aspect.

    Given the fact the content could be anything from “soup to nuts”, how do you determine the types and depth of the metadata to capture on each file from the converted/transferred discs? Do you use pull down menus where criteria are selected to ensure consistent and compatible content is captured or is it more free form than that?

    Also, once captured are you retaining the CDs and including a pointer to the source material post capture?

  3. ralph
    July 3, 2012 at 6:19 pm

    “… over 700 new CDs come in every month from just one source: the National Geospatial-Intelligence Agency”

    Hasn’t the government heard of Blu-Ray?

  4. Bill LeFurgy
    July 5, 2012 at 11:01 am

    D.K. Thanks for your comment. The link seems ok–check it now.

  5. Butch Lazorchak
    July 5, 2012 at 11:09 am

    We mistakenly included an internal staff link to the Library’s Strategic Plan. We’ve since corrected it to a public link.
