Preserving.exe: A Short List of Readings on Software Preservation

Most of the conversations I end up in about digital preservation are about the digital versions of analog things. Discussions of documents, still and moving images and audio recordings are important, but as difficult as the problems surrounding these kinds of digital objects are, there is a harder problem: preserving executable content, aka software. Software isn’t simply what we use to render content; it is an important form of creative expression, a cultural artifact, an important commodity and an entity that is increasingly enmeshed in our economic, political and social systems.

I thought I would start a quick list here of a few readings on preserving software that I think are worth your time. Some of these are posts from our blog, but most are papers and reports that do a nice job of getting into the issues facing those interested in preserving software and some of the ways folks are going about it.

Please consider adding and reacting to these with:

  1. Additional papers or readings on the topic and similar brief descriptions
  2. Reactions and comments you have to these readings and any added readings

The Life-Saving Software Reference Library: This interview I did with Doug White from NIST goes into considerable detail on the structure and design of NIST’s software library, which he describes as a library of software, a database of metadata, a NIST publication and a research environment. Here is a bit of how Doug explained it: “The research environment allows NSRL to collaborate with researchers who wish to access the contents of the virtual library. Researchers may perform tasks on the NSRL isolated network that involve access to the copies of media, to individual files, or to ‘snapshots’ of software installations. In addition to the media copies, NSRL has compiled a corpus of the 25,000,000 unique files found on the media, and examples of software installation and execution in virtual machines.”

Diagram of the NSRL workflow and work products

The Geeks Who Saved Prince of Persia’s Source Code From Digital Death: This is the most fun story of any of those in this list. Be sure to follow the dramatic events as the original source code for the Apple II version of Prince of Persia makes its way off its original media and up onto GitHub.

Toward a Library of Virtual Machines: Insights interview with Vasanth Bala and Mahadev Satyanarayanan: This interview goes into some depth on the design of the Olive Library project. One quote is particularly salient on the potential importance of software preservation: “as all fields of scientific investigations rely on complex simulation and visualization software, the ability to archive these software artifacts in executable form becomes essential for reproducibility of scientific results. Software preservation also enables long term data preservation. Today’s data formats may become obsolete tomorrow, unless the software applications that process those formats are also preserved.”

Emulation: From Digital Artefact to Remotely Rendered Environments: Dirk von Suchodoletz and Jeffrey van der Hoeven, 2009. While not directly focused on software preservation, a section of the paper articulates some of the needs for, and various problems in, constituting software archives. Here is a valuable quote: “the original software also needs to be preserved if digital objects are to be kept alive via emulation. Guidelines similar to those created for digital objects themselves must be brought to bear in order to safeguard emulators, operating systems, applications and utilities. That is, software should be stored under the same conditions as other digital objects by preserving them in a OAIS-based (ISO 14721:2003) digital archive.”

Preserving Virtual Worlds Final Report: McDonough, J., Olendorf, R., Kirschenbaum, M., Kraus, K., Reside, D., Donahue, R., Phelps, A., Egert, C., Lowood, H., & Rojo, S. (2010). At 187 pages, this is more of a book than an essay, but it’s full of valuable exploration and discussion of the various issues, problems, and opportunities around preserving video games.

The Attic & the Parlor: Notes from a Workshop on Software Collection, Preservation & Access: The Computer History Museum’s Software Preservation Group hosted what looked to be a fascinating workshop in 2006. You can find the proceedings and presentations online, and their wiki also includes a rather extensive directory of software collections. The Attic & Parlor notion in the title refers to a distinction between highly curated collections and sprawling “gather it all up” collections. This workshop, like the Preserving Virtual Worlds report, focuses on the value of collecting source code.

What should we collect to preserve the history of software? Shustek, L. (2006). IEEE Annals of the History of Computing, 28(4), 112 – 111. Another strong argument for preserving source code. “I argue that unless we collect, preserve, and interpret the software code in addition to the related artifacts, we have discarded the software’s intellectual essence. Emphasizing collateral materials puts the focus on the history of products and downplays the development of the scientific and engineering accomplishments that underlie them.”

Preserving Software: Why and How, John G. Zabolitzky, Iterations: An Interdisciplinary Journal of Software History 1 (September 13, 2002): 1-8. Zabolitzky makes an impassioned argument for urgent action on software preservation and, like Shustek, appeals for the preservation of original source code. “the evolution of software methods, techniques, styles, etc., is described in many books and articles. However, all of that is essentially hearsay: what actually has been done (and what may be different from what the active players in this area may report since they might have wished to do something different) can only be discerned and proven by examining the source code. The source code of any piece of software is the only original, the only artifact containing the full information. Everything else is an inferior copy.”

What essential readings would you add to a list like this? Please consider taking a moment to add them in the comments. Also, feel free to use this comment thread as a place to discuss the various ideas and approaches advocated for in these readings.
