Web Archiving Lifecycle

The 2012 IIPC General Assembly has been well documented on this blog. The research use of web archives, legal issues, special collections and the future of harvest have all been covered in detail. This is the last planned post about the 2012 GA, but look for more updates about the IIPC and web archiving on netpreserve.org.

Those new to web archiving, or those interested in how to integrate web archiving into other digital library programs, will most likely be interested in some work in progress at the IIPC on defining the web archiving lifecycle. A workshop was led by Kris Carpenter Negulescu of the Internet Archive, with a focus on documenting the process of creating web archives for reference and training purposes. This documentation will also help managers who have to make decisions about sustaining archives over time. The discussion revolved around a proposed web archiving lifecycle diagram. It was noted during the discussion that preservation planning, along with policy and collection management decisions, must be addressed throughout the lifecycle. Each step in this lifecycle depends on available resources, and some aspects are currently outsourced. An altered diagram and a fuller explanation could address these points. Associated tools and policies could also be linked to the different steps in the lifecycle.

As helpful as it is to document the web archiving lifecycle, there will always be policy and curatorial questions that each collecting institution must decide for itself. Different institutions have different missions, budgets and workflows. However, the opportunity to share experiences and identify common problems and solutions is what the IIPC is all about. This was most evident in the discussion about identifying best practices for collecting web sites around emergency events, like earthquakes or oil spills. Diverse language skills and subject matter expertise across disciplines are needed, along with tools for the nomination and capture of time-sensitive material. Organizations like the IIPC are also helpful in this regard because they offer a ready-made group of experts and colleagues available for help; members have already created several collections collaboratively. These workflows also need to be documented and shared.

Rapid change in web publishing tools and access methods, along with a constant training gap, will always be a challenge in web archiving. But broadly sharing use cases based on a common understanding of the lifecycle can help web archivists advance their craft and can help managers make decisions. Do you have any reactions to the lifecycle diagram? Stay tuned for more updates on this work.

Updated 6/12/2012 1pm EDT. Diagram removed.
