Glitching Files for Understanding: Avoiding Screen Essentialism in Three Easy Steps

PBS Off the Book has a nice short video on The Art of Glitch. It’s a fun story about a born-digital art phenomenon, but aside from that, I think it’s useful for helping us better understand the nature of digital objects. In the video, artist Scott Fitzgerald gives the following concise argument for the value of glitching, or breaking copies of digital files on purpose.

“Part of the process is empowering people to understand the tools and underlying structures, you know what is going on in the computer. As soon as you understand the system enough to know why you’re breaking it then you have a better understanding of what the tool was built for.”

I think we would all do well to develop a more visceral sense of what exactly files are, and I think some of his tactics for glitching can help with that.

A different way to read an MP3

changing a file extension

Digital objects are encoded information. They are bits encoded on some sort of medium. We use various kinds of software to interact with and understand those bits. In the simplest terms, software reads those bits and renders them. You can get a sense of how different software reads different objects by changing their file extensions and opening them with the wrong application.

For example, you can listen to this performance of the West Virginia Rag from Fiddle Tunes of the Old Frontier: The Henry Reed Collection. From that page you can download an .mp3 and a .wav copy of the recording. Once you’ve done that, instead of opening and playing the files with a media player, try changing the file extension to .txt and then open the file up in your text editor of choice.
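If you would rather script the renaming step, here is a minimal sketch, assuming Python 3. The helper name `copy_with_extension` is hypothetical, not something from the video or the collection site. It copies the file under a new extension instead of renaming it, so the original download stays intact. The copy holds exactly the same bytes; only the name changes.

```python
import shutil
from pathlib import Path


def copy_with_extension(path, new_ext):
    """Copy a file next to itself under a different extension.

    Only the file name changes; the bytes in the copy are identical.
    new_ext must include the leading dot, e.g. ".txt".
    """
    source = Path(path)
    target = source.with_suffix(new_ext)  # e.g. recording.mp3 -> recording.txt
    shutil.copyfile(source, target)
    return target
```

Calling `copy_with_extension("recording.mp3", ".txt")` (with whatever name your download actually has) leaves a `recording.txt` beside the original, ready to open in a text editor.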

Below you can see an example of the kind of mess you can create by changing a file extension. My text editor has no idea what to do with a lot of the information in this mp3. The text editor software is attempting to read the bits in the file as alphabetical characters and it isn’t having a lot of success.

The MP3 opened in a text editor

While it is a big mess, notice that you can read some text in there. Notice where it says “ID3” at the top, and where you can see some text about the object and information about the collection. What you are reading is embedded metadata, a bit of text that is written into the file. It is part of the ID3 tags. We can read it in a text editor because the text editor can make sense of those particular arrangements of information as text.
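You can pull the readable parts out of the mess programmatically. The sketch below, assuming Python 3, checks whether a file's bytes begin with the three characters "ID3" and then collects every run of printable ASCII characters. The function names are hypothetical, and this is not an ID3 parser: it only finds plain ASCII text, so tag text stored in another encoding (ID3 allows UTF-16, for instance) would be missed.

```python
import re


def has_id3_header(data):
    """Return True if the bytes begin with the 3-byte "ID3" marker."""
    return data[:3] == b"ID3"


def readable_runs(data, min_length=4):
    """Return runs of printable ASCII bytes, decoded as text.

    Runs shorter than min_length are skipped, since short runs
    are usually coincidences in the audio data.
    """
    pattern = rb"[\x20-\x7e]{%d,}" % min_length
    return [run.decode("ascii") for run in re.findall(pattern, data)]
```

To try it on a real file, read the bytes with `open(path, "rb").read()` and pass them to both functions. The first few entries returned by `readable_runs` are likely to be the same scraps of metadata you spotted in the text editor.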

Another way to view an MP3

Now, if you go back, and change the extension again, you can get something that looks a bit more interesting. This time, change it from .txt to .raw and open it in some image editing software. Here is what I saw when I did that with both a .mp3 version of the file and a .wav version. The black and white pixelated images below are screenshots of my image editing program attempting to read the MP3 as a RAW file. These are visual interpretations of the particular set of the information in those audio files.

viewing the .mp3 as a .raw

viewing a .wav of the same recording as a .raw

Look at the difference between the .mp3 on the left and the .wav on the right. What I like about this comparison is that you can see the massive difference between the size of the files visualized in how they are read as images. Notice how much smaller the black and white squares are in the .wav. It’s also neat to see a visual representation of the different structure of these two kinds of files. You get a feel for the patterns in their data.
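To make the idea of reading audio bytes as pixels concrete, here is a minimal sketch, assuming Python 3. It treats each byte as one grayscale pixel (0 is black, 255 is white) and wraps the bytes in a binary PGM image, a simple format that many image viewers can open. I don't know what settings my image editor actually used for its RAW import, so the one-byte-per-pixel layout and the `width` you pick are assumptions, and `bytes_to_pgm` is a hypothetical name.

```python
def bytes_to_pgm(data, width):
    """Wrap raw bytes in a binary PGM (grayscale) image, one byte per pixel."""
    if width <= 0:
        raise ValueError("width must be positive")
    height = -(-len(data) // width)  # ceiling division: rows needed
    padding = bytes(width * height - len(data))  # fill the last row with black
    header = b"P5\n%d %d\n255\n" % (width, height)
    return header + data + padding
```

Because the width is fixed, a bigger file simply produces a taller image. Write the result of `bytes_to_pgm(audio_bytes, 512)` to a file ending in `.pgm` for both the .mp3 and the .wav, and the difference in file size shows up directly as a difference in image height.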

Beyond just incorrectly reading these kinds of files, we can use the same sort of tactics to start to incorrectly edit them and further expose the logic of how they are encoded.

Edit an Image with a Text editor

A similar approach works with digital images. For example, start with this image, “Sod house, Grassy Butte, North Dakota, on Catherine Zakopayko farm.” If you download the .jpg version of the image and change its file extension to .txt, you can open it up in a text editor. It will look like gibberish. In this case, because of the way that compression works on .jpg files, you can delete chunks of the file in the text editor, save the file, change the extension back to .jpg and see what would happen if that particular chunk of the file was lost.

You can see comparisons between the original image and two levels of degradation I created by cutting out chunks of the data in the file and copying and pasting parts of it into itself.

original image

Some degradation of the image

Extensively damaged image file

In the second image, notice how the removal of a block of information has degraded the image. The entirety of the image is still there; it’s just that a rectangular region is magenta and two slices across the image are grey. The compression algorithms used to create jpg files mean that removing a chunk of the file doesn’t necessarily remove a chunk of the image. Instead, it removes some of the information that is layered into the image. In the further degraded image you can see how additional removal can result in big stripes of grey and similar kinds of color problems.
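The cutting and pasting I did by hand in a text editor can also be done on the raw bytes, which avoids the risk of the editor quietly changing characters when it saves. This is a minimal sketch, assuming Python 3; both function names are hypothetical, and the offsets you choose are up to you.

```python
def delete_chunk(data, start, length):
    """Return a copy of data with `length` bytes removed beginning at `start`."""
    return data[:start] + data[start + length:]


def paste_chunk(data, source, length, destination):
    """Return a copy of data with a slice of itself inserted at `destination`."""
    chunk = data[source:source + length]
    return data[:destination] + chunk + data[destination:]
```

Read the image with `open(path, "rb").read()`, apply one of the functions, and write the result to a new .jpg file so the original survives. A JPEG begins with the marker bytes FF D8 followed by header information, so it is probably wise to pick offsets well past the start of the file. Damage near the beginning tends to make the file unreadable rather than interestingly glitched.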

What was that about Screen Essentialism?

New media and digital humanities scholars have coined the phrase “screen essentialism” to refer to a problem in many scholarly approaches to studying digital objects. The heart of the critique is that digital objects aren’t just what they appear to be when they are rendered by a particular piece of software in a particular configuration. They are, at their core, bits of encoded information on media. While that encoded information may have one particular intended kind of software to read or present the information, we can learn about the encoded information in the object by ignoring how we are supposed to read it. We can change a file extension and read against the intended way of viewing the object.

This might seem like a rather academic point. However, I think it suggests the value of understanding the integrity of digital objects not simply as “looking right” in one particular reading out to the screen. In many cases, the integrity of the objects is something that can be expressed through a range of software-enabled readings of it.

I’m curious to hear what folks have to say about these glitched files. What other things can they tell us about how these files work? Are there other ways to glitch files that you know of that you think can facilitate the same kinds of understanding? Lastly, what do you make of screen essentialism?

Step-by-Step Management of Born-Digital Content Received on Physical Media

I like lists. I particularly like ordered lists. I’ve even read a book about checklists. Which is one of the reasons I wanted to point out a recent OCLC report, You’ve Got to Walk Before You Can Run: First Steps for Managing Born-Digital Content Received on Physical Media (PDF). The report focuses on practical approaches institutions …

Talking About Museums and Digital Preservation

In anticipation of the Museum Computer Network conference next week in Seattle, I’ve been giving some extra thought lately to museum community involvement in digital preservation. We (the National Digital Information Infrastructure and Preservation Program, that is) work with many partners from a range of industries, and in the last couple of years this has …

Fixity and Fluidity in Digital Preservation

Kent Anderson offers a provocative post in The Mirage of Fixity — Selling an Idea Before Understanding the Concept.  Anderson takes Nicholas Carr to task for an article in the Wall Street Journal bemoaning the death of textual fixity.  Here’s a quote from Carr: Once digitized, a page of words loses its fixity. It can change …

Using Wayback Machine for Research

The following is a guest post by Nicholas Taylor, Information Technology Specialist for the Repository Development Group at the Library of Congress. Prompted by questions from Library of Congress staff on how to more effectively use web archives to answer research questions, I recently gave a presentation on “Using Wayback Machine for Research” (PDF). I …

The is of the Digital Object and the is of the Artifact

Fixity is a key concept for digital preservation, a cornerstone even. As we’ve explained before, digital objects have a somewhat curious nature. Encoded in bits, you need to check to make sure that a given digital object is actually the same thing you started with. Thankfully, we have the ability to compute checksums, or cryptographic hashes. This …

Read All About It! An Update on the National Digital Newspaper Program

Here at the Library of Congress, there are many projects underway to digitize and make available vast amounts of historic, archival material.  One such project is the National Digital Newspaper Program, providing access to millions of pages from historic newspapers (a previous blog post provides an introduction).  Deb Thomas, NDNP program coordinator here at the …

If You Can’t Open It, You Don’t Own It

On October 17, I had the extreme pleasure of hearing Cory Doctorow at the Library for a talk entitled “A Digital Shift: Libraries, Ebooks and Beyond.” Not surprisingly, the room was packed with attentive listeners. The talk covered a wide range of topics–his love of books as physical objects and his background working in libraries and …

Revisiting NISO’s “A Framework for Building Good Digital Collections”

Today’s guest post is by Carlos Martinez III, a Hispanic Association of Colleges and Universities intern in the Library of Congress’s Office of Strategic Initiatives. The National Information Standards Organization provides standards to help libraries, developers and publishers work together. Their report, A Framework of Guidance for Building Good Digital Collections, is still as helpful to organizations today …

Bits Breaking Bad: The Atlas of Digital Damages

A question popped up in the blogosphere recently.  “Where is our Atlas of Digital Damages?” asked Barbara Sierman of the National Library of the Netherlands. She pointed out the amazement that would greet evidence of physical books, safely stored, with spontaneous and glaring changes in their content or appearance.  “Panic would be huge if this …
