The Hannah Arendt Papers

The Hannah Arendt Papers: Building the Digital Collection

Scanning Specifications

Digitizing Manuscript Material

Oversize Documents

Blank Pages and Versos of Clippings

Scanning the Speeches and Writings

EAD Finding Aid


Scanning Specifications

During the Arendt Papers project, the entire collection of 25,000 items was scanned, for a total of approximately 75,000 digital images. A selection of these is accessible on the Internet. The complete digital collection is available to researchers in the reading room at the Manuscript Division of the Library of Congress, at the New School University's Hannah Arendt Center at the Fogelman Library, and at the Hannah Arendt Center at the University of Oldenburg, Germany. See About the Collection for more information. The following sections on the digitization process apply to the collection as a whole.

The Arendt Papers were scanned as 300 dpi grayscale images and compressed with JPEG, producing files in the JPEG File Interchange Format (JFIF). Typically, the National Digital Library Program (NDLP) has used grayscale to digitize historical manuscripts in order to capture and display the diversity of tones in manuscript items and the various nuances of handwriting in pencil and ink. Grayscale can also often suppress the bleed-through typical of handwritten documents in the Arendt Papers. Because JPEG images require considerable time to download, grayscale GIF images were also created to provide convenient access using the NDLP page-turner feature.
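The download concern can be made concrete with some rough arithmetic. The sketch below computes the pixel dimensions and uncompressed size of a single 300 dpi, 8-bit grayscale scan. The 8.5 x 11 inch page size and the function names are assumptions chosen for illustration; they are not figures or tools from the project.

```python
# Rough size arithmetic for a 300 dpi, 8-bit grayscale scan.
# The page size used in the example is an assumption, not a project figure.

DPI = 300            # scanning resolution stated in the specifications
BITS_PER_PIXEL = 8   # 8-bit grayscale


def pixel_dimensions(width_in, height_in, dpi=DPI):
    """Return (width, height) in pixels for a page measured in inches."""
    return round(width_in * dpi), round(height_in * dpi)


def uncompressed_bytes(width_in, height_in, dpi=DPI, bits=BITS_PER_PIXEL):
    """Return the raw image size in bytes, before JPEG or GIF compression."""
    width_px, height_px = pixel_dimensions(width_in, height_in, dpi)
    return width_px * height_px * bits // 8


if __name__ == "__main__":
    width_px, height_px = pixel_dimensions(8.5, 11)
    print(f"{width_px} x {height_px} pixels")
    print(f"{uncompressed_bytes(8.5, 11) / 1_000_000:.1f} MB uncompressed")
```

Under these assumptions a single letter-size page is about 8.4 million pixels, which suggests why compressed master images and smaller access copies were both useful.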

The materials were scanned on site by the NDLP paper-scanning and text-conversion contractor, Systems Integration Group (SIG) of Lanham, Maryland. The PULNiX MFCS-50 H/S Digital Overhead Scanning System, a fixed-array device capable of scanning 8-bit grayscale and bitonal images, was used to digitize all the items in the collection. These included manuscripts, bound volumes, and oversize materials. The Arendt production team and Systems Integration Group staff worked with the Library's conservators to ensure proper handling of the manuscripts during the physical processing of the collection and subsequent scanning.


Digitizing Manuscript Material

Because efforts were made to preserve the look of the original documents, digital images reflect a document's physical condition. Many of the original items are discolored, stained, or fragile because of age and past handling. Their digital images, therefore, may show discolorations, heavy fold markings, and varying tones in the paper. Items on unusually thin paper sometimes show bleed-through, in which the ink or printing on the verso (back) of a page can be seen on the recto (front); in these cases even the grayscale format could not suppress it. A few letters written on colored paper produced darker images because they were digitized in grayscale rather than in color. Some digital images of correspondence have light or faded text that may be difficult to read because the handwriting strokes are very thin or the ink or pencil has faded on the original. Since some of the photographs have faded over time or were originally dark, their digital images may be dark as well. Images scanned from photocopied clippings appear bitonal rather than grayscale.

Since the Arendt Papers is a twentieth-century collection, the materials were in relatively good condition. During scanning preparations, some of the more delicate materials were housed in acid-free paper sleeves, from which the items were removed for scanning. Clear mylar sleeves protected extremely fragile or brittle items and allowed scanning without removal of the items. Occasionally the mylar can be detected in the digital image, for example where the edges of the sleeve are visible, but for the most part the sleeve is not noticeable.

Arendt often copied her responses on the verso of the letters that she received. When her response is more than one page, it often begins on the verso of the last page of the letter and continues backwards on the rest of the pages. To ease the online viewing of these items, all pages of the letter to Arendt were scanned first, in order, then Arendt's reply was scanned in order.
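The reordering described above can be sketched in a few lines. This is a simplified illustration, not the project's procedure: it assumes the incoming letter occupies the rectos of sheets 1 through n and that each page of the reply sits on one verso, starting with the last sheet and running backwards. The function name `scan_order` is hypothetical.

```python
def scan_order(letter_pages, reply_pages):
    """Return the scan sequence as a list of (sheet, side) tuples.

    Assumption: the letter fills the rectos of sheets 1..letter_pages, and
    the reply starts on the verso of the last sheet and continues backwards,
    one reply page per verso.
    """
    if reply_pages > letter_pages:
        raise ValueError("reply cannot use more versos than there are sheets")

    # All pages of the letter to Arendt, in reading order.
    letter = [(sheet, "recto") for sheet in range(1, letter_pages + 1)]

    # Arendt's reply, in reading order: last sheet's verso first.
    reply = [
        (sheet, "verso")
        for sheet in range(letter_pages, letter_pages - reply_pages, -1)
    ]
    return letter + reply


if __name__ == "__main__":
    for sheet, side in scan_order(3, 2):
        print(sheet, side)
```

For a three-page letter with a two-page reply, the sequence is the three rectos in order, followed by the verso of sheet 3 and then the verso of sheet 2.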


Oversize Documents

The finding aid for the original collection lists oversize documents in a separate series. These items are housed separately, in larger boxes, and pointer sheets placed where the items were pulled direct researchers to them. Because the digital environment removes the need to separate items by physical size, these items were digitized in sequence and appear online within their original folders. Therefore, they are not listed separately in an oversize series. The EAD finding aid continues to reflect the housing of the physical collection and makes reference to the oversize series.

Occasionally, digital images for oversize items, such as newspapers or book galleys, were cropped more closely than the standard 1/4-inch border that shows page edges in order to allow larger pages to fit in one image. In these cases, no text was affected; only the edges of the paper were cropped where no text appeared.
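As an illustration of the border arithmetic, the sketch below computes a crop rectangle that adds the standard 1/4-inch border (75 pixels at 300 dpi) around a page and clamps it to the scanned frame. The source does not describe how cropping was actually performed, so the function `crop_box`, its coordinate convention, and the sample numbers are all assumptions.

```python
DPI = 300                   # scanning resolution stated in the specifications
STANDARD_BORDER_IN = 0.25   # standard border around the page edges, in inches


def crop_box(page_box, frame_size, border_in=STANDARD_BORDER_IN, dpi=DPI):
    """Expand page_box by the border, clamped to the scanned frame.

    page_box is (left, top, right, bottom) in pixels; frame_size is
    (width, height) in pixels. Both conventions are assumptions.
    """
    border = round(border_in * dpi)  # 75 pixels at 300 dpi
    left, top, right, bottom = page_box
    frame_width, frame_height = frame_size
    return (
        max(0, left - border),
        max(0, top - border),
        min(frame_width, right + border),
        min(frame_height, bottom + border),
    )


if __name__ == "__main__":
    # Ordinary page: the full border fits inside the frame.
    print(crop_box((100, 100, 2650, 3400), (3000, 4000)))
    # Oversize page: the border is clamped, so the crop is tighter.
    print(crop_box((20, 30, 2980, 3990), (3000, 4000)))
```

In the second case the page nearly fills the frame, so the border shrinks below 1/4 inch, which mirrors the closer cropping described for newspapers and book galleys.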


Blank Pages and Versos of Clippings

Although the Arendt Papers were scanned in their entirety, there were a few instances in which a side or page that contained advertisements or other unrelated content was not captured.

For the most part, only the "relevant" sides of a newspaper clipping were scanned: those that contained the articles that had been clipped. When the verso of a clipping could easily be identified as containing only advertisements or other unrelated articles, it was marked to alert the scanning contractor to skip that side. This procedure was used most frequently in the Clippings File, though it was also followed when appropriate in other parts of the collection.

For bound volumes, such as address books, notebooks, and calendars, sequences of more than five blank pages were not scanned. A page counted as blank if it was completely blank or carried only printed or standard text, such as calendar dates, with no handwritten text. These gaps are indicated by a replacement target image displayed at the point where the blank pages began and stating the extent of the omitted pages. For example:

"Blank Pages Omitted: Pages 12 through 31"

"Blank Pages Omitted: Blank pages dated October 5, 1975, through December 31, 1975, and blank page titled 'Cash Account---January.'"
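The blank-page rule can be expressed as a short routine. The following is a minimal sketch, assuming each page has already been classified as blank or not; the function `plan_scans` and its input format are hypothetical, and the target wording follows the first example above.

```python
from itertools import groupby

MAX_BLANK_RUN = 5  # runs of more than five blank pages are omitted


def plan_scans(pages):
    """Build a scan plan from a list of (page_number, is_blank) pairs.

    Returns a list whose entries are either page numbers to scan or a
    target string standing in for an omitted run of blank pages.
    """
    plan = []
    # Group consecutive pages that share the same blank/non-blank status.
    for is_blank, run in groupby(pages, key=lambda page: page[1]):
        numbers = [number for number, _ in run]
        if is_blank and len(numbers) > MAX_BLANK_RUN:
            plan.append(
                f"Blank Pages Omitted: Pages {numbers[0]} through {numbers[-1]}"
            )
        else:
            # Written pages and short blank runs are scanned as usual.
            plan.extend(numbers)
    return plan


if __name__ == "__main__":
    # Pages 10 to 33 of a notebook, with pages 12 to 31 blank.
    notebook = [(n, 12 <= n <= 31) for n in range(10, 34)]
    for entry in plan_scans(notebook):
        print(entry)
```

Note that a run of exactly five blank pages is still scanned under this reading, since only sequences of more than five were omitted.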


Scanning the Speeches and Writings


Examples of additional text taped onto the sides of the page with replacement text taped over original text. In the second image, the replacement text is removed so that the original text can be seen.

Speeches and Writings, Books, On Revolution, First draft, Chapter II, Pages 74-100. The Hannah Arendt Papers (The Library of Congress Manuscript Division).

Special consideration was given to scanning draft typescripts in the Speeches and Writings File and Addition I. These items were particularly complicated because of the countless fragments of text and editorial overlays resulting from Arendt's writing and editorial process. In most instances, the fragments had been taped or glued to the margins of relevant pages and folded over when inserted into folders.

Fragments still attached to a page were simply unfolded for scanning and the entire page captured as one image. In many instances, however, the fragments had become separated from the original document, though residual tape stains at times offered some clues as to their intended location. During reprocessing of the collection, detached fragments were identified, grouped with relevant pages, and placed in paper sleeves. When these items were ready to be digitized, the pieces were reassembled: the main block of text was placed in the center, and the visible yellowed tape marks were used to position fragments where they had originally been taped. If it was not possible to determine a fragment's previous location on the page, the fragment was placed below the relevant page and all pieces were scanned as one image.

Large overlay fragments, once taped over the original document to replace entire paragraphs or pages, were scanned in a similar way. Overlays that were still attached were scanned without removal. Where an overlay had become detached, the page was captured first with the editorial overlay in place, then a second time with the overlay removed, sometimes revealing text underneath.


EAD Finding Aid

The existing finding aid was revised to reflect changes in the organization of the collection. It has been encoded in conformance with the Encoded Archival Description (EAD) Document Type Definition (DTD), a standard for encoding finding aids that is defined in Standard Generalized Markup Language (SGML).


