Abraham Lincoln Papers

Abraham Lincoln Papers: Building the Digital Collection



Digitizing Microfilm

The Abraham Lincoln Papers were microfilmed and indexed in 1947. In 1957 Congress instituted the Presidential Papers Project, the goal of which was to process and microfilm the papers of presidents held by the Library of Congress. In 1959, the Abraham Lincoln Papers microfilm and index were reviewed as a part of this larger program. Additional items were added to both the microfilm and index. The Abraham Lincoln Papers, comprising 97 reels, were captured on 35-mm roll microfilm.

Microfilm collections of historical documents present a number of issues for digitization. Some result from the quality of the microfilm being scanned; others stem from the condition of the original documents, their wide range of tonal values, their varying sizes, and their orientation on the microfilm. For optimal capture of detail, the Lincoln Papers microfilm was raster scanned from a duplicate negative microfilm. Scanning from a negative can reduce the appearance in the digital images of flaws, such as dust, that may be present on the microfilm being scanned. The negative was printed directly from the archival microfilm and produced for scanning by both the scanning contractor, Preservation Resources, and the Library of Congress Photoduplication Service. The scanning was performed offsite by Preservation Resources in Bethlehem, Pennsylvania, under contract to the National Digital Library Program.

The digital images were produced in JPEG File Interchange Format (JFIF), a compressed grayscale format often used in digitizing historical manuscript documents because of its ability to capture and display a wide range of tonal variations, from those in the document paper itself to diverse qualities of pencil and ink. This 8-bit grayscale capture can also suppress the bleedthrough typical of handwritten documents in the collection. Grayscale GIF images were then created for preview access online. Most of the GIFs were created by Preservation Resources from TIFF images with lossless LZW compression. Some were created by National Digital Library Program staff from delivered JPEGs. The four-bit grayscale GIF images are intended to give the most legible preview possible, since the JPEG archival image requires considerable time to download.
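The difference between the 8-bit archival capture and the four-bit preview can be made concrete: 8 bits allow 256 gray levels, while 4 bits allow only 16. The sketch below is illustrative only. It assumes simple truncation of the low four bits, which is not necessarily how the preview GIFs were actually derived, and the function names and sample pixel values are hypothetical.

```python
def quantize_to_4bit(pixels):
    """Map 8-bit grayscale values (0-255) to 4-bit levels (0-15)."""
    # Assumption: plain truncation. Dropping the low four bits leaves
    # 16 evenly spaced gray levels.
    return [[value >> 4 for value in row] for row in pixels]


def expand_to_8bit(levels):
    """Scale 4-bit levels back to the 0-255 range for display (15 -> 255)."""
    return [[level * 17 for level in row] for row in levels]


# A tiny hypothetical 2x4 "scan": dark ink, mid-tone paper, bright background.
sample = [
    [0, 18, 130, 255],
    [64, 127, 128, 200],
]

levels = quantize_to_4bit(sample)
preview = expand_to_8bit(levels)

print(levels)   # 4-bit levels, 0-15
print(preview)  # the same levels rescaled to 0-255
```

Neighboring tones such as 127 and 128 fall into different levels, while many distinct 8-bit values collapse into one. That is the trade-off between a small preview file and the tonal detail kept in the archival image.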

The total number of digital images that compose the Abraham Lincoln Papers online is approximately 122,000: that is, 61,000 each of JPEG and GIF files. All of the original capture "master" TIFF images, to which LZW lossless compression was applied, were transferred to the National Digital Library Program on magneto-optical disks and now reside in the NDLP digital file repository for American Memory. The complete collection of digital files occupies approximately 100 GB of server space.

In the Abraham Lincoln Papers, the majority of materials are single- or double-page manuscript leaves. Individual manuscript leaves, originally folded to make two to four pages or writing surfaces, have not been split into separate images. Frames showing two-page spreads of booklike materials, which are uniform in presentation, were generally split, because doing so does not compromise the viewer's sense of the original artifact. In a few exceptions, such as pamphlets or other bound material where splitting would result in a loss of content meaning, the frame was not split.

In detail and levels of tonal range, the grayscale digital image is an improvement on the microfilmed document. However, the Lincoln Papers were microfilmed in 1947 and practices followed at that time differ from those observed in the later Presidential Papers Project, in which the George Washington, Thomas Jefferson, and other presidential papers were filmed and indexed. (See About the Collection)

The following notes on the digital images made from the Abraham Lincoln Papers microfilm may be helpful:

A. The inclusion of index cards placed along the bottom or side margins of the first page of documents and filmed in the same frame with the associated document. Once the resulting digital image of the frame has been cropped to the document, the index cards often still show partially. However, these index cards do not obscure text.
B. The inclusion of small rulers placed at the bottom of the first page of the documents. These rulers were so placed to provide information about the size of the original document. In the resulting cropped digital image, only a portion of the ruler remains visible. These rulers do not obscure text.
C. The filming of pages through reel sprockets. This occurs only rarely in the microfilm. In some of these cases, the sprockets have obscured small amounts of text.

Digitizing Text

The Lincoln Studies Center, Knox College, Galesburg, Illinois, provided annotated transcriptions for approximately 10,000 documents, about half of the documents in the Abraham Lincoln Papers (see Editors' Preface to the Transcriptions). These were keyed in word-processing software by Lincoln Studies Center editors, following guidelines provided by the National Digital Library Program. They were delivered to the NDLP as Rich Text Format (RTF) files, which were then converted to Standard Generalized Markup Language (SGML) by an OmniMark 5.0 program customized to the American Memory DTD. All text was then translated with an OmniMark program from SGML to HTML 3.2 for indexing and viewing with Web browsers.
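The SGML-to-HTML step is essentially a mapping from one set of element names to another. The sketch below illustrates that idea in Python rather than OmniMark. The element names in `TAG_MAP` are hypothetical and are not taken from the American Memory DTD, and the regular-expression approach is a simplification that assumes well-formed input with no comments or entity declarations.

```python
import re

# Hypothetical SGML-to-HTML element mapping, for illustration only.
# The real American Memory DTD elements are not reproduced here.
TAG_MAP = {"text": "body", "head": "h2", "p": "p", "hi": "i"}

TAG_PATTERN = re.compile(r"<(/?)([A-Za-z][A-Za-z0-9]*)[^>]*>")


def sgml_to_html(sgml):
    """Rename mapped elements, drop attributes, and drop unmapped tags."""
    def rename(match):
        slash, name = match.group(1), match.group(2).lower()
        html_name = TAG_MAP.get(name)
        # Unmapped elements lose their tags but keep their text content.
        return "<{}{}>".format(slash, html_name) if html_name else ""

    return TAG_PATTERN.sub(rename, sgml)


sample = '<text><head>To the President</head><p>Dear <hi rend="italics">Sir</hi></p></text>'
print(sgml_to_html(sample))
```

A production converter would work from the DTD itself and handle entities, attributes, and nesting rules, none of which this sketch attempts.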

Linking from text transcriptions to individual manuscript documents in the Lincoln Papers was accomplished by means of a unique identifier in the encoded text that matched the "ID" of the bibliographic database record for the document images.
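The identifier-based linking described above amounts to a lookup keyed on the shared ID. A minimal sketch follows, assuming each transcription and each database record carries an `id` field. The field names, identifiers, and paths are hypothetical placeholders, not the collection's actual values.

```python
def link_transcriptions(records, transcriptions):
    """Pair each transcription with the database record sharing its ID."""
    records_by_id = {record["id"]: record for record in records}
    linked, unmatched = [], []
    for transcription in transcriptions:
        record = records_by_id.get(transcription["id"])
        if record is None:
            # No bibliographic record carries this identifier.
            unmatched.append(transcription["id"])
        else:
            linked.append((transcription["file"], record["images"]))
    return linked, unmatched


# Hypothetical sample data.
records = [
    {"id": "doc-0001", "images": "images/doc-0001/"},
    {"id": "doc-0002", "images": "images/doc-0002/"},
]
transcriptions = [
    {"id": "doc-0002", "file": "doc-0002.html"},
    {"id": "doc-9999", "file": "doc-9999.html"},
]

linked, unmatched = link_transcriptions(records, transcriptions)
print(linked)
print(unmatched)
```

Collecting the unmatched identifiers separately makes it easy to spot transcriptions whose ID does not correspond to any record.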

Database Access

Access to the Abraham Lincoln Papers is through a database created from the printed Index to the microfilm edition of the Papers (see About the Collection) and through searchable text transcriptions where available. Every record in the database contains the name of the author of the document, the associated date, and a link to the set of document images. In addition, three other fields capture appropriate information: the correspondence recipient's name, brief explanatory notes and a link to a transcription where available.
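The record layout described above, three fields present in every record and three that apply only where appropriate, might be modeled as follows. This is a sketch of the described fields only. The class name, attribute names, and sample values are hypothetical and do not reflect the actual database schema.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class DocumentRecord:
    # Fields the text says every record contains.
    author: str
    date: str
    images_link: str
    # Fields that capture information only where appropriate.
    recipient: Optional[str] = None
    notes: Optional[str] = None
    transcription_link: Optional[str] = None

    def has_transcription(self):
        """True when a transcription link is available for this document."""
        return self.transcription_link is not None


# Hypothetical placeholder values.
letter = DocumentRecord(
    author="Example Author",
    date="1861-01-01",
    images_link="images/doc-0001/",
    recipient="Example Recipient",
    transcription_link="text/doc-0001.html",
)
memo = DocumentRecord(
    author="Example Author",
    date="1861-01-02",
    images_link="images/doc-0002/",
)

print(letter.has_transcription(), memo.has_transcription())
```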

Online Release History

An introductory, or demonstration, release of approximately 2,000 documents was made available in February 2000. This release contained document descriptions from the database and annotated transcriptions that were "works in progress." These were updated and re-released as part of the first release in February 2001. That release included approximately 54,000 each of JPEGs and GIFs, for a total of 108,000 images, and approximately 3,500 transcriptions. The GIFs are preview images and the JPEGs are higher quality reference images. Annotated transcriptions underwent corrections and updates throughout the Lincoln Studies Center transcription project, which ended in January 2002. The second and final release in March 2002 included revised document descriptions and annotated transcriptions from the introductory and first releases, together with the remaining images and transcriptions for the collection in final form. The complete Abraham Lincoln Papers online consists of approximately 20,000 documents, 61,000 digital images each of GIF and JPEG formats, for a total of 122,000 digital images, and approximately 10,000 transcriptions.
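The release figures above can be cross-checked with simple arithmetic. The sketch below uses only the approximate counts given in the text. The "added in the final release" values are derived by subtraction and are not stated in the source, so they should be read as rough estimates.

```python
# Approximate counts as given in the release history.
first_release = {"jpeg": 54_000, "gif": 54_000, "transcriptions": 3_500}
complete = {"jpeg": 61_000, "gif": 61_000, "transcriptions": 10_000}


def image_total(release):
    """Total images in a release: one JPEG and one GIF per captured image."""
    return release["jpeg"] + release["gif"]


# Derived estimate (not stated in the source): what the final release added.
added_in_final = {key: complete[key] - first_release[key] for key in complete}

print(image_total(first_release))  # first release image total
print(image_total(complete))       # complete collection image total
print(added_in_final)
```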

