The Nineteenth Century in Print: Periodicals

Building the Digital Collection

This collection is part of a distributed digital library collaboration. The periodicals currently incorporated come from the collections of the Library of Congress and Cornell University Library.

Digitizing the Collection

This collection consists of twenty-two popular nineteenth-century periodicals digitized by Cornell University Library and one digitized by the Library of Congress. The conversion specifications and approach follow the model established by Cornell and the University of Michigan in the Making of America project.

The materials were scanned from the original paper source. Both institutions outsourced the scanning. The images were captured at 600 dpi in TIFF image format and compressed with the lossless CCITT Group 4 algorithm. Minimal document structuring occurred at the point of conversion, primarily linking image numbers to pagination and tagging pages by type, for example as the first page of an issue, a title page, or a page with advertisements.
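
The structuring step described above can be pictured as a small record kept for each scanned image. The following Python sketch is illustrative only and does not reproduce the format actually used by Cornell or the Library of Congress. The class name, field names, tag labels, and sample volume are all assumptions. It also shows the simple arithmetic behind a 600 dpi scan, using a made-up 6 x 9 inch page.

```python
from dataclasses import dataclass, field
from typing import Optional

# Hypothetical tag labels; the text names only these three kinds of page.
FIRST_PAGE_OF_ISSUE = "first_page_of_issue"
TITLE_PAGE = "title_page"
ADVERTISEMENTS = "advertisements"


@dataclass
class PageImage:
    """One scanned image and the minimal structure attached to it (assumed layout)."""
    image_number: int              # sequence number of the scanned TIFF
    page_label: Optional[str]      # printed page number, or None if unnumbered
    features: set = field(default_factory=set)


def pixel_dimensions(width_in, height_in, dpi=600):
    """Return (width, height) in pixels for a page scanned at the given dpi."""
    return round(width_in * dpi), round(height_in * dpi)


def index_by_page_label(images):
    """Map printed page labels to image numbers, skipping unnumbered pages."""
    return {
        img.page_label: img.image_number
        for img in images
        if img.page_label is not None
    }


# Invented sample volume, for illustration only.
volume = [
    PageImage(1, None, {TITLE_PAGE}),
    PageImage(2, "1", {FIRST_PAGE_OF_ISSUE}),
    PageImage(3, "2"),
    PageImage(4, None, {ADVERTISEMENTS}),
]
lookup = index_by_page_label(volume)
```

With a lookup of this kind, a request for printed page 2 resolves to image 3, while unnumbered pages such as title pages and advertisements stay reachable by image number and tag.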

The text for the twenty-two periodicals converted by Cornell University Library was generated by a fully automated optical character recognition (OCR) process, with no human intervention beyond initial calibration. The OCR process was implemented by Cornell University Library staff. A similar process was used by the University of Michigan Digital Library Production Service to prepare searchable text for Garden and Forest for the Library of Congress.

The text file for each page of a periodical was then processed, again automatically, to generate a file marked up in the Standard Generalized Markup Language (SGML) for each volume. The markup uses a simple Document Type Definition (DTD) developed by the University of Michigan Digital Library Production Service. The DTD conforms to the guidelines of the Text Encoding Initiative (TEI). The text files for Garden and Forest, Scientific American, and Living Age were encoded according to the recommendations for Level 1 in the TEI in Libraries Guidelines. For the other periodicals, Cornell University Library keyed author names and article titles from index volumes into a database. This basic bibliographic information was merged into the text file automatically, using page numbers as the key. The resulting encoding is at Level 2 of the TEI in Libraries Guidelines.
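
The merge step, in which keyed titles and authors are joined to the page text by page number, might look roughly like the Python sketch below. It is a simplified illustration under stated assumptions, not the production process. The element names (volume, article, title, author, page) are hypothetical stand-ins and are not taken from the University of Michigan DTD, and the function name and input shapes are invented for this example.

```python
from xml.sax.saxutils import escape


def merge_bibliographic_data(pages, articles):
    """Join per-page OCR text with keyed article data, using page number as the key.

    pages:    list of (page_number, ocr_text) tuples in reading order
    articles: dict mapping the page number on which an article starts
              to a (title, author) tuple
    Returns a string of simple, hypothetical markup for one volume.
    """
    parts = ["<volume>"]
    open_article = False
    for page_number, text in pages:
        if page_number in articles:
            # A new article begins on this page: close the previous one first.
            if open_article:
                parts.append("</article>")
            title, author = articles[page_number]
            parts.append(
                f"<article><title>{escape(title)}</title>"
                f"<author>{escape(author)}</author>"
            )
            open_article = True
        # Escape the OCR text so stray '&' or '<' cannot break the markup.
        parts.append(f'<page n="{page_number}">{escape(text)}</page>')
    if open_article:
        parts.append("</article>")
    parts.append("</volume>")
    return "\n".join(parts)


# Invented sample data, for illustration only.
sample_pages = [(1, "Front matter"), (2, "Text & more"), (3, "continued")]
sample_articles = {2: ("On Trees", "A. Writer")}
merged = merge_bibliographic_data(sample_pages, sample_articles)
```

Pages that precede the first keyed article stay outside any article element in this sketch. A volume with no keyed data would carry page-level markup only, which loosely mirrors the difference between the Level 1 and Level 2 encodings described above.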

More details about the digitizing and delivery of the periodical Garden and Forest are provided by the Library of Congress Preservation Reformatting Division.

