Miller NAWSA Suffrage Scrapbooks, 1897-1911

Building the Digital Collection

Digital Images

The seven scrapbooks in this collection were scanned in color at 300 dots per inch (dpi) on a PhaseI and Digi book camera. JPEG derivatives of the images were created using JPEG compression, producing files in the JPEG File Interchange Format (JFIF). The scrapbooks were digitized in two phases. In the first phase, the scrapbooks were scanned page by page as they appeared, without opening any of the folded or obscured items. The books were then examined and treated by the Conservation Division, and folded items such as newspaper articles and letters were opened and prepared for the second phase. The second phase of scanning captured all of the items that had been folded or otherwise obscured.


All items with typed text were processed with optical character recognition (OCR). A variety of software programs were used for this work, including PrimeOCR, TextBridge, and ABBYY FineReader Professional 7.0. The error rate for newspaper articles was generally high, most likely because of their brittle and yellowed condition, whereas the pamphlets had a much higher level of success. Newspaper articles and other items that failed OCR capture were then manually zoned and left uncorrected. All pertinent handwritten items, i.e., letters, were fully transcribed and encoded in SGML. Both forms of text are searchable through the 'Full Text' search option.

Bibliographic Records

Items in the scrapbooks were described in an Access database and were treated as non-MARC records. A total of 1,812 individual records describe the contents of the scrapbooks. In cases where multiple items on a scrapbook page are related by topic, a group record was created reflecting that topic. An example would be multiple newspaper clippings pasted on a single scrapbook page that address the same topic; in these cases, the clippings are described in one record in the database. Items of a distinct physical genre, such as an invitation card or a button, were always described in a separate record, even if other items on the page were topically related, so that they can be accessed by their physical type. For example, all membership ribbons can be retrieved by searching on 'ribbon' in the 'Keyword' search area.
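
The record-creation rule described above can be illustrated with a short sketch. This is not the Access database actually used for the collection; the field names (page, genre, topic, title), the set of physical genres, and both function names are assumptions chosen only to show how topically related items on a page might share one group record while items of a physical genre keep their own.

```python
from collections import defaultdict

# Assumed list of physical genres that always receive a separate record.
PHYSICAL_GENRES = {"invitation card", "button", "ribbon"}


def build_records(items):
    """Create one record per physical-genre item and one group record
    for all other items that share a page and a topic."""
    records = []
    groups = defaultdict(list)
    for item in items:
        if item["genre"] in PHYSICAL_GENRES:
            # Physical-genre items are never merged into a group record.
            records.append({
                "page": item["page"],
                "genre": item["genre"],
                "topic": item["topic"],
                "titles": [item["title"]],
            })
        else:
            groups[(item["page"], item["topic"])].append(item)
    for (page, topic), members in groups.items():
        records.append({
            "page": page,
            "genre": members[0]["genre"],
            "topic": topic,
            "titles": [m["title"] for m in members],
        })
    return records


def keyword_search(records, keyword):
    """Return records whose genre, topic, or titles contain the keyword."""
    kw = keyword.lower()
    return [
        r for r in records
        if kw in r["genre"].lower()
        or kw in r["topic"].lower()
        or any(kw in t.lower() for t in r["titles"])
    ]
```

Under these assumptions, two clippings on the same page and topic collapse into a single record, while a ribbon on that page remains its own record and is found by a keyword search on 'ribbon'.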