Copyright Records: Short-term strategies for making them more accessible

The number of non-digital Copyright records (70 million) and the constraints on funding make the digitization of copyright records a long-term project. But that doesn’t mean we can’t make some records available sooner rather than later. We’re looking at several strategies and are eager for your feedback and ideas.

First, we can demonstrate what’s possible, engage users, and make some records available online through a search-and-retrieval pilot. The pilot would use a small but complete subset of the records, indexed by multiple fields, with links to images of the original copyright records. One option is the records of transfers and assignments of copyrights. The 2.5 million catalog cards that index approximately 350,000 documents recorded between 1870 and 1977 have already been digitized, and PDF copies of the documents also exist. Transfer and assignment records must be consulted to determine the complete ownership history of any copyright, so their availability online would complement the Catalog of Copyright Entries (CCE) from 1891 to 1977, which is being digitized onsite here at the Library and made available through the Internet Archive website. That effort is nearly two-thirds complete, and CCE records are now available online back to 1936. Another option is the set of records of prints and labels registered between 1922 and 1940. This set is much smaller, about 43,000 registrations, and could be completed sooner, but it may be too narrow to serve as a model for the 16 million records referring to many other types of copyrightable material. A third option is the set of registration records from 1971 to 1977. This is a much larger set, 7.7 million catalog cards indexing 2.8 million registrations, and would require considerably more time to complete. We seek your comments on which of these three options would be most useful to you.

Second, as an interim measure while full record indexing is underway, we are considering making the catalog card images available online through a virtual card catalog organized hierarchically by type of record, time period, drawer name, and card image number. This could be done after digitization of each set and would enable online searching of these records in a manner that mimics searching the actual cards. While this would require a few more steps to search for a particular term, it would enable viewing surrounding records, a feature considered useful by some users.
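
To make the idea of a virtual card catalog a little more concrete, here is a minimal sketch of how a card image might be addressed by the four levels mentioned above (type of record, time period, drawer name, card image number), and how the surrounding cards in a drawer could be pulled up for browsing. This is only an illustration, not the Office's actual design: the class name, the path layout, and the sample drawer label are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True, order=True)
class CardImage:
    """One scanned card, addressed by the four hierarchy levels (hypothetical model)."""
    record_type: str   # e.g. "registrations" (assumed label)
    period: str        # e.g. "1971-1977"
    drawer: str        # drawer label as it might appear on the cabinet (assumed)
    image_number: int  # position of the card image within the drawer

    def path(self) -> str:
        # Hypothetical browse path: type / period / drawer / zero-padded image number
        return f"{self.record_type}/{self.period}/{self.drawer}/{self.image_number:04d}"


def neighbors(cards, target, radius=1):
    """Return the cards in the same drawer within `radius` positions of `target`."""
    same_drawer = sorted(
        c for c in cards
        if (c.record_type, c.period, c.drawer)
        == (target.record_type, target.period, target.drawer)
    )
    i = same_drawer.index(target)
    return same_drawer[max(0, i - radius): i + radius + 1]


if __name__ == "__main__":
    # Five made-up cards in one made-up drawer
    cards = [CardImage("registrations", "1971-1977", "SMITH-SMYTH", n) for n in range(1, 6)]
    card = cards[2]
    print(card.path())
    print([c.image_number for c in neighbors(cards, card)])
```

Browsing by position rather than by search term is what would let a user view the cards on either side of the one they landed on, much as they would when flipping through a physical drawer.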

Third, we are exploring the feasibility, costs, and benefits of optical character recognition (OCR) and double-blind data capture as possible options for extracting data from copyright records. Indexing 70 million records is a daunting task, far beyond present staff resources. At the same time, the accuracy and integrity of the records are of paramount importance. Through prototyping, piloting, and your feedback, we plan to find the approach that captures the necessary information correctly and completely. Whether the data is captured through keyboarding or OCR, there must be a second pass for verification. In concert with this, we are considering how we might use crowdsourcing to engage large numbers of interested people to help with the data capture and verification.
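
As a rough illustration of the verification step, double-blind data capture is commonly understood to mean that two people (or one person and an OCR pass) transcribe the same card independently, and only the fields on which they agree are accepted automatically. The sketch below assumes that interpretation. The field names, the whitespace normalization, and the sample values are hypothetical and are not taken from any actual copyright record.

```python
def normalize(value):
    """Collapse runs of whitespace so trivial spacing differences do not count as conflicts."""
    return " ".join(value.split()) if value is not None else None


def compare_double_keyed(first_pass, second_pass):
    """Compare two independent transcriptions of one card.

    Returns (accepted, disputed): fields where both passes agree, and
    fields that differ or are missing from one pass and need human review.
    """
    accepted, disputed = {}, {}
    for field in sorted(set(first_pass) | set(second_pass)):
        a = normalize(first_pass.get(field))
        b = normalize(second_pass.get(field))
        if a is not None and a == b:
            accepted[field] = a
        else:
            disputed[field] = (a, b)
    return accepted, disputed


if __name__ == "__main__":
    # Two made-up transcriptions of the same made-up card
    keyer_a = {"title": "Example  Song", "claimant": "Example Music Co.", "year": "1923"}
    keyer_b = {"title": "Example Song", "claimant": "Example Music Co", "year": "1923"}
    accepted, disputed = compare_double_keyed(keyer_a, keyer_b)
    print("accepted:", accepted)
    print("needs review:", disputed)
```

Only the disputed fields would need a third look, which is where volunteer reviewers recruited through crowdsourcing could plausibly help.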

Fourth, we are going to publicize the project through the Copyright website and other media such as this blog to generate excitement, seek input, and garner support for the project.

As always, your feedback and comments are most important and most welcome.

Where are we now? — Project accomplishments so far

A detailed analysis of the Copyright records has been completed, and test scanning has been done to determine the best digitization parameters for the several formats of the records. For optimal preservation, the records will be scanned in uncompressed tagged image file format (TIFF) at a minimum of 300 pixels per inch (ppi) in 24 …

A vision for making pre-1978 Copyright records more available. — Our goals for the project

A principal goal of the Copyright Office is to digitize the content of the card catalog records. This work is already underway.  The card catalog is considered the most up-to-date index to copyright records prior to 1978. It has been updated over time to reflect corrections and changes sometimes with handwritten annotations and sometimes with new …

Who owns the copyright for that book, song or photo you want to use? — Making pre-1978 Copyright Office records more accessible

From 1870 to 1977 there were 16.4 million works registered in the Copyright Office.  Many are still under protection of the Copyright law.  During that same time, the assignment or transfer of rights was recorded for more than 1.7 million works.  So how do you determine if a particular work is still under copyright and …

