Z39.50 Profile for Access to Digital Library Objects
Draft Four of this profile, dated August 15, is available:
This is a companion profile to the Z39.50 Profile for Access to Digital
Collections (similar to the CIMI profile in that respect).
It has been developed by LC staff, and we now solicit comments. Please review and comment by
September 16.
LC provides access to its digital collections over the web via http and would
like to provide enhanced access via Z39.50, through the use of this profile when
complete. We hope that institutions providing similar collections will provide
access via this profile, and we invite these institutions to participate in its
development. Z39.50 client developers as well as institutions who will want to
acquire clients to access these digital collections (as well as any other
interested parties) are also invited to participate. The text of section 1 of
the profile follows.
Library of Congress
(08/15/96)
The Z39.50 Profile for Access to Digital Library Objects (hereafter
referred to as the DL Profile) is a companion profile to the
Z39.50 Profile for Access to Digital Collections (hereafter referred to as
the Collections Profile).
1. Overview
As a Z39.50 Profile, the DL profile specifies a subset of Z39.50 features
to support functional and user requirements for search and retrieval of
information in digital library collections, specifically the Library of
Congress digital library collections and similar collections.
The use of this profile might be one of several mechanisms used to access
library digital objects. This particular mechanism is distinguished by the
definition of an enveloping structure, called an Object Descriptive Record,
which may (logically) encapsulate a digital object along with information
describing the object. When the Descriptive Record does not (logically)
encapsulate the object it describes, it instead provides a pointer to the
object. In either case, in this profile, Z39.50 access to a digital object
(for search or retrieval) is via its Object Descriptive Record.
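The enveloping structure described above can be pictured with a minimal
sketch. The class name, field names, and example URL below are assumptions
made for illustration only; the actual schema is defined in section 3 of the
profile and is not reproduced here.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ObjectDescriptiveRecord:
    """Hypothetical stand-in for an Object Descriptive Record."""
    description: str                  # information describing the object
    embedded: Optional[bytes] = None  # the object itself, if encapsulated
    pointer: Optional[str] = None     # otherwise, a pointer to the object

    def __post_init__(self) -> None:
        # Reading of the text: the record either encapsulates the object
        # or points to it, so exactly one of the two fields is set.
        if (self.embedded is None) == (self.pointer is None):
            raise ValueError("exactly one of embedded or pointer is required")

    def locate(self) -> str:
        """Say how a client would reach the object through this record."""
        if self.embedded is not None:
            return f"embedded ({len(self.embedded)} bytes)"
        return f"pointer -> {self.pointer}"


if __name__ == "__main__":
    inline = ObjectDescriptiveRecord("photo of speaker", embedded=b"\x00\x01\x02")
    remote = ObjectDescriptiveRecord(
        "photo of speaker", pointer="http://example.org/objects/1"  # made-up URL
    )
    print(inline.locate())
    print(remote.locate())
```

In both cases the client works with the record; only the way the object is
reached through it differs.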
1.1 Extensions to Collections Profile
The Collections profile includes several scope limitations and delegates
responsibility to companion profiles to extend the scope in these areas. As
a companion profile to the Collections profile, the DL profile extends the
Collections profile in the following areas:
- The Collections profile treats digital objects as atomic and delegates
to companion profiles any further modeling of a digital object. The DL
profile provides a general model for a digital object; see 1.1.1. The
profile also defines a schema corresponding to this model; see section
3.
- The Collections profile delegates to companion profiles the designation
of categories of digital objects that the companion profile is intended
to support. The DL profile defines categories of digital objects. See
1.1.2.
- The Collections profile treats Associated Descriptions as atomic and
delegates to companion profiles the designation of categories of
Associated Descriptions that the companion profile is intended to
support. The DL profile defines categories of Associated Descriptions.
See 1.1.3.
- The DL profile supports authentication, rights and permissions, access
control, and resource control. See 1.1.4.
- The DL profile supports special characters within a search term, and
language designation of the search term. See 1.1.5.
1.1.1 Model of a Digital Object
This profile provides a general and flexible model for the structure of a
digital object. In this model, a digital object may consist of constituent
parts, any of which may in turn consist of constituent parts, and so on.
Constituent parts are represented as Z39.50 elements.
Consider a single digital object consisting of several images (e.g. photos
or text images). Although the set of images comprises a single digital object,
each must be distinctly representable and the object must convey the fact that
there are distinct images, how many, and their individual characteristics.
Thus they are represented as separate elements of a Z39.50 record.
Next suppose that the digital object not only includes a number of images, but
also additional constituent parts, further structured; for example, each such
constituent part may consist of several images. This introduces an
intermediate level of aggregation.
The model of a digital object adopted
by this profile assumes arbitrary levels of aggregation and is represented as
a tree, where each non-leaf node has an arbitrary number of subtrees and/or
leaves, and leaf nodes represent data.
Every node, whether a leaf or
non-leaf node, has a string tag, whose purpose is to convey to the user what
that node represents. A description (via a description meta-element)
may also accompany any node, in case the string tag is not sufficiently
descriptive.
This model could represent, for example, a digital object
consisting of 10 boxes, each with 20 folders, each with 30 photos. String tags
such as 'box', 'folder', and 'photo' could be used to convey the type of
element (the type would be conveyed to the user, not the client; this profile
does not attempt to define machine-processible content types). As a more
complex example, a folder might include a variety of photos, maps,
correspondences, etc. and perhaps the correspondences consist of several
sequential digitized pages.
As another example, a digital object may
consist of multiple volumes, each with a table of contents, several chapters,
each chapter with sections, etc. String tags such as 'volume',
'tableOfContents', 'chapter', 'section', etc. could be used.
Repeating
elements, and designation of the ordinal occurrence of an element among a set
of repeating elements, are supported by this profile. For example, consider an
object with multiple "volumes", i.e. "volume 1", "volume 2" etc. "Volume 2"
would be represented by the second occurrence of the element whose tag is
'volume'. "Third chapter", "fifth image" or "first five pages" would be
similarly represented. Element specification eSpec-1 and record syntax GRS-1
provide these capabilities.
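The tree model and the selection of an element by tag and ordinal occurrence
can be sketched as follows. This is an in-memory illustration only; it does
not reproduce the eSpec-1 or GRS-1 encodings, and the names Node, occurrence,
select, and build_example are assumptions introduced for the example.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class Node:
    tag: str                           # string tag conveyed to the user
    description: Optional[str] = None  # optional description meta-element
    data: Optional[bytes] = None       # leaf nodes represent data
    children: List["Node"] = field(default_factory=list)

    def occurrence(self, tag: str, n: int) -> "Node":
        """Return the n-th (1-based) child whose tag matches."""
        if n < 1:
            raise IndexError("occurrence numbers start at 1")
        matches = [c for c in self.children if c.tag == tag]
        return matches[n - 1]  # IndexError if there are fewer than n

    def select(self, path: List[Tuple[str, int]]) -> "Node":
        """Follow a path of (tag, occurrence) pairs down from this node."""
        node = self
        for tag, n in path:
            node = node.occurrence(tag, n)
        return node


def build_example() -> Node:
    # 2 boxes x 3 folders x 4 photos: a scaled-down version of the
    # boxes/folders/photos example in the text.
    root = Node("root")
    for b in range(1, 3):
        box = Node("box")
        for f in range(1, 4):
            folder = Node("folder")
            for p in range(1, 5):
                folder.children.append(
                    Node("photo", data=f"b{b}f{f}p{p}".encode())
                )
            box.children.append(folder)
        root.children.append(box)
    return root


if __name__ == "__main__":
    obj = build_example()
    photo = obj.select([("box", 2), ("folder", 1), ("photo", 3)])
    print(photo.data)  # b'b2f1p3'
```

Here "third photo in the first folder of the second box" is expressed as the
path [("box", 2), ("folder", 1), ("photo", 3)], in the same way that
"Volume 2" is the second occurrence of the element tagged 'volume'.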
1.1.2 Categories of Digital Object
The profile defines the following categories of digital objects:
- Language-based
- Image-based
- Sound-based
- Motion-based
1.1.3 Categories of Associated Descriptions
The profile defines the following categories of Associated Descriptions:
- Cataloging Record (e.g. MARC record, Dublin Core)
- Archival Register
- Header (file header)
- Web Page
1.1.4 Authentication, Rights and Permissions, Access Control, and Resource Control
This profile supports:
- Authentication (at Initialization): capability for a client to submit a
user id and password during Z39.50 initialization. See 5.4.
- Rights and Permissions metadata: capability for the server to return
"terms and conditions" or "rights and permissions" information
corresponding to digital information, as metadata along with the
specific digital information. See 3.1.
- Access control: capability for the server to demand authentication (by
prompting the client for authentication information) prior to processing
a request for information. See 5.8.
- Resource Control: The capability for the server to notify the client of
potential or actual resource requirements pertaining to a request for
information, and prompt the client for permission to continue to process
the request. See 5.9.
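The interplay of authentication at initialization, access control, and
resource control might be sketched as below. This is a plain-Python
simulation of the decision flow only; it does not model Z39.50 protocol
messages, and the Server class, its fields, the cost threshold, and the
callback names are all assumptions introduced for illustration.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Optional, Tuple

Credentials = Tuple[str, str]  # (user id, password)


@dataclass
class Server:
    accounts: Dict[str, str]  # hypothetical user id -> password table
    cost_threshold: int       # hypothetical limit before asking the client

    def check(self, creds: Optional[Credentials]) -> bool:
        return creds is not None and self.accounts.get(creds[0]) == creds[1]

    def handle(
        self,
        restricted: bool,
        estimated_cost: int,
        init_creds: Optional[Credentials],
        ask_credentials: Callable[[], Optional[Credentials]],
        ask_continue: Callable[[int], bool],
    ) -> str:
        # Authentication at initialization: id and password sent up front.
        authenticated = self.check(init_creds)
        # Access control: demand authentication before processing.
        if restricted and not authenticated:
            authenticated = self.check(ask_credentials())
            if not authenticated:
                return "denied"
        # Resource control: report the requirement and ask permission.
        if estimated_cost > self.cost_threshold and not ask_continue(estimated_cost):
            return "cancelled"
        return "processed"


if __name__ == "__main__":
    server = Server(accounts={"reader": "secret"}, cost_threshold=100)
    outcome = server.handle(
        restricted=True,
        estimated_cost=500,
        init_creds=None,
        ask_credentials=lambda: ("reader", "secret"),
        ask_continue=lambda cost: cost < 1000,
    )
    print(outcome)
```

The two callbacks stand in for the prompts the server sends to the client: one
for authentication information, one for permission to continue.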
1.1.5 Character Set and Language Support for Search Terms
This profile supports special characters within a search term, and language
designation of the search term. For special characters, support is required
for character set negotiation as specified in the Z39.50 Implementors Group
(ZIG) Implementors Agreements. See
http://www.loc.gov/z3950/agency/agree.html. [Details to be
developed.] For designation of the language of a search term, see 4.3.3.
This profile supports search terms that read right-to-left (e.g. Arabic,
Hebrew); see 4.1.1.
1.2 Pilot Collections
This profile is intended to provide access to the digital collections
described briefly below. This list is intended as illustrative and is by no
means exhaustive; it is used as a representative set of collections on which
the specifications of this profile are based.
1.2.1 Detroit Publishing Company Collection
More than 25000 negatives, 20000 prints, 2900 transparencies, from the
Detroit Publishing Company, 1880-1920. U.S. scenes (mostly), including
buildings, towns, cities, universities, battleships, yachts, resorts, natural
landmarks, and industry.
Images available in four versions: GIF, TIFF
thumbnail, reference JPG, uncompressed TIFF.
1.2.2 Nation's Forum Collection
Collection of 59 sound recordings of speeches, made to preserve the voices
of prominent Americans; 1918 and 1920. World War I topics, postwar issues, and
the 1920 presidential election.
Also available for each speech:
- Text (ASCII)
- Photo of the record label of the sound recording
- Photo of speaker
1.2.3 WPA Life Histories Collection
Life History Manuscripts from the Folklore Project, WPA Federal Writers'
Project, 1936-40. 2900 documents from 300 writers from 24 states. 2000-15000
words each. 23000 page images total. Histories describe informant's family
education, income, occupation, political views, religion and mores, medical
needs, diet and miscellaneous observations.
Each document available in
HTML, SGML (using American Memory DTD and Panorama viewer) and scanned page
image (bitonal TIFF G3 or G4).
1.2.4 Finding Aid for Shirley Jackson Papers
Shirley Jackson was a master American short-story writer and novelist of
the mid-20th century, best known for modern Gothic horror, in particular for
her classic story, The Lottery, 1948. She also wrote stories about
contemporary domestic life. Her papers, given to LC in 1967, consist of
diaries, journals, correspondence, literary manuscripts, and miscellaneous
papers. There are 7400 items, none digitized. The finding aid is SGML-tagged
using the Encoded Archival Description standard, beta version.
1.2.5 Coolidge-Consumerism Collection
17,000 pages (images) of 1920s primary-source materials: manuscripts,
monographs, and serials. Also photos and motion pictures. Documents various
aspects of economic life in the U.S. during the 1920s. Includes the Calvin
Coolidge Papers, focusing on the life of Calvin Coolidge during the six years
he was president (1923-1929).
Scanned page images (bitonal TIFF G4)
available for the contents of 152 manuscript folders selected from 14
manuscript collections. 73 folders have page images only; 79 have HTML and
SGML also.
Photographs (170) available in GIF, TIFF thumbnail, reference
JPG, and uncompressed TIFF. Text is available for 78 monographs and 56
serials.
1.2.6 Legislative Information System
Currently named THOMAS and available on the Web, the Legislative
Information System (LIS) under development is a constantly growing collection
of large, heterogeneous databases of legislative and legal information which
includes the fulltext of the Congressional Record, Bills and Laws; Committee
Reports and other documents; and Congressional Research Service products
(e.g., Bill Digests).
Besides fulltext ASCII (searchable by Boolean and
relevancy-ranked queries) the expanded LIS will support SGML-tagged documents,
PDF, and audio and video format standards for various data sets.
1.3 Metadata and Variants
A variant specification, metadata element (e.g. from tagSet-M or tagSet-G),
or GRS-1 metadata field, may apply at any node (leaf or non-leaf) of the
digital object tree. For any given such metadata component type -- a variant
specification of a given class and type, metadata element of a given tag, or
specific type of GRS-1 metadata -- the rules of applicability and inheritance
are as follows. For any leaf node:
- If any such component is attached, it applies to that leaf (and that
leaf only).
- If no metadata component of a given type is attached, then the most
immediately superior occurrence, if any, of a metadata component of the
same type applies; that is, among the superior subtrees with a component
of that type attached, the one attached to the most subordinate subtree
applies.
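The two rules above amount to a walk from the leaf toward the root that stops
at the first node carrying a component of the requested type. A minimal
sketch, assuming a simple parent-linked node and a dictionary keyed by
component type; the Node class, the effective function, and the component
types "rights" and "format" are illustrative names, not tags taken from
tagSet-M, tagSet-G, or GRS-1:

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class Node:
    tag: str
    parent: Optional["Node"] = None
    # Components attached directly to this node: type -> value.
    metadata: Dict[str, str] = field(default_factory=dict)


def effective(leaf: Node, kind: str) -> Optional[str]:
    """Return the component of the given type that applies to the leaf."""
    node: Optional[Node] = leaf
    while node is not None:
        if kind in node.metadata:
            # Attached to the leaf itself, or to the most immediately
            # superior node that carries this type.
            return node.metadata[kind]
        node = node.parent
    return None  # no component of this type anywhere above the leaf


if __name__ == "__main__":
    root = Node("root", metadata={"rights": "public"})
    folder = Node("folder", parent=root, metadata={"format": "TIFF"})
    photo1 = Node("photo", parent=folder, metadata={"format": "GIF"})
    photo2 = Node("photo", parent=folder)

    print(effective(photo1, "format"))  # GIF: attached to the leaf itself
    print(effective(photo2, "format"))  # TIFF: inherited from the folder
    print(effective(photo2, "rights"))  # public: inherited from the root
```

Each component type is resolved independently, so a leaf can take its format
from one ancestor and its rights information from another.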
1.4 Representations of a Digital Object
A single digital object may have several representations, for
example a "thumbnail", "highly compressed", "high resolution", "original", or
"reference image". When these characterizations apply to the digital object as
a whole, they are represented as (Z39.50) variants applied at the root of the
object tree (i.e. at element 'root' of datatype Object; see 3.1.1).
Representations may also apply at nodes subordinate to the root, and the rules
of applicability and inheritance stated in 1.3 apply.
[Note: there is currently a proposal to add a new feature to
variant-1, necessary to support this.]
1.5 Z39.50 Access to Digital Objects
As in the Collections profile, a digital object may be accessed via Z39.50
or via some other protocol. For the DL profile however, when a digital object
is accessible via Z39.50, it may be accessed via its Object Descriptive Record
only. The use of Z39.50 to search or retrieve a digital object directly (not
via its Object Descriptive Record) is not supported by this profile.
[remainder of profile not available in html]