NLS Technical Writings

Digital Talking Books, Planning for the Future

July 1998

Activity Planning

While NISO continues the development of a DTB standard, NLS will test digital methods and build expertise on topics directly relating to DTB features. This research will involve hands-on testing of relevant software, possibly with the participation of some patron evaluators. Examples of testing activities are summarized below.

Test and report on methods to vary the playback rate while maintaining original pitch

This process allows users to speed up or slow down talking books without raising or lowering the pitch of the voice. Several approaches will be considered, such as Cool Edit (audio editing software reportedly used in the European Digibook project), KBX96000 (real-time controlled hardware under development at Discrete Time Systems Ltd.), and a software signal processor (Entropic's system, which is restricted to NT and UNIX hosts). Patron evaluation requires effective real-time control, which is not yet available. The promised but as yet unrealized MPEG-4 standard may also be of interest here. An ideal test system would allow a patron evaluator to change playback to any rate between half speed and triple speed while maintaining pitch and intelligibility.
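To illustrate the general idea of changing rate without changing pitch, the following is a minimal sketch of plain overlap-add time stretching. It is not the method used by any of the products named above; the function names, the frame size, and the test tone are assumptions chosen for illustration. Plain overlap-add is the simplest technique of this kind and is known to produce audible artifacts on speech, so a practical system would more likely use a refinement such as waveform-similarity overlap-add or a phase vocoder.

```python
import math


def hann(n):
    """Periodic Hann window; copies offset by n/2 sum to exactly 1."""
    return [0.5 - 0.5 * math.cos(2 * math.pi * i / n) for i in range(n)]


def time_stretch(samples, rate, frame=512):
    """Return `samples` replayed at `rate` times normal speed.

    Frames are read from the input every rate * frame/2 samples and written
    to the output every frame/2 samples. Each frame keeps its original
    waveform, so the local pitch is preserved while the duration changes.
    The 0.5 to 3.0 limits mirror the half-speed to triple-speed range.
    """
    if not 0.5 <= rate <= 3.0:
        raise ValueError("rate must be between 0.5 and 3.0")
    synth_hop = frame // 2              # spacing of frames in the output
    analysis_hop = rate * synth_hop     # spacing of frames in the input
    window = hann(frame)
    n_frames = max(1, int((len(samples) - frame) / analysis_hop) + 1)
    out = [0.0] * ((n_frames - 1) * synth_hop + frame)
    for k in range(n_frames):
        start = int(round(k * analysis_hop))
        for i in range(frame):
            j = start + i
            if j < len(samples):
                out[k * synth_hop + i] += samples[j] * window[i]
    return out


# Hypothetical input: one second of a 200 Hz tone sampled at 8 kHz.
sample_rate = 8000
tone = [math.sin(2 * math.pi * 200 * n / sample_rate) for n in range(sample_rate)]
double_speed = time_stretch(tone, 2.0)   # about half as many samples
half_speed = time_stretch(tone, 0.5)     # about twice as many samples
```

Because the output hop is fixed and only the input hop scales with the rate, the same routine covers both speeding up and slowing down.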

Test and report on state-of-the-art audio coding and decoding algorithms for efficient storage of spoken audio

For economy and acceptance, emphasis will be on algorithms most likely to become standards in the consumer entertainment market. Examples of coder/decoder algorithms include MPEG and AC-3, systems that permit ten-to-one data reduction with no perceptible loss of fidelity. Integrating decoders into multimedia presentation software is essential, although it poses significant programming and control problems.
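The practical effect of a ten-to-one reduction can be shown with simple arithmetic. The sketch below assumes mono audio sampled at 44.1 kHz with 16 bits per sample and a 650-megabyte disc; these figures and the function names are illustrative assumptions, not specifications from this plan.

```python
def coded_bitrate_kbps(sample_rate_hz, bits_per_sample, channels, reduction):
    """Bit rate in kilobits per second after dividing PCM by `reduction`."""
    pcm_kbps = sample_rate_hz * bits_per_sample * channels / 1000
    return pcm_kbps / reduction


def hours_per_disc(capacity_mb, bitrate_kbps):
    """Hours of audio that fit in `capacity_mb` megabytes (10**6 bytes each)."""
    seconds = capacity_mb * 8_000_000 / (bitrate_kbps * 1000)
    return seconds / 3600


# Assumed source format: 44.1 kHz, 16-bit, mono.
uncompressed = coded_bitrate_kbps(44100, 16, 1, reduction=1)
compressed = coded_bitrate_kbps(44100, 16, 1, reduction=10)

print(f"uncompressed: {uncompressed:.1f} kbit/s, "
      f"{hours_per_disc(650, uncompressed):.1f} h per 650 MB")
print(f"ten-to-one:   {compressed:.1f} kbit/s, "
      f"{hours_per_disc(650, compressed):.1f} h per 650 MB")
```

Under these assumptions the disc holds roughly two hours of uncompressed audio and roughly twenty hours at ten-to-one.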

Test and report on alignment of text with audio to provide efficient text search of spoken audio

Two programs are reported to do this: one by IBM is embedded in a large workstation system; the other, from Entropic, runs on an NT or UNIX host. Both programs are consistent with file structures discussed at the first NISO meeting. This technology promises to automate indexing of spoken audio by creating a file that links the audio to searchable text.
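The kind of link file described here can be pictured as a list of words with their start and end times in the recording. The sketch below is a minimal illustration of searching such a list; it does not reflect the IBM or Entropic file formats, and the record layout, function names, and timings are invented for the example.

```python
from collections import defaultdict

PUNCTUATION = '.,;:!?"'


def normalize(word):
    """Lower-case a word and strip surrounding punctuation."""
    return word.lower().strip(PUNCTUATION)


def build_index(aligned_words):
    """Map each word to the start times (in seconds) where it is spoken.

    `aligned_words` is a list of (word, start_seconds, end_seconds) tuples,
    the assumed output of an alignment program.
    """
    index = defaultdict(list)
    for word, start, _end in aligned_words:
        index[normalize(word)].append(start)
    return index


def find(index, word):
    """Return every playback offset at which `word` occurs."""
    return list(index.get(normalize(word), []))


# Invented alignment data for two short phrases.
aligned = [
    ("Chapter", 0.00, 0.42), ("one.", 0.42, 0.80),
    ("The", 1.20, 1.31), ("first", 1.31, 1.70), ("chapter", 1.70, 2.15),
    ("begins", 2.15, 2.60), ("here.", 2.60, 3.05),
]
index = build_index(aligned)
offsets = find(index, "chapter")   # offsets a player could jump to
```

A player would then seek to any returned offset, so a text search effectively becomes a search of the audio.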

Test and report on alternative controls of multimedia software

One possible approach is to experiment with simple programmable remote controls. This strategy allows development of user controls, particularly verbal feedback, independent of playback technology.
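The separation described here, user controls and verbal feedback on one side and playback technology on the other, can be sketched as a small controller that maps remote-control keys to player actions and spoken prompts. Every name below (the Controller and LoggingPlayer classes, the key codes, the prompts) is hypothetical and stands in for whatever remote control and playback software are eventually tested.

```python
class Controller:
    """Maps remote-control key codes to a player action plus a spoken prompt."""

    def __init__(self, player, speak):
        self.player = player    # any object exposing the bound method names
        self.speak = speak      # callable that voices a string to the user
        self.bindings = {}

    def bind(self, key, method_name, prompt):
        """Program `key` to call `method_name` on the player and say `prompt`."""
        self.bindings[key] = (method_name, prompt)

    def press(self, key):
        """Handle one key press; return True if the key was programmed."""
        if key not in self.bindings:
            self.speak("Unknown key")
            return False
        method_name, prompt = self.bindings[key]
        getattr(self.player, method_name)()
        self.speak(prompt)
        return True


class LoggingPlayer:
    """Stand-in playback backend that only records the calls it receives."""

    def __init__(self):
        self.calls = []

    def play(self):
        self.calls.append("play")

    def next_section(self):
        self.calls.append("next_section")


spoken = []                       # collects prompts instead of voicing them
player = LoggingPlayer()
remote = Controller(player, spoken.append)
remote.bind("1", "play", "Playing")
remote.bind("2", "next_section", "Next section")
remote.press("1")
remote.press("2")
remote.press("9")                 # unprogrammed key
```

Replacing LoggingPlayer with a different backend leaves the key bindings and prompts unchanged, which is the independence the approach aims for.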

Test and report on state-of-the-art text-to-speech algorithms

One example is Microsoft's Whistler synthesizer, which is said to sound "natural." (Samples definitely sound less robotic than the standard DECtalk.) Because the synthesizer is available only as C++ source code, we will need a compiler to carry out a suitable evaluation.

Test and report on digital recording, editing, and duplication methods using digital audio tape (DAT) and direct-to-disk recording

We have recorded one book on DAT in the NLS studios, and we are examining specifications and performance reports on direct-to-disk systems, such as those offered by Telex and Otari.

Test and report on the use of off-the-shelf multimedia authoring and presentation software for representing DTB segments

This software is interesting because of its widespread commercial use, inclusion of user controls, and variety of data types supported. Examples include Macromedia's Director and Asymetrix's Toolbook.

Evaluate products from Plextor, the DAISY Consortium, Recording for the Blind and Dyslexic, and other sources as they become available

John Cookson, Head, Engineering Section


Posted on 2006-05-30