UMI Announces Digital Vault Initiative

Users who have toiled over microforms to access literary works, historical documents, and old periodicals will cheer at the recent news from UMI that it will convert its vast microform collection to digital format. At the American Library Association Annual Conference this summer, UMI announced its plan to create the world's largest digital archival collection of printed works by scanning the contents of its microform collection covering 500 years of information.

That collection contains hundreds of thousands of books, newspapers, periodicals, and other materials stored in three temperature-controlled vaults at the company's headquarters in Ann Arbor, Michigan. UMI is calling this massive conversion from microform to electronic format the Digital Vault Initiative.

The Conversion

Scanning of the 5.5 billion page images began in May and will continue for several years. This project is in addition to the 37 million images of contemporary information that UMI adds to its existing digital collection each year.

According to Jeff Moyer, director of the Digital Vault Initiative, UMI is using five state-of-the-art digital scanners and is working 24 hours a day with three employee shifts. Quality control editors will check each page image and will separately index each illustration contained in the documents. The scanners are creating page images rather than performing optical character recognition (OCR) on the text itself. Each scanned work will be linked to a MARC record containing full bibliographic information, and those records will be fully searchable.

The first phase of the Digital Vault Initiative will focus on UMI's collection of early English literature, including nearly every English-language book published from the beginning of English printing around 1475 to 1700. This collection, begun in 1938 as UMI's first microfilm project, includes such works as Chaucer's The Canterbury Tales, Culpeper's The English Physician, and Shakespeare's renowned First Folio edition of 1623.

This collection comprises 9,600 individual titles with 22 million page images. UMI estimates that the collection represents about 75 percent of all books published in English during those years. A prerelease of this phase should be available by the end of this year, with the full contents of the collection available by June 1999. These literary and historical works will initially be searchable on the Web as a separate database and will eventually be searchable in the ProQuest Direct online service.

The company will concurrently scan the archives of its top 50 periodical titles, as ranked by microform sales. These will include key titles such as the full run of Time magazine, dating from the first issue in 1923. The periodical scanning should also be completed by June 1999. Next to be digitized will be full runs of important newspapers such as The New York Times, The Wall Street Journal, and the Chicago Tribune. The digitized periodicals and newspapers will be integrated into the ProQuest Direct service. …