Building Digital Audio Preservation Infrastructure and Workflows: UW Libraries Special Collections Holds More Than 3,000 Audio Recordings of Pacific Northwest Indigenous Languages. Format Migration Was Critical Because Many of the Languages Were Extinct or Endangered

Article excerpt

In 2009 the University of Washington (UW) Libraries special collections received funding for the digital preservation of its indigenous language audio holdings. The university libraries, where all four authors work in various capacities, had begun digitizing image and text collections in 1997. Because of this, at the outset of the project, core infrastructure and workflows (a metadata implementation group for help with schema development, storage architecture, schedules for backup, and procedures for recovery) were already in place. In addition, ResearchWorks Archive, an institutional repository built on the DSpace software platform, had been in place since 2003. Building a digital collection for noncommercial audio recordings, however, required additional infrastructure and workflows. Beyond the established workflow for metadata creation, recordings needed to be converted from analog to digital (bitstream) format, and a staging and ingest plan for the digital files and associated metadata had to be developed. This article addresses the development of this new infrastructure and workflow.

Background

UW Libraries special collections holds more than 3,000 audio recordings of Pacific Northwest indigenous languages. These recordings, dating from the 1950s through the 1990s, document the stories, vocabulary, grammar, and songs of more than 50 native dialects. The reel-to-reel and cassette tapes, magnetic formats that have limited life expectancies even when carefully stored, were rapidly degrading. Format migration was critical because many of the languages on the recordings were extinct or endangered.

Special collections secured funding to conduct a survey of the libraries' Pacific Northwest (PNW) audio holdings. Partial funding for the project came from the Jacobs Research Funds, an organization based in Washington state that grants funds to linguists conducting field research. Once the survey was complete, priorities for format migration were set based on the intellectual value of the recordings, as determined by UW linguistics faculty member Sharon Hargus and Jacobs Research Funds board member Pamela Amoss. With priorities assigned, digitization was chosen as the method of format migration, following current best practices as outlined by Sound Directions (a project of Indiana University; www.dlib.indiana.edu/projects/sounddirections). Funds for digitization of the recordings were provided by an outside source, the Salish Research Foundation.

Metadata

Metadata development for the project was divided into item record metadata and bitstream metadata. During the overall metadata planning phase, the Pacific Northwest curator, Blynne Olivieri, worked closely with the libraries' Metadata Implementation Group (MIG), coordinated by Theodore Gerontakos. Over several meetings, the curator and the MIG advisors discussed the need for rules to govern the creation of the metadata that would be associated with each Broadcast Wave Format (BWF) file. These meetings also addressed detailed aspects of metadata planning: consistent file naming, determining the unit to be described by a metadata record (a choice between the multitracked analog original and the digital file created from a single track), incorporation of appropriate standards, and metadata compatibility with DSpace, the chosen software platform. Another significant element of planning was ensuring that adequate descriptive metadata would be created so that the audio and its metadata could be successfully repurposed outside DSpace, which, in this phase of the project, was an administrative and preservation location, not a public access site. The curator then created a metadata schema, which served as the guide for the metadata structure, content, and rules.
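
To make the planning points above concrete, the following is a minimal sketch of how consistent file naming and a Dublin Core item record might fit together. It is an illustration only, not the project's actual schema or naming rules: the file name pattern (`pnw0042_t01.wav`), the function names, the choice of five elements, and the assumption of one record per single-track digital file are all hypothetical.

```python
import re
import xml.etree.ElementTree as ET

# Namespace of the simple Dublin Core element set.
DC_NS = "http://purl.org/dc/elements/1.1/"

# Hypothetical naming rule (an assumption, not the project's convention):
# lowercase collection code, four-digit tape number, "_t", two-digit track.
FILENAME_PATTERN = re.compile(
    r"^(?P<collection>[a-z]+)(?P<tape>\d{4})_t(?P<track>\d{2})\.wav$"
)


def parse_filename(name):
    """Split a file name into its parts, rejecting names that break the rule."""
    match = FILENAME_PATTERN.match(name)
    if match is None:
        raise ValueError(f"file name does not follow the naming rule: {name!r}")
    return {
        "collection": match["collection"],
        "tape": int(match["tape"]),
        "track": int(match["track"]),
    }


def build_item_record(filename, title, language):
    """Build a small Dublin Core record describing one single-track file."""
    parse_filename(filename)  # fail early if the name is inconsistent
    ET.register_namespace("dc", DC_NS)
    root = ET.Element("record")
    fields = (
        ("identifier", filename),
        ("title", title),
        ("language", language),
        ("format", "Broadcast Wave Format (BWF)"),
        ("type", "Sound"),  # term from the DCMI Type Vocabulary
    )
    for element, value in fields:
        child = ET.SubElement(root, f"{{{DC_NS}}}{element}")
        child.text = value
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    print(build_item_record("pnw0042_t01.wav", "Example story", "Example language"))
```

Checking the file name before building the record reflects the idea that naming rules and record rules are enforced together, so that a file and its description stay linked if both are later moved out of the repository.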

The final schema for the item record metadata contained 18 essential Dublin Core elements or fields. …