10 Ways to Improve Data Quality: With a Coordinated Effort, Your Library Can Make Significant Progress in Cleaning Up Its Online Catalog
Beall, Jeffrey, American Libraries
Have you ever found a book listed in your online catalog but couldn't find it on the shelf, even though it wasn't checked out and you looked for it over a period of weeks? Or have you had the experience of finding a book in the stacks that contains the information you were seeking, and you wonder why it didn't turn up in the searches you'd done? Both of these scenarios can occur when data is not properly maintained in library catalogs or when steps haven't been taken to ensure the books in the stacks match what's listed in the catalog. In many libraries, these glitches are often overlooked because they are no one's responsibility.
Dirty data in online library catalogs--including data with typographical errors, misspellings, and incorrect, excess, or missing information--can severely hamper patrons' access to library materials. Access is restricted when a catalog searcher does not retrieve sought-after items that would otherwise turn up in search results, or when a user retrieves a large number of false or duplicate hits. For example, if a patron is looking for a book about nutrition and doesn't know the author or title, and the subject heading on the bibliographic record for the book is misspelled as "nutritution," the record will not be included among the search results and the user may never find the item.
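The "nutritution" example can be made concrete with a small sketch. It assumes a hypothetical three-record catalog and a simple exact-match subject search; the titles, the `RECORDS` list, and both function names are invented for illustration and do not represent any particular integrated library system.

```python
# Hypothetical mini-catalog: each record is (title, subject heading).
# The titles and the exact-match search are illustrative assumptions.
RECORDS = [
    ("Eating Well", "Nutrition"),
    ("Food and Health", "Nutritution"),  # typo in the subject heading
    ("Garden Basics", "Gardening"),
]


def subject_search(records, heading):
    """Return titles whose subject heading matches exactly, ignoring case."""
    wanted = heading.strip().lower()
    return [title for title, subject in records
            if subject.strip().lower() == wanted]


def missed_by_typo(records, heading, typo):
    """Return titles filed under the misspelled heading that a search
    on the correct heading does not retrieve."""
    found = set(subject_search(records, heading))
    return [t for t in subject_search(records, typo) if t not in found]


hits = subject_search(RECORDS, "nutrition")
missed = missed_by_typo(RECORDS, "nutrition", "nutritution")
print(hits)    # titles the patron actually sees
print(missed)  # titles hidden by the misspelled heading
```

In this sketch the patron's search returns only the correctly headed record. The second book is on the shelf but stays hidden until someone corrects its heading.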
Bad and poorly maintained data in library catalogs costs time, money, and information. Time is lost when materials can't be located because of missing or incorrect data in the catalog. These materials may then have to be ordered through interlibrary loan, which costs money. And information is lost when it's not accessible in the catalog, or when searches fail to retrieve listings for books and other materials that would have turned up had the data been entered and maintained correctly.
Most of what I describe here represents standard quality-control practice in library technical services. However, because libraries often emphasize the fancy technological aspects of integrated library systems, we sometimes neglect to pay sufficient attention to the intellectual content and clerical accuracy of those systems--that is, the bibliographic data and its maintenance.
Bad data in an online catalog is the modern equivalent of a missing card in a card catalog, and although most of what I recommend is tailored to fully automated catalogs, the principles apply to all libraries. Whether catalog data is stored on catalog cards or on a computer server, it should be fully standard, up-to-date, complete, and accurate.
Low-quality data in library online public access catalogs is a topic not often discussed among librarians, but many of us have seen glaring typos and other errors when using our catalogs. Many databases are riddled with mistakes. Even the Library of Congress's catalog has errors. Other professions strive to reduce and eliminate mistakes, and librarianship should be no exception.
Fortunately, with a coordinated effort, a library can make significant progress in cleaning up its online catalog. To achieve this progress, I recommend that libraries take the 10 steps listed here. The first eight steps represent sound technical services practice. They cover, in order, tasks that are part of intake procedures, tasks that correct past errors, tasks that arise from changed and updated cataloging standards, and general data cleanup. The final two steps are not specific to technical services; they are goals that all library staff can work on together.
1 Perform authority work on all controlled headings. This means checking to make sure every heading--authors, subjects, series--agrees with the established form on the authority record. Authority control can be manual or automated. It should be done for the headings on all new items entering the catalog.
2 Follow national cataloging standards in all aspects for all library materials. Doing this helps ensure collocation of materials in the online catalog and consistency in retrieval. …