The replacement of the card catalog by the online catalog brought with it a great resurgence of interest in the problems of subject access in general. This is hardly surprising, given that the online catalog promised subject search capabilities substantially better than those offered by its predecessor.

Many studies on how to improve subject searching in online catalogs have already been performed. The approaches most frequently investigated can be grouped into five broad categories:

1. Those that rely on improved or more flexible approaches to the searching of elements (e.g., subject headings) already commonly searched.
2. Those that extend search capabilities to more elements in existing bibliographic records.
3. Those that would enhance existing bibliographic records by adding further searchable elements.
4. Those that would make further searching aids available to the library user.
5. Those concerned with usefully limiting the number of records retrieved in simple search approaches (e.g., single keyword in title) that would otherwise cause an unacceptably large retrieval from a database of any significant size.

Examples of the first group include studies involving improved word-stemming, techniques for the approximate matching of words (e.g., phonetic spelling), and the ability to perform keyword searches on subject headings (e.g., Walker,[1] Walker and Jones,[2] and Lester[3]). The second group, also exemplified by Lester,[4] looks at complete bibliographic records and determines how much retrieval would be improved were all fields equally searchable.
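The techniques in the first group can be illustrated with a minimal sketch. It is not taken from any of the studies cited. It assumes a deliberately crude suffix-stripping stemmer (a stand-in for a real stemming algorithm), the standard American Soundex code as one possible form of phonetic matching, and a short list of illustrative heading strings. The names `crude_stem`, `soundex`, `search_headings`, and `HEADINGS` are all hypothetical.

```python
import re

# Standard American Soundex digit groups.
CODES = {}
for letters, digit in [("bfpv", "1"), ("cgjkqsxz", "2"), ("dt", "3"),
                       ("l", "4"), ("mn", "5"), ("r", "6")]:
    for ch in letters:
        CODES[ch] = digit


def soundex(word):
    """Return the four-character Soundex code of a word ('' if no letters)."""
    letters = [c for c in word.lower() if c.isalpha()]
    if not letters:
        return ""
    result = letters[0].upper()
    prev = CODES.get(letters[0], "")
    for ch in letters[1:]:
        if ch in "hw":
            continue  # h and w do not separate letters with the same code
        code = CODES.get(ch, "")  # vowels and y get no code
        if code and code != prev:
            result += code
        prev = code
    return (result + "000")[:4]


def crude_stem(word):
    """Strip a few common English suffixes; an assumption, not a real stemmer."""
    word = word.lower()
    for suffix in ("ies", "ing", "es", "s"):
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)] + ("y" if suffix == "ies" else "")
    return word


def tokens(text):
    """Split text into lowercase alphabetic words."""
    return re.findall(r"[a-z]+", text.lower())


def word_matches(query_word, heading_word):
    """Two words match if they share a stem or a Soundex code."""
    return (crude_stem(query_word) == crude_stem(heading_word)
            or soundex(query_word) == soundex(heading_word))


def search_headings(query, headings):
    """Keyword search: every query word must match some word in the heading."""
    q_words = tokens(query)
    return [h for h in headings
            if q_words and all(any(word_matches(q, w) for w in tokens(h))
                               for q in q_words)]


# Illustrative heading strings only; not checked against an authority file.
HEADINGS = [
    "Libraries--Automation",
    "Online library catalogs",
    "Subject headings, Library of Congress",
    "Cataloging",
]

print(search_headings("library catalog", HEADINGS))  # ['Online library catalogs']
print(search_headings("catalogue", HEADINGS))        # ['Online library catalogs', 'Cataloging']
```

In this sketch the variant spelling "catalogue" reaches both "catalogs" and "Cataloging" only through the phonetic code, which hints at the trade-off involved: looser matching tends to raise recall at some cost in precision.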

The third group recognizes that subject access might be improved considerably were existing bibliographic records enhanced by the addition of further access points taken, for example, from tables of contents or back-of-the-book indexes. This approach can be traced back some years (e.g., Atherton,[5] Wormell[6]). Recently, Byrne and Micco discovered, not surprisingly, that greatly improved recall could be obtained when MARC records in a database were enhanced by adding to each an average of twenty-one multiword terms drawn from indexes and tables of contents.[7] Using a somewhat different approach, Diodato confirmed that terms used by readers to describe books do tend to match terms occurring in indexes and tables of contents.[8]
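The effect of record enhancement on recall can be made concrete with a toy calculation. The sketch below does not reproduce the method or the results of any study cited. The four records, their added terms, and the relevance judgments are invented for illustration, and the `Record` class is a hypothetical stand-in for a MARC record. Recall is computed in the usual way, as the number of relevant records retrieved divided by the number of relevant records in the database.

```python
from dataclasses import dataclass, field


@dataclass
class Record:
    rec_id: str
    title: str
    subjects: list
    # Terms taken from the table of contents or index (hypothetical data).
    extra_terms: list = field(default_factory=list)


def searchable_text(rec, enhanced):
    """Join the searchable fields; include added terms only if enhanced."""
    parts = [rec.title] + rec.subjects
    if enhanced:
        parts += rec.extra_terms
    return " ".join(parts).lower()


def retrieve(query, records, enhanced):
    """Return the ids of records whose searchable text contains the phrase."""
    q = query.lower()
    return {r.rec_id for r in records if q in searchable_text(r, enhanced)}


def recall(retrieved, relevant):
    """Relevant records retrieved divided by all relevant records."""
    return len(retrieved & relevant) / len(relevant) if relevant else 0.0


# Invented records and relevance judgments, for illustration only.
RECORDS = [
    Record("r1", "Acid rain and forests", ["Acid rain"]),
    Record("r2", "Air pollution in Europe", ["Air--Pollution"],
           ["acid rain", "sulfur dioxide emissions"]),
    Record("r3", "Freshwater ecology", ["Freshwater ecology"],
           ["lake acidification", "acid rain"]),
    Record("r4", "Urban transport planning", ["Urban transportation"],
           ["bus lanes"]),
]
RELEVANT = {"r1", "r2", "r3"}

basic = retrieve("acid rain", RECORDS, enhanced=False)
rich = retrieve("acid rain", RECORDS, enhanced=True)
print(sorted(basic), recall(basic, RELEVANT))
print(sorted(rich), recall(rich, RELEVANT))
```

With these invented data, only one of the three relevant records is found through title and subject heading alone, while all three are found once the added terms become searchable.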

The fourth group of studies looks at the effect of making additional searching aids available to catalog users. Bates proposes two such tools that could be used in existing catalogs based on Library of Congress Subject Headings (LCSH): an end-user thesaurus, which is basically a vast entry vocabulary, and a semantic network, incorporating the entry terms, that allows a searcher to select from a variety of methods for generating semantic associations. She does not, however, actually test them.[9] The most obvious searching aid would be a subject authority file incorporating cross-references. Lester found that such an authority file had relatively little effect on the ability of catalog users to match their subject terms with LCSH headings, while Van Pulis and Ludy found that subject authority files are little used even when made available online.[10],[11] Jamieson et al. have compared the value of the authority control approach with the ability to perform keyword searches in complete bibliographic records.[12]

Many keyword searches in large online catalogs would be successful in the sense that they would retrieve relevant items. But they would also retrieve substantial numbers of irrelevant items, and would bring out so many records that the user would be discouraged from proceeding further. …

