Search Patterns of Remote Users: An Analysis of OPAC Transaction Logs

The focus of this study is the search behavior of remote users of the University of California MELVYL Library System, an online public access catalog (OPAC). Transaction logs from randomly selected remote user search sessions are analyzed. Descriptive data on the number and type of searches, choice of search mode and database, number of retrievals, number and type of errors, and use of system HELP facilities are presented. The search data have been cross-tabulated with demographic data on the same group of remote users, collected through an online survey conducted by the authors. Effectiveness of system usage is discussed. A case is made for the desirability of additional heuristics in the catalog portion of the system.

The MELVYL Library System of the University of California (UC) first became accessible outside the library setting in the mid-1980s. Remote usage has risen steadily since that time and typically accounts for more than one-third of the half-million queries entered in the system each week during peak usage periods.[1] In an effort to understand more fully this growing user population, the investigators undertook a two-part study. The findings of the first part of the study, an online survey of users who accessed the MELVYL system from outside the library setting, were reported previously.[2] The present report contains the results of the second phase of the study. In this phase, the investigators coded selected data from the transaction logs of the surveyed group, used microcomputer programs to compare those data by user status and other user characteristics, and then visually reexamined the user command portions of many of the individual logs to gain further insight into user search behavior.
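The comparison of coded log data by user status can be pictured as a simple cross-tabulation. The following Python sketch is illustrative only and is not the investigators' program: the status labels, the field names (`searches`, `errors`), and every value in `sessions` are hypothetical placeholders, not study data.

```python
from collections import Counter, defaultdict

# Hypothetical coded sessions (NOT study data): each record pairs a user
# status taken from a survey response with counts coded from that user's
# transaction log.
sessions = [
    {"status": "faculty", "searches": 4, "errors": 1},
    {"status": "student", "searches": 7, "errors": 3},
    {"status": "faculty", "searches": 2, "errors": 0},
    {"status": "staff", "searches": 5, "errors": 2},
    {"status": "student", "searches": 3, "errors": 0},
]


def crosstab_by_status(records, fields):
    """Total the coded fields for each user status and count the sessions."""
    totals = defaultdict(Counter)
    for rec in records:
        row = totals[rec["status"]]
        row["sessions"] += 1
        for field in fields:
            row[field] += rec[field]
    return {status: dict(counts) for status, counts in totals.items()}


table = crosstab_by_status(sessions, ["searches", "errors"])

# Print one row per status: sessions, total searches, total errors.
for status in sorted(table):
    row = table[status]
    print(status, row["sessions"], row["searches"], row["errors"])
```

Keeping one record per session, rather than pre-aggregated totals, makes it possible to return to an individual log when a group total looks unusual.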

THE MELVYL LIBRARY SYSTEM

The MELVYL system provides access to nearly eight million monograph and periodical titles held principally by libraries of the University of California.[3] In addition, the system offers its users access to several periodical index databases and serves as a gateway to many other specialized databases and library catalogs. Users may access this rich array of resources directly from their homes, offices, or other sites, through dial-up or networked connections.

The MELVYL system began as a prototype online catalog for the University of California, a nine-campus, doctorate-granting institution, which currently supports a main library on each campus, nearly one hundred branch and specialized libraries across the system, and an enrollment of more than 166,000 students. The system serves as a union catalog, to which the campus cataloging agencies contribute their records. Most of the campus libraries implemented local online catalogs during the 1980s; these serve as their primary catalogs and as a gateway to the UC union catalog residing within the MELVYL system.

After a decade of development, the catalog portion of the MELVYL system has achieved the status of a second-generation OPAC. It generally reflects Charles Hildreth's hypothetical construct of features that constitute "a qualitative leap of progress over first-generation online catalogs."[4] For example, the system supports keyword access to a variety of fields, explicit Boolean search logic, limiting capabilities, optional and automatic truncation of search terms in some kinds of queries, extensive help facilities (including contextual help screens), and multiple display formats. Some examples of special processing introduced to improve retrieval are the "normalization" of search terms in several fields and the treatment of title words as "exact" titles under certain conditions.[5] Appendix A contains a summary description of system commands and indexes.[6]
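To make these features concrete, the sketch below shows in Python how keyword access, term normalization, truncation, and explicit Boolean AND can fit together over a small inverted index. It is a toy model under stated assumptions, not a description of the MELVYL implementation: the normalization rule (lowercasing and stripping punctuation), the function names, and the sample titles are all hypothetical.

```python
import re
from collections import defaultdict


def normalize(term):
    """Assumed normalization: lowercase and drop non-alphanumeric characters."""
    return re.sub(r"[^a-z0-9]", "", term.lower())


def build_index(records):
    """Map each normalized title word to the set of record ids containing it."""
    index = defaultdict(set)
    for rec_id, title in records.items():
        for word in title.split():
            key = normalize(word)
            if key:
                index[key].add(rec_id)
    return index


def lookup(index, term, truncate=False):
    """Return ids matching a term exactly, or by prefix when truncate=True."""
    key = normalize(term)
    if not truncate:
        return set(index.get(key, set()))
    matches = set()
    for word, ids in index.items():
        if word.startswith(key):
            matches |= ids
    return matches


def boolean_and(index, terms, truncate=False):
    """Intersect the result sets of all terms (explicit Boolean AND)."""
    results = [lookup(index, t, truncate) for t in terms]
    return set.intersection(*results) if results else set()


# Hypothetical sample records, invented for illustration.
titles = {
    1: "Online Catalogs: A Users' Guide",
    2: "Cataloging Rules for Libraries",
    3: "The Online Library Catalog",
}
index = build_index(titles)

exact = boolean_and(index, ["online", "catalog"])
truncated = boolean_and(index, ["online", "catalog"], truncate=True)
```

In this toy data, the exact search misses record 1 because its title contains "Catalogs" rather than "catalog", while the truncated search retrieves it. A system of this kind applies whatever matching rule the searcher requests and does nothing further on its own.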

Hildreth states that researchers involved in information retrieval generally acknowledge that "today's conventional keyword-indexed, inverted file, Boolean logic search and retrieval systems like BRS, DIALOG.... LEXIS-NEXIS (and all second-generation OPACs) are powerful and efficient but are dumb, passive systems which require resourceful, active, intelligent human searchers to produce acceptable results. …