SmartSearch: Automated Recommendations Using Librarian Expertise and the National Center for Biotechnology Information's Entrez Programming Utilities

Librarians are in the recommendation business. Our customers rely on us to recommend what they should read, which database is preferable to another, or which textbook might answer a background question. As digital gate counts increase and outpace traditional face-to-face interactions [1], the need to integrate librarian recommendations into digital systems grows. SmartSearch represents an automated approach to offering digital expert guidance to customers.

The Lane Medical Library & Knowledge Management Center provides information access and knowledge management services for the Stanford University School of Medicine, Stanford Hospital, and the Lucile Packard Children's Hospital. Lane's mission is to get the right knowledge, to the right person, at the right time, in the right context to support translational research, innovative education, and advances in patient care. This is largely accomplished via the LaneConnex web interface [2], a library search platform that performs a metasearch across hundreds of licensed and open access knowledge resources.

Like many academic health libraries, Lane's clinical collection consists of thousands of electronic journals and textbooks. This wealth of knowledge is daunting to users who are often overwhelmed by the sheer quantity of information. Lane's usage statistics show that clinical users consistently overlook expensive clinical resources (e.g., specialty textbooks from AccessMedicine, MDConsult, and Ovid) that librarians have selected for their high value and clinical relevance. SmartSearch addresses this issue.

The goal of the SmartSearch project is to recommend a small number of infrequently consulted, high-value, clinically relevant resources in the context of a standard LaneConnex search. SmartSearch is a resource promotion tool that leverages librarian expertise with the Entrez Programming Utilities (E-Utilities) [3] from the National Center for Biotechnology Information (NCBI) to mimic the information-seeking behavior of a typical reference librarian. This paper describes the system's design, its development, and an evaluation of its effectiveness.


The SmartSearch design team consisted of two biomedical librarians, two software developers, a web production specialist, and an interface designer. In designing a recommender system to return "optimally appropriate" clinical resources for a given user query, three general approaches were considered: item-to-item correlation, people-to-people correlation, and attribute-based recommendations [4]. Given an absence of usable user preference data, item-to-item and people-to-people approaches were quickly discarded. Approaches based on popularity alone were also problematic because the resources targeted for recommendation were so lightly used. Given these constraints, an attribute-based recommendation system was selected.
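To make the attribute-based approach concrete, the following is a minimal sketch in Python, not the SmartSearch implementation. It assumes each resource carries a set of descriptive subject attributes and ranks resources by how many attributes they share with the query. The resource names, attribute values, and the `recommend` function are all hypothetical and chosen only for illustration.

```python
# Hypothetical resource attributes (illustrative only, not Lane's actual data).
RESOURCE_ATTRIBUTES = {
    "Oncology textbook": {"Neoplasms", "Antineoplastic Agents"},
    "Cardiology handbook": {"Heart Diseases", "Arrhythmias, Cardiac"},
    "Drug reference": {"Antineoplastic Agents", "Drug Interactions"},
}


def recommend(query_attributes, resource_attributes, limit=3):
    """Rank resources by the number of attributes shared with the query.

    No usage or preference data is consulted, so lightly used resources
    can still be recommended when their attributes match.
    """
    wanted = set(query_attributes)
    scored = []
    for resource, attributes in resource_attributes.items():
        overlap = len(attributes & wanted)
        if overlap > 0:
            scored.append((resource, overlap))
    # Highest overlap first; ties are broken alphabetically for stable output.
    scored.sort(key=lambda pair: (-pair[1], pair[0]))
    return [resource for resource, _ in scored[:limit]]


if __name__ == "__main__":
    print(recommend(["Neoplasms", "Antineoplastic Agents"], RESOURCE_ATTRIBUTES))
```

Because the score depends only on the resource descriptions, this kind of scheme avoids the cold-start problem that ruled out the correlation-based and popularity-based approaches.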

SmartSearch was first deployed in November 2007. Results appear in the recommendation area of LaneConnex, an area also used for spelling suggestions and exact journal title matches.

SmartSearch recommendations are driven by handcrafted maps from Medical Subject Headings (MeSH) to resources, created and maintained by Lane librarians. Recommendations are drawn from a pool of 156 clinical metasearch targets. A metasearch target is a remote resource that Lane's metasearch application searches at either the individual title level (e.g., Abeloff's Clinical Oncology or The American Journal of Bioethics) or across an entire collection (e.g., Clin-eguide, Micromedex). Librarians selected 130 of the 156 metasearch targets to be included in the SmartSearch project. Approximately 85% of these 130 resources are individual clinical textbook, handbook, or atlas titles. To build maps of MeSH terms to recommended resources, Lane librarians were initially aided by the detailed descriptive work done by Lane's cataloging staff. Each resource was described in the Lane catalog using MeSH, and a list of all metasearch targets was then extracted and sorted by MeSH term. …
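The extraction step described above, in which catalog records are turned into a list of metasearch targets sorted by MeSH term, can be sketched as a simple inversion of the catalog. This is an illustrative Python sketch only. The record layout, the placeholder target names, the MeSH assignments, and the `build_mesh_map` function are assumptions, not Lane's actual catalog data or code.

```python
from collections import defaultdict

# Hypothetical catalog records: (metasearch target, MeSH terms assigned by catalogers).
# Target names and term assignments are placeholders for illustration.
CATALOG = [
    ("Target A (oncology textbook)", ["Neoplasms", "Medical Oncology"]),
    ("Target B (ethics journal)", ["Bioethics"]),
    ("Target C (drug collection)", ["Drug Therapy", "Neoplasms"]),
]


def build_mesh_map(catalog):
    """Invert target -> MeSH records into a MeSH term -> targets map.

    The returned dict is ordered alphabetically by MeSH term, and each
    term's targets are sorted, giving a starting list for manual curation.
    """
    mesh_map = defaultdict(list)
    for target, mesh_terms in catalog:
        for term in mesh_terms:
            mesh_map[term].append(target)
    return {term: sorted(targets) for term, targets in sorted(mesh_map.items())}


if __name__ == "__main__":
    for term, targets in build_mesh_map(CATALOG).items():
        print(f"{term}: {', '.join(targets)}")
```

A map generated this way would only be a first draft. The source describes the final maps as handcrafted and maintained by librarians, so any automated output would still need to be reviewed and edited by hand.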