Magazine article Information Today

What's in a Space? Find out Now

Article excerpt

Phrase searching, or compound term searching, in online and CD-ROM databases has long puzzled many casual searchers. Carol Tenopir [1] and Marydee Ojala [2] have written about how various systems differ in interpreting search phrases. There have been two extremes in interpreting user queries that consist of two or more words separated by a space.

The Ultrarestrictive Interpretation

DIALOG represented one extreme. It interpreted the query "select INFORMATION STORAGE" as the equivalent of saying "select all those records from the database where INFORMATION STORAGE is the exact descriptor or identifier."

It did not find records where the title, abstract, or full text included the words "information storage" next to each other in this order. The words had to be in the descriptor field or in the identifier field (in databases that had an identifier field). Not only that, but the descriptor or identifier had to match the query exactly: descriptors or identifiers such as INFORMATION STORAGE TECHNIQUES or INFORMATION STORAGE & RETRIEVAL did not qualify as a match. To retrieve records that include this compound term in the title, abstract, or full-text fields, or as part of a longer descriptor or identifier, the user had to enter "information (w) storage." This is counterintuitive and illogical even for seasoned searchers, as compound terms such as drug abuse, United States, deficit reduction, New Mexico, or even attention deficit disorder just flow from the fingers of anyone making a search.
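The difference between the two readings can be sketched in a few lines of Python. This is only an illustration of the matching logic described above, not DIALOG's actual implementation: the record, its field names, and the function names are all hypothetical.

```python
import re

# Hypothetical record; field names and contents are illustrative assumptions.
RECORD = {
    "title": "Advances in information storage for libraries",
    "abstract": "We survey optical media.",
    "descriptors": ["INFORMATION STORAGE & RETRIEVAL", "OPTICAL DISKS"],
}


def words(text):
    """Lowercase the text and split it into alphanumeric words."""
    return re.findall(r"[a-z0-9]+", text.lower())


def exact_descriptor_match(record, phrase):
    """Restrictive reading: the phrase must equal an entire descriptor."""
    target = words(phrase)
    return any(words(d) == target for d in record["descriptors"])


def adjacent_in_order(record, phrase):
    """(w)-style reading: the words occur next to each other, in order,
    in the title, the abstract, or any descriptor."""
    target = words(phrase)
    fields = [record["title"], record["abstract"], *record["descriptors"]]
    for text in fields:
        tokens = words(text)
        for i in range(len(tokens) - len(target) + 1):
            if tokens[i:i + len(target)] == target:
                return True
    return False
```

With this sample record, exact_descriptor_match(RECORD, "information storage") is false, because the only candidate descriptor is the longer INFORMATION STORAGE & RETRIEVAL, while adjacent_in_order(RECORD, "information storage") is true.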

In a library whose databases run on various search software programs, this is especially confusing and illogical for patrons. In a database with the UMI software, the query "information storage" retrieves records where the two terms are next to each other in this order in the fields that are used to create the basic index. In SilverPlatter databases, the software retrieves records where the component words are in any order. While the latter may backfire by retrieving both school library and library school, or surgeon general and general surgeon, these lenient approaches are much better than the ultrarestrictive interpretation of DIALOG, especially in indexing or abstracting-and-indexing databases. In databases running the EBSCO software, the query retrieves records where the terms are not more than 20 words apart, and the proximity can be changed. Database-dependent stopwords in some software further mystify the issue for casual users.
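To make the three lenient readings concrete, here is a rough Python sketch of each one applied to a two-word query. None of this is vendor code. The function names are invented, the "any order" reading is modeled as adjacency in either order (the description above does not say whether adjacency is required), and "words apart" is taken to mean the difference between word positions, in either direction.

```python
import re


def tokenize(text):
    """Lowercase the text and split it into alphanumeric words."""
    return re.findall(r"[a-z0-9]+", text.lower())


def positions(tokens, word):
    """Return every index at which the word occurs in the token list."""
    word = word.lower()
    return [i for i, token in enumerate(tokens) if token == word]


def ordered_adjacent(text, first, second):
    """Words next to each other, in the order typed."""
    tokens = tokenize(text)
    return any(b - a == 1
               for a in positions(tokens, first)
               for b in positions(tokens, second))


def adjacent_any_order(text, first, second):
    """Words next to each other, in either order (assumed reading)."""
    tokens = tokenize(text)
    return any(abs(b - a) == 1
               for a in positions(tokens, first)
               for b in positions(tokens, second))


def within_distance(text, first, second, max_gap=20):
    """Words no more than max_gap positions apart; the limit is adjustable."""
    tokens = tokenize(text)
    return any(0 < abs(b - a) <= max_gap
               for a in positions(tokens, first)
               for b in positions(tokens, second))
```

Under these assumptions, a query for "school library" run against the text "She attended library school" fails the ordered test but passes the any-order test, which is exactly the school library versus library school problem.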

Now, finally, DIALOG has changed its excessively restrictive interpretation of the space character by introducing the FIND command. It can do all the tricks that the SELECT command can (but not SuperSelect), and it interprets such queries in a common-sense way: words next to each other in this order. That is, find INFORMATION STORAGE will retrieve records where this pair of words occurs in this order in any of the fields that are used to create the basic index, including descriptors and identifiers that have this compound term as part of longer ones. This is intuitive. This is reasonable. I had been begging for it for 20 years privately and publicly, and I had given up on it as Columbus had given up on finding India after being marooned on the shores of Jamaica. Now this feature has arrived without fanfare, to be explored by yours truly on Columbus Day.

The Ultralenient Interpretation

Interestingly, the other extreme interpretation (still in existence, by the way) comes from DataStar, Knight-Ridder Information's other online service. …
