I'll admit it. When I first read the February 1993 CHRONOLOG description of DIALOG's new RANK command, I was confused. I was awed. I was glad. And I was intrigued.
Clearly, RANK is powerful--more powerful than ORBIT's GET command when it was explored in this space three years ago. After studying several examples of how to use it (competitive analysis, market analysis, trend analysis, patent analysis, locating experts, and "improving search results"), I knew that I could find uses for RANK.
Multiple options for use with RANK breed confusion:
* adding titles
* receiving continuous output
* sorting in alphabetic order
* sorting in descending order
* obtaining detailed percentages, total database counts, and file-specific counts
* combining ranked numbers
* use in OneSearch
Faced with all these options, I wondered if I would ever be able to understand the process, make appropriate choices, and interpret the display. The potential and complexity of RANK are enough to paralyze action among foot soldiers. How could I best harness the power of this command to achieve results?
I decided to concentrate on uses of RANK with which I was already familiar. If I could understand the modest applications first, then I might proceed to test the really powerful advanced uses. I tried RANK on three questions in connection with a search I was doing on the renewed spread of tuberculosis (TB) in the United States. This column presents the candid results of those first trials.
SEARCH 1: USING RANK TO SELECT SEARCH TERMS
To survey recent medical research on the spread of tuberculosis, I planned to use the MEDLINE and Health Planning and Administration databases. I am not a medical specialist, and I do not have ready access to the printed MeSH subject headings. Both databases have an excellent online thesaurus, which I am accustomed to using to select the correct highly controlled search terms. First I tried to find the best search terms in my usual way--by EXPANDing on a likely basic search term and paging through the display (Figure 1).
TUBERCULOSIS --EPIDEMIOLOGY --EP and TUBERCULOSIS --PREVENTION AND CONTROL --PC emerge as the likely search term candidates. But wait! What's the rest of this display?
E46 0 2 TUBERCULOSIS VACCINES
E47 24 8 TUBERCULOSIS, AVIAN
E48 1 TUBERCULOSIS, AVIAN
I entered P for more. I kept entering P for several minutes while I discovered just how many types of tuberculosis there are: avian, bovine, cardiovascular, cutaneous, etc. There were all the descriptors using the term "tuberculosis," each followed by the full range of linked subheadings, including Epidemiology and Prevention And Control. It took almost ten minutes to show the full range of all related descriptors. (Yes, of course I would have abandoned this technique if I had not been trying to gauge just how much online time the use of RANK might save.)
Next, I performed a free-text search using the terms that my client and I had already selected. I SELECTed tuberculosis (s) (therapy or treat? or control? or interdict? or contact? or spread or epidemi?). Receiving 9,989 postings, I then issued the command to RANK the set by descriptors and anticipated seeing a display of the most appropriate search terms.
I immediately realized my tactical error: RANKing takes time. DIALOG reported on its progress in RANKing by issuing confirmations every 500 records. Still, I should have cut the time down by first limiting the set to be RANKed by such criteria as publication year, English language, Human, Abstracts, or Major descriptors.
Lesson learned: Work with the smallest possible set.
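For readers who think in code, the idea behind RANKing a set by descriptors, and the payoff from shrinking the set first, can be sketched in a few lines of Python. This is only an illustration and makes no claim about how DIALOG does it internally: the record layout, the function name rank_descriptors, the keep filter, and the sample records are all invented for the example.

```python
from collections import Counter

def rank_descriptors(records, top_n=5, keep=None):
    """Tally descriptor phrases across a record set, most frequent first.

    records: iterable of dicts with a 'descriptors' list (hypothetical layout).
    keep:    optional predicate applied first, so fewer records are counted.
    """
    if keep is not None:
        records = [r for r in records if keep(r)]
    counts = Counter()
    for record in records:
        # Count each descriptor phrase once per record.
        counts.update(set(record["descriptors"]))
    # Descending by count; ties broken alphabetically.
    ranked = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    return ranked[:top_n]

# Invented sample records, not real search output.
EPI = "TUBERCULOSIS--EPIDEMIOLOGY"
PC = "TUBERCULOSIS--PREVENTION AND CONTROL"
AVIAN = "TUBERCULOSIS, AVIAN"
sample = [
    {"year": 1992, "descriptors": [EPI, PC]},
    {"year": 1991, "descriptors": [EPI]},
    {"year": 1985, "descriptors": [AVIAN, EPI]},
    {"year": 1992, "descriptors": [PC]},
]

print(rank_descriptors(sample))
print(rank_descriptors(sample, keep=lambda r: r["year"] >= 1990))
```

The second call applies the limit before counting, which is the point of the lesson: every record that passes the filter still has to be read, so the smaller the set, the less work the ranking step has to do.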
Eventually, all 9,989 records were crunched and I was rewarded with the display in Figure 2.
What went wrong? DIALOG says that RANK works with phrase-indexed Additional Index fields, and my Bluesheets for Files 154 and 151 show that the DEscriptor field is phrase-indexed. …