Tuneups: Search Engines Need Help
Margot Williams 1996, The Washington Post, St Louis Post-Dispatch (MO)
Aimless exploring of the World Wide Web is fun, but most times directions are welcome.
Search engines such as InfoSeek, Lycos, Excite, Alta Vista, Yahoo, Open Text and the new kid HotBot (and many more lesser-knowns) are the piloting tools that aim to bring order to the Web. They're meant to make it possible to find that needle of information in the electronic haystack.
Many are compiled by tireless robot programs, sometimes called spiders, that roam the Web. With such nicknames as Slurp and Scooter, they pursue paths to and through public Web sites, taking note of what's there, and returning home with text to add to the search engines' databases.
But as the Web and the databases expand, it's getting harder and harder to find what you want. Type a single keyword into a search engine and you might be directed to tens of thousands of pages. With some smart searching, though, you can narrow the results down.
As an example, let's take HotBot, which started up in May with an index of 36 million pages and is set to move to more than 50 million, according to Kevin Brown, marketing director at Inktomi Corp., which developed the technology behind the engine.
Enter the word census and 120,710 pages match.
But the service did rank the results, putting first the pages its software believes are the most relevant. Documents score high if the keyword is in the title or first few words of the pages, and if they contain more repetitions of the word than other pages do.
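To make that scoring idea concrete, here is a minimal sketch in Python. It is not HotBot's actual formula, which isn't public in this column; the weights (10 for a title match, 5 for an early match, 1 per repetition), the 25-word "first few words" window and the function names are all assumptions chosen for illustration.

```python
import re


def tokenize(text):
    """Lowercase the text and split it into alphanumeric words."""
    return re.findall(r"[a-z0-9]+", text.lower())


def score(keyword, title, body, lead_words=25):
    """Toy relevance score for one page.

    The weights and the lead_words window are illustrative assumptions,
    not values taken from any real search engine.
    """
    kw = keyword.lower()
    title_words = tokenize(title)
    body_words = tokenize(body)

    points = 0.0
    if kw in title_words:               # keyword appears in the title
        points += 10.0
    if kw in body_words[:lead_words]:   # keyword appears near the top
        points += 5.0
    points += body_words.count(kw)      # one point per repetition
    return points


def rank(keyword, pages):
    """Sort (title, body) pairs from highest score to lowest."""
    return sorted(pages, key=lambda p: score(keyword, p[0], p[1]), reverse=True)
```

Calling rank("census", pages) on a list of (title, body) pairs returns the same pairs with the highest-scoring page first. A scheme like this also shows why the top 10 can disappoint: a page that merely repeats the word often can outscore the page you actually want.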
The list displays each document's title and the first snippets of text, along with the URL (Uniform Resource Locator or Internet address). Those listed as the top 10 all had "census" in the title and also in the initial text; but only one, No. 6 - an obscure page from the U.S. Census Bureau's site - was close to what was sought.
That was a dumb search. How was the database to know what I really wanted? Trying again with two words, census bureau, I got a mere 43,075 hits, and that page from the bureau's site had been promoted to hit No. 2. Then I read HotBot's help file and found instructions for lots of features that make it easier to focus a search.
Putting quotation marks around my two words narrowed the results to 19,057 hits, as the engine looked for documents with those words next to each other. There's a pull-down menu that lets you further tighten a search.
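The difference between the two searches can be sketched in a few lines of Python. This is only an illustration of the general idea, assuming a plain two-word query is treated as "both words appear somewhere" and a quoted query as "the words appear side by side, in order"; the function names are made up for the example and say nothing about how HotBot is built.

```python
import re


def tokenize(text):
    """Lowercase the text and split it into alphanumeric words."""
    return re.findall(r"[a-z0-9]+", text.lower())


def contains_all_words(query, text):
    """Unquoted search: every query word appears somewhere in the text."""
    words = set(tokenize(text))
    return all(w in words for w in tokenize(query))


def contains_phrase(phrase, text):
    """Quoted search: the query words appear next to each other, in order."""
    target = tokenize(phrase)
    words = tokenize(text)
    n = len(target)
    if n == 0:
        return False
    # Slide a window of n words across the text and compare.
    return any(words[i:i + n] == target for i in range(len(words) - n + 1))
```

A page reading "The bureau ran a census" passes the first test for census bureau but fails the second, which is why the quoted search returns fewer hits.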
There also is HotBot's "expert" query option for more precise searching, with the ability to restrict searches by date, domain name and media types like Java, audio or virtual reality files. …
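Restrictions by date and domain amount to filters applied on top of the keyword match. The sketch below shows one way that could look in Python; the page record (a dict with "url" and "modified" keys), the parameter names and the function itself are hypothetical, invented for this example rather than drawn from HotBot.

```python
from datetime import date
from urllib.parse import urlparse


def matches_filters(page, domain_suffix=None, modified_after=None):
    """Return True if a page record passes the optional filters.

    page is a hypothetical record: {"url": str, "modified": date}.
    """
    host = urlparse(page["url"]).hostname or ""

    if domain_suffix:
        suffix = domain_suffix.lstrip(".").lower()
        # Accept an exact host match or any host ending in ".suffix".
        if not (host == suffix or host.endswith("." + suffix)):
            return False

    # Drop pages last modified before the cutoff date.
    if modified_after and page["modified"] < modified_after:
        return False

    return True
```

For instance, a record for http://www.census.gov/index.html would pass with domain_suffix="gov" and be dropped with domain_suffix="com".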