Magazine article Information Today

Under Advisement

This month, we're turning the spotlight on The Information Advisor (IA) for the first time. Published monthly and edited by Robert Berkman, IA also has a four-page quarterly supplement called Enterprise 2.0: Managing Knowledge in Your Organization (March, June, September, and December). Since we're looking at the December issue of IA, you'll get a peek at the supplemental material too.

Improved Visibility

The front-page story was the second half of a two-parter written by Berkman called "The Invisible Web Getting a Little More Visible." The first installment identified substantive searchable government databases that provide information that is not so easily retrievable from a search engine. In Part 2, Berkman takes a different tack, initially focusing on what the terms "invisible web" and "deep web" have come to mean in the current search realm. Actually, he says that the original definition (parts of the web that are "UN-retrievable from standard search engines") remains the best one. He then looks at what content is still difficult to find using a search engine, sharing sources and strategies that help to uncover hidden sites.


The accompanying table in the article identifies 14 of those sources/strategies, from INFOMINE to Trove, Intute, and Google Scholar. Berkman explains what each source is (directory, search engine, scholarly/citation search, federated search, or digital guide) and offers a brief description of its coverage/scope. For example, IncyWincy is a search engine that covers "hundreds of thousands of search engines and 200 million webpages." (If that's "IncyWincy," I can't imagine what a "big" search engine would cover.)

Berkman provides insight into what is no longer invisible. The most significant of the now-findable content includes non-HTML files such as PDFs, PowerPoints, and Word documents. "The ability to index these files has been an enormous boon for searchers," according to Berkman. "Not only has it made more of the web findable, but limiting a search to these files is often an excellent strategy for retrieving pages that are more likely to be substantive and data-rich." Other content that is now retrievable includes academic/scholarly material, dynamically generated content pages, images such as scanned books and patents, news archives, and real-time information.

Though many of these areas on the web are now findable, much of the web remains invisible. …
