So Much Information, So Little Time: Evaluating Web Resources with Search Engines

The Internet is one of the youngest and fastest-growing media in the world. It is still expanding at a rate of about 7.3 million pages per day, doubling in size every eight months (Murray and Moore 2000), which suggests that it has not yet reached its peak period of expansion. Buried in this vast, quickly growing collection of documents lies information of interest and use to almost everyone. The trick is finding it. An abundance of search engine tools can be used to retrieve information from the World Wide Web. Search Engine Watch (2001) reports that more than 75 search engine tools are available and provide links to many relevant sources. Many tutorials are also available online, providing details on search engine tools and guidelines for effective searches.

Students use the Web as a source of information on both educational and personal topics. To conduct an effective search, students must understand how the various search engines are structured. Not all search engines are created equal. Many do not consistently return the right information, and they often subject the user to a flood of disjointed and irrelevant data. Without systematic instruction in information literacy, students cannot realize the potential of the Web. Students who know both effective search strategies and the criteria for evaluating search engine results can enhance their current studies and, more importantly, continue to update their knowledge through the relevant and useful resources readily available on the Web.

The Internet is a valuable resource, but it should be used with caution. The motto "An educated consumer is the best customer" is aptly applied to the Internet. To move toward the goal of creating "educated information consumers," we developed a hands-on exercise that was used in an introductory management information systems (MIS) course. The remainder of this article describes the exercise, including an introduction to information retrieval and a discussion about our experience. The lesson in information retrieval focuses on answering the following three questions:

1. Do all search engines find the same information?

2. How can we judge the retrieval effectiveness of these results?

3. Why do we get different results using different search engines at the same time, or the same search engine at different times?

Information Retrieval Concepts

We introduce students to the concept of retrieval effectiveness through a classroom demonstration. We ask three students to each suggest a different search engine, and we then run the same search on each one. For example, to find information on mobile commerce in healthcare management, we search on the terms "mobile commerce" and "healthcare management." We ask students to note the number of sites each engine returns. Comparing these numbers reveals large variations in how search engines retrieve sites. Preliminary results were: Google returned 19 sites; LookSmart returned five directory topics and 2,000 sites; and Metacrawler returned three sites. Clearly, the number of sites returned varies substantially. The variation arises because the engines themselves differ and because each belongs to a different category of search tool.

We then introduce students to the concepts of recall and precision, two of the most widely used measures of information retrieval effectiveness. Recall measures how well an engine retrieves all the relevant documents, whereas precision measures how well the system retrieves only the relevant documents (Blair and Maron 1985). For the purposes of this exercise, a site is relevant if the user who initiated the search deems it relevant; otherwise, the page is noted as irrelevant.
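To make the two measures concrete, the short Python sketch below computes them for a single query. It is an illustration only, not part of the classroom exercise: the function name `precision_recall` and the sample URLs are hypothetical, and it assumes the complete set of relevant pages is known. On the open Web that set is rarely known, so recall can usually only be estimated.

```python
def precision_recall(retrieved, relevant):
    """Return (precision, recall) for one query.

    retrieved: pages the search engine returned
    relevant:  pages the user judged relevant (assumed fully known here)
    """
    retrieved = set(retrieved)
    relevant = set(relevant)
    hits = retrieved & relevant  # relevant pages that were actually retrieved

    # Precision: share of the retrieved pages that are relevant.
    precision = len(hits) / len(retrieved) if retrieved else 0.0
    # Recall: share of the relevant pages that were retrieved.
    recall = len(hits) / len(relevant) if relevant else 0.0
    return precision, recall


# Hypothetical data: the engine returns four pages, two of which are
# among the five pages the user considers relevant.
retrieved_pages = ["site-a", "site-b", "site-c", "site-d"]
relevant_pages = ["site-a", "site-c", "site-e", "site-f", "site-g"]

precision, recall = precision_recall(retrieved_pages, relevant_pages)
print(f"precision = {precision:.2f}, recall = {recall:.2f}")
# prints: precision = 0.50, recall = 0.40
```

In this made-up case half of what the engine returned is useful (precision 0.50), but it found only two of the five relevant pages (recall 0.40). An engine that returns many sites may score well on recall while scoring poorly on precision, and the reverse holds for an engine that returns very few.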

Figure 1, above, graphically represents the concepts of recall and precision. …