Testing the Accuracy of Information on the World Wide Web Using the AltaVista Search Engine

Article excerpt

In this study we examine the accuracy of the World Wide Web for answering general ready-reference questions, using the search engine AltaVista. We gathered ready-reference questions over a two-week period and then searched the Web for answers. We assigned an accuracy value to each Web site retrieved, based on how well it met the required answer criteria. The average number of duplicate sites retrieved per search was 4.02. The percentage of dead links returned was 12.4 percent. On average, a search was reworded and reentered 1.77 times because the first attempt retrieved no hits. Lastly, if an answer was found at all, it was more likely to be correct or mostly correct than wrong or mostly wrong.
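The per-search measures reported above can be illustrated with a minimal sketch. The record layout, the field names, and the sample numbers below are assumptions made purely for illustration; they are not the study's data or its actual procedure. The sketch also assumes that the dead-link percentage is pooled over all links returned, which the abstract does not specify.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class SearchRecord:
    """One ready-reference question searched on the Web (hypothetical layout)."""
    links_returned: int   # links examined for this question
    dead_links: int       # links that no longer resolved
    duplicate_sites: int  # sites that appeared more than once in the results
    rewordings: int       # times the query was reworded and reentered


def summarize(records):
    """Compute the three summary figures of the kind reported in the abstract."""
    total_links = sum(r.links_returned for r in records)
    total_dead = sum(r.dead_links for r in records)
    return {
        "avg_duplicates_per_search": mean(r.duplicate_sites for r in records),
        # Assumption: percentage pooled over all links, not averaged per search.
        "dead_link_percent": 100.0 * total_dead / total_links if total_links else 0.0,
        "avg_rewordings_per_search": mean(r.rewordings for r in records),
    }


# Made-up example data, NOT the study's results.
sample = [
    SearchRecord(links_returned=10, dead_links=1, duplicate_sites=4, rewordings=2),
    SearchRecord(links_returned=10, dead_links=2, duplicate_sites=3, rewordings=1),
]

print(summarize(sample))
```

Pooling the dead links over all results weights each link equally; averaging a per-search percentage instead would weight each question equally, and the two can differ when searches return different numbers of links.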

We live in the "Information Age," in which words such as Internet, World Wide Web, and Information Superhighway are heard daily, whether in conversation or via the media. Even so, many people are unsure what these terms mean, much less how they could benefit from the technologies the terms describe. However, a growing number of individuals are realizing the need to learn the skills necessary to use the Web and the Internet, not only to advance in their jobs, but also to function in tomorrow's world.

While the two terms have become virtually interchangeable in our vocabulary, the Web and the Internet are distinct entities with different functions. The Internet is a "vast, worldwide network of computer systems that enables users to communicate with one another via electronic mail, via Telnet (a process that permits users to log-in to a remote computer), and via the File Transfer Protocol (also called FTP, which allows users to transfer information on a remote host to their local computer)."(1) The Internet is command-driven and does not lend itself to graphical displays, such as photographs, illustrations, and charts.

On the other hand, the World Wide Web (also known as WWW or the Web) is an "interlinked collection of documents or Web sites that reside on a server computer."(2) It is built around a hypertext system that provides links to other Web sites, which can be accessed randomly by clicking on the selection with the mouse.(3) In other words, the Web has been designed to be intuitive and graphical, thereby making the Internet less complicated to use. Although many people have heard of them only recently, the concepts for the Internet and the Web are not recent phenomena.

The Internet originated in the 1960s as a decentralized governmental tool that could be used to reroute messages in the event of a nuclear attack.(4) Research on the Internet continued, and in 1989 the idea of the World Wide Web was conceived at the CERN Research Center in Geneva.(5) The scientists in Switzerland implemented their ideas by developing the technology needed to make the hypertext system possible. Since then, both the Internet and the Web have evolved into informational tools that can be accessed by anyone, virtually anywhere in the world. Not only can anyone access them, but any person, organization, or business with the proper computer hardware and software can also publish material on them, regardless of content. As a result, the validity and accuracy of information being disseminated through the Internet and the Web have been questioned. Librarians in particular have speculated about the accuracy of this information. Healey noted:

   The veracity, accuracy, and objectivity of materials is a classic problem
   for libraries, but the expense of publishing, combined with an extensive
   reviewing system and the relatively "fixed" nature of printed materials all
   help librarians to find quality materials, and avoid shoddy, biased, or
   misleading works. With the Web (and with the Internet as a whole), this
   system is not yet in place. Precisely because it is so easy to publish a
   document on the Web, and so easy to change a document that is already
   published, librarians will have to evaluate Web resources to assure the
   quality of the information provided. …