Evaluation of Six Google Search Features
Zhang, Jin; Fei, Wei; Le, Taowen. Online
Knowing the retrieval effectiveness of advanced search features on Google is imperative for information professionals because Google is the default search engine for most people. It runs more than 1 million servers in its data centers around the world to host a huge amount of information (www.pandia.com/sew/481-gartaer.html). Google processed more than 1 billion search requests and 20 petabytes of user-generated data daily in 2009 (http://politicalticker.blogs.cnn.com/2009/12/18/google-unveils-top-politicalsearches-of-2009).
Although Google provides a wide spectrum of search services, such as web search, image search, video search, news search, book search, Google Scholar, product search, and Google Maps, web search remains one of the most used. Google also provides searchers with an array of rich search features; you can search Google by regular means (entering words into the search box) or use advanced search features. However, how well Google's search results respond to these advanced features remains an open question.
We investigated the retrieval effectiveness of six search features of Google - title search, regular search, exact phrase search, PDF search, anchor restriction search, and URL restriction search. We tested search performance by running 120 queries, 20 per search feature, spread equally among four broad subject domains - medicine and health, culture and education, information and technology, and business and economy.
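The design described above can be laid out as a simple grid. The following is a minimal Python sketch, not the authors' actual procedure: the article does not state which query syntax was used for each feature, so the operator strings below (allintitle:, filetype:pdf, allinanchor:, allinurl:, and quotation marks) are assumptions based on commonly documented Google operators, and the names design_cells and build_query are hypothetical.

```python
from itertools import product

# The six features and four subject domains named in the article.
FEATURES = ["title", "regular", "exact phrase", "pdf", "anchor", "url"]
DOMAINS = [
    "medicine and health",
    "culture and education",
    "information and technology",
    "business and economy",
]
QUERIES_PER_FEATURE = 20
# 20 queries per feature spread equally over 4 domains gives 5 per cell.
QUERIES_PER_CELL = QUERIES_PER_FEATURE // len(DOMAINS)

# ASSUMPTION: this operator syntax is illustrative only; the article does
# not say which syntax or advanced-search form the authors used.
OPERATOR_TEMPLATES = {
    "title": "allintitle:{terms}",
    "regular": "{terms}",
    "exact phrase": '"{terms}"',
    "pdf": "{terms} filetype:pdf",
    "anchor": "allinanchor:{terms}",
    "url": "allinurl:{terms}",
}


def design_cells():
    """Return one (feature, domain, query_number) tuple per planned query."""
    return [
        (feature, domain, n)
        for feature, domain in product(FEATURES, DOMAINS)
        for n in range(1, QUERIES_PER_CELL + 1)
    ]


def build_query(feature, terms):
    """Format search terms for a feature using the assumed operator syntax."""
    return OPERATOR_TEMPLATES[feature].format(terms=terms)


if __name__ == "__main__":
    print(len(design_cells()))                    # 120
    print(build_query("pdf", "diabetes diet"))    # diabetes diet filetype:pdf
```

The grid makes the arithmetic explicit: 6 features x 4 domains x 5 queries per cell yields the 120 queries, with 20 per feature.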
Although we hypothesized that there would be no significant difference in retrieval effectiveness among the six search features, we found significant differences. PDF search outperforms the others, achieving the best performance in this study. Title search surpasses all but PDF search. Regular search, URL restriction search, and anchor restriction search rank third, fourth, and fifth, respectively, and exact phrase search shows the poorest performance in this study.
STUDYING ADVANCED SEARCH
Various studies have analyzed the search features of search engines. One study selected two advanced search features (exact title search and URL search) to examine the impact of nontopical terms and semitopical terms on query expansion. The study revealed that search results would improve if queries were restricted to the exact title search or URL search.1 In another study, contrary to expectation, Boolean search strategies had a negative effect on retrieval effectiveness.2
Still another study indicated that most query operators, such as AND, OR, MUST APPEAR (+), or PHRASE (" "), were not used by the majority of searchers and had no significant effect on coverage, relative precision, or search result ranking.3 A study of the impact of advanced search features on retrieval effectiveness showed that the PDF format restriction search achieved the best retrieval performance across Yahoo!, Google, and Live (now called Bing). The regular search achieved the best webpage ranking performance across Yahoo!, Google, and Live.4
How effective are Google searches using advanced search features? We used search features as the independent variable and retrieval effectiveness as the dependent variable. Our study has both theoretical and practical importance. By providing insight into the impact of different search features on retrieval effectiveness, the findings of this study can not only help end users select appropriate search features to improve search effectiveness but also assist search engine developers in optimizing search features.
METHOD AND RESULTS
We selected the following six search features from the Google search engine:
1. Title search - Search terms are matched only against webpage titles.
2. PDF search - The retrieved webpages' file format is PDF.
3. Exact phrase search - Phrases in a query are exactly matched with phrases in retrieved webpages.
4. Anchor restriction search - Search terms appear in a webpage's anchor text. …