Informing through User-Centered Exploratory Search and Human-Computer Interaction Strategies

By Petratos, Panagiotis | Issues in Informing Science & Information Technology, Annual 2008



Introduction

Information Retrieval is a well-defined discipline with solid foundations in mathematics and other sciences. Its tools are developed from mathematical equations and scientific methods drawn from analysis, trigonometry, geometry, statistics and probability. Mathematics is not the only contributor: Computer Science, Information Science and Library Science also contribute elements to Information Retrieval. These tools are typically developed in conjunction with very powerful computers, such as clusters, and very large databases of corpora. Furthermore, with the proliferation of the Web, the scope of Information Retrieval has broadened to address ubiquitous sources of information, mobile computing devices and users, as well as multiple formats of information beyond documents, e.g. biological, musical, visual and various technological formats such as XML.

The basic aim of text retrieval, a more traditional sub-field of Information Retrieval, is to match user queries to documents. In its purest form Information Retrieval serves a higher level objective, a more general aim, to assist the searcher in locating the information she seeks.

This general aim can be served by roughly dividing the work into two substantial tasks. The first task is assisting the searcher to express her information need in the most lucid way, clearly understandable for both the human user as well as the machine computer. The second task is to match the searcher's clearly defined query to the most relevant information available. The second task is machine related and typically involves matching and retrieval algorithms resident in the information retrieval system internals.

The first task is user related and typically involves human computer interaction strategies to enhance the information retrieval process. From this first user centered task a new research field has sprung called Exploratory Search (White, Drucker, Marchionini, Hearst, & Schraefel, 2007). The definition of Exploratory Search is elucidated by the IR methods which synthesize human computer interaction strategies to elicit and illuminate user search requests, semantic meanings, preferences, explicit and implicit relevance feedback to enhance information retrieval search quality.

Informing Science is an emerging trans-discipline which transcends a large variety of fields, from computer science, engineering, information systems, library science, social work, technology, communications, design, journalism in all its forms, to education. From a teleological point of view Informing Science researchers gracefully utilize information technology with epistemologies drawn from all the aforementioned fields in order to best inform their clients (Cohen, 1999).

Also from a teleological point of view, one facet of Informing Science, the process of elucidating the best methods of informing inquiring clientele, is served by user-centered exploratory search and human-computer interaction strategies (Petratos, 2007).

Herein a comparative study is presented of three leading IR systems: Google, Yahoo and Live Search. A team of human subjects is selected according to diverse and balanced criteria. The human factors method presented herein serves the enhancement of IR search quality by providing a gold standard. A collective of humans and computers is designed to serve as a cooperating framework for the IR experiments. The tasks that are better suited to humans are assigned to the participants and the tasks that can be automated are assigned to the machines. A series of IR experiments is conducted to investigate whether there is overlap, exact as well as partial, among the selected IR systems, how it can be quantified, how it is distributed and also how search quality can be enhanced. The ensuing IR statistical results show that overlap exists among the three selected IR systems and demonstrate the comparative performance of these IR systems.

Exploratory Search Areas

In this segment the research directions followed in the new field of Exploratory Search are described. In synopsis the areas of Exploratory Search are the following.

Web Retrieval, Exploratory Search Interfaces, Implicit and Explicit Relevance Feedback, Faceted Search Interfaces, Directed Search.

These areas of Exploratory Search are all connected with user-centered design and user-system interactions, and are all related to modern aspects of information technology and Informing Science. User-centered design and user-centered activities, as well as web retrieval, are the central themes which run through all other Exploratory Search areas (White, Muresan, & Marchionini, 2006).

Web Retrieval

The web is becoming an increasingly important area of interest for information retrieval researchers due to a plethora of challenges it presents. A few of the most important challenges include the dynamic nature of the web, the gradually increasing and diverse content of the web, the various heterogeneous technological formats of information on the web, as well as the progressively rising number of users on the web (Petratos, 2006). Consequently these challenges provide a fertile ground for new approaches to the Information Retrieval process. For example, as the amount of information is increasing within a specific topic there is an increased need for clarification of the search instructions and guidance especially to inexperienced users in order to best utilize all the available search capabilities of the Information Retrieval System (Rodden, Ruthven, & White, 2007). In addition, inexperienced users will appreciate an intuitive interface which presents in separate rows a synopsis of the text as well as all the images along with their captions found in a document in a convenient standard-sized thumbnails array, see Figure 1. Also, even the experienced users will appreciate the more precise guidance and control provided by organizing all the available information into easy to understand categories, see Figure 3. Categories allow a presentation of a birds' eye view of the information in an easy to view, organized and tidy arrangement of top level hierarchies giving the freedom to the user to drill down in a desired hierarchy reaching the contained document synopses and arrays of standard-sized small icons of all the images and their captions. Hence, the focus of Information Retrieval researchers is increasingly concentrated on finding new methods for enhancing Web Retrieval.

Exploratory Search Interfaces

Exploratory Search Interfaces are well suited for users who frequently embark on web search exploration. Experienced users may embark on web search exploration for new knowledge.

[FIGURE 1 OMITTED]

For instance, a good analogy in the traditional paper world is a scenario in which a book reader browses through a volume of an encyclopedia to discover new knowledge on a specific topic which falls under a broader conceptual area.

An illustrative example is a museum sight-seeing tour in which the user searches the museum's collections of works of art (the broader conceptual area) for previously unseen paintings with Greek mythology themes by artists of British origin.

Exploratory Search Interfaces are also an ideal match for the inexperienced user who often does not know what to look for and requires guidance during her exploration for new information. For instance, a user may seek to find out whether a specific symptom is associated with a condition, what other known symptoms are related to this condition, and what therapies, if any, are available.

[FIGURE 2 OMITTED]

An illustrative example is a scenario of a user who seeks for blurriness of the vision and the Information Retrieval System also retrieves a set of associated key-phrases found in related documents of the answer set (Figure 2).

The associated key-phrases may include other symptoms which may be related to the initial user query, such as diplopia (double vision), speech ataxia (problems with the organization and synchronization of speech) and movement ataxia (loss of coordination).

The previously unseen symptoms are presented to the user. If the user who possibly may have experienced one or more of the previously unseen symptoms selects and includes them in a new search the information retrieval results may improve and possibly present to the user an associated condition.

Although this search exploration may produce useful information, truly there is no automated, computerized panacea to replace the expert diagnosis provided by an experienced Medical Doctor. This simple information retrieval paradigm should only be taken as an illustrative example of a preliminary first step to inform the client.

Implicit and Explicit Relevance Feedback

Information Retrieval has been receiving the benefits from relevance feedback innovations for more than three and a half decades (Rocchio, 1971). Early relevance feedback methods have been relying on explicit responses from users in order to simply perform query expansion by including additional search terms to the initially issued query.

As information technology progressed more powerful computer systems became economically viable. These new more powerful computer systems gradually allowed more sophisticated schemes to be developed for eliciting relevance feedback from users.

For instance, users were put in a position of selecting multiple states by clicking on check boxes, list boxes, or data grids, selecting and marking sentences or paragraphs, reading and selecting synopses of documents, answering detailed questions about user preferences in order to create and save the individual profile of each user, etc. All these and more relevance feedback methods are listed under the Explicit Feedback category in Table 1.

More recently a new relevance feedback trend of more implicit, unobtrusive, inconspicuous and even stealth techniques is emerging. Under this new trend relevance feedback is not elicited in an explicit fashion by directly engaging the user in an activity which will take her away from her normal searching routine (Kelly & Teevan, 2003).

Instead the user is closely monitored by the system which tracks specific user activities in a covert manner in order to rapidly analyze the collected data and reveal what are the likely relevant documents from observing her normal search behavior (White, Ruthven, & Jose, 2002).

For instance, users are monitored to record the time they take to read text, view a video, image or other non-readable object, listen to audio excerpt of a book or other acoustic file, record the mouse clicks, scrolling, and keystrokes on the keyboard, etc. (Kelly & Belkin, 2001). All these and more relevance feedback methods are listed under the Implicit Feedback category in Table 1.

A new research direction that is currently explored by Information Retrieval researchers is to deduce what exactly the user is seeing on the computer in front of her by tracking her eye movement (Salojarvi, Puolamaki, & Kaski, 2005). For instance a user could be reading a specific text segment of the page displayed on the screen and hence that text segment would carry more weight than the unseen text.

Eye tracking may also be combined with mouse clicks in order to detect associations between the two user activities, which could be used as combined implicit feedback signals (Joachims, Granka, Pan, Hembrooke, & Gay, 2005).

Even in the case where there is a weak association between the two user activities, the click streams are flowing constantly and in the long term may yield more useful information if they are tracked and logged over a longer period of time. However, the technical difficulties with long term eye tracking are considerable as they require specialized dedicated hardware.

Faceted Search Interfaces

As the number of authors and users on the Internet increases, the quantity of available online information grows with it. As the amount of available online information rises, there is an increased need for a clear, succinct presentation of information to the user.

The total cost of ownership of a traditional computer information system is significantly reduced if the user interface is well designed and hence the users are more contented and more productive. Popular computer applications such as spreadsheets and document editors are frequently used by inexperienced users who benefit especially from intuitive and easy to understand graphical user interfaces.

Overall both experienced and inexperienced users benefit from clear, lucid, and easy to understand graphical user interfaces. In addition, the particular idiosyncrasies of online information retrieval systems mentioned above create an increased need for clarity and unambiguous presentation of information. Hence, the essential characteristics of online information retrieval systems call for a user interface design which is lucid, ergonomic and laconic.

[FIGURE 3 OMITTED]

Facets are conceptual categories, which are created to organize the presentation of all the available data from a large database into an easy to view concise set of conceptual groups. There are two types of facets, flat and hierarchical. Hierarchical facets contain multiple levels of items organized in sub-categories, whereas flat facets contain only a single level of items.

For example, in Figure 3 the Facets are Gender, Country, Affiliation, Prize, and Year (Hearst, 2006). Under the Facet Gender there are two sub-categories, male and female. Under the Facet Prize there are six sub-categories: chemistry, economics, literature, medicine, peace and physics. Faceted search interfaces allow the user fluid, flexible navigation and easy understanding, while she maintains control of the search.
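The way a faceted interface narrows a result set can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the records, the field names and the helper functions `facet_counts` and `refine` are hypothetical and are not taken from Figure 3 or from any particular system.

```python
from collections import Counter

# Hypothetical records; names and values are placeholders, not real data.
records = [
    {"name": "Laureate A", "gender": "female", "prize": "physics"},
    {"name": "Laureate B", "gender": "male", "prize": "chemistry"},
    {"name": "Laureate C", "gender": "female", "prize": "chemistry"},
    {"name": "Laureate D", "gender": "male", "prize": "peace"},
]


def facet_counts(items, facet):
    """Count how many items fall under each sub-category of a flat facet."""
    return Counter(item[facet] for item in items)


def refine(items, **selected):
    """Keep only the items that match every selected facet value."""
    return [
        item for item in items
        if all(item.get(facet) == value for facet, value in selected.items())
    ]


gender_counts = facet_counts(records, "gender")
chosen = refine(records, gender="female", prize="chemistry")
print(dict(gender_counts))
print([item["name"] for item in chosen])
```

The counts are what an interface would typically show next to each sub-category, and each click on a sub-category corresponds to one more keyword argument passed to `refine`. A hierarchical facet could be modeled in the same spirit with nested categories, which this sketch does not attempt.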

Directed Search

Directed search is a search where the user employs the assistance of an information retrieval system because she desires to find out more specific or detailed information within a more general subject.

[FIGURE 4 OMITTED]

An illustrative example is a scenario of an inexperienced user who seeks to find out more specific detailed information on something about multiple sclerosis (Figure 4). The information retrieval system accepts the user query and provides guidance to the user. The guidance is in the form of selected associated sub-topics which are presented as hyperlinks to the user.

According to which sub-topic the user will select the information retrieval system presents her a different answer set. Hence, if the user clicks on symptoms Answer Set A is shown, if she clicks on diagnosis, tests Answer Set B is displayed, if she clicks on causes, risk factors Answer Set C is portrayed and if she clicks on treatment Answer Set D is presented.

Exploratory Search Experimentation

The traditional evaluation methods for information retrieval systems are Precision and Recall which are very useful for understanding the effectiveness of information retrieval systems.

Precision = (Relevant Retrieved) / (Retrieved)

Recall = (Relevant Retrieved) / (Relevant)
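As a minimal sketch of the two formulas above, the following Python function computes both measures for a single query. The document identifiers are hypothetical, and returning 0.0 for an empty set is a convention assumed here rather than something the article specifies.

```python
def precision_recall(retrieved, relevant):
    """Return (precision, recall) for one query.

    retrieved: documents returned by the system
    relevant:  documents judged relevant (the gold standard)
    """
    retrieved = set(retrieved)
    relevant = set(relevant)
    relevant_retrieved = retrieved & relevant
    # Assumed convention: an empty denominator yields 0.0.
    precision = len(relevant_retrieved) / len(retrieved) if retrieved else 0.0
    recall = len(relevant_retrieved) / len(relevant) if relevant else 0.0
    return precision, recall


# Hypothetical identifiers: 4 documents retrieved, 5 relevant, 2 in common.
retrieved_docs = ["d1", "d2", "d3", "d4"]
relevant_docs = ["d2", "d4", "d7", "d8", "d9"]
precision, recall = precision_recall(retrieved_docs, relevant_docs)
print(precision, recall)
```

Here two of the four retrieved documents are relevant, so precision is 2/4, and two of the five relevant documents were retrieved, so recall is 2/5.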

Precision and Recall have been studied in depth and are well established and well documented methods of information retrieval evaluation. In addition to these evaluation methods, with the proliferation of online information retrieval systems, also known as search engines, more tools may be useful to understand efficiency, redundancy of information and avoid duplication of retrieval results.

Statistical Data Analysis

An experimental framework can be designed which supports the exploratory search paradigm. The experimental framework is designed to allow user participation, human computer interaction, whilst including three of the leading commercial search engines, Google, Yahoo and Live Search.

The experimental framework allows the participation of human subjects in information retrieval sessions with enhanced human computer interaction which allows them to provide explicit relevance feedback to the system. The users are researchers from the California State University, Stanislaus and have been selected in a diverse and balanced approach to capture a sample uniform representation of their information retrieval preferences and responses (Table 2).

In a recently published article in MIT Technology Review entitled "The evolution of web search", Norvig, the director of research at Google, outlines the same method, which is utilized for the enhancement of IR accuracy and search quality assurance (Norvig, 2008). Specific queries are selected randomly whilst selected users are employed to examine and evaluate how good the Google IR results are. The users are external contractors who are employed to examine the Google IR results and offer their judgments, which are recorded for comparison purposes as a gold standard.

The first part of the information retrieval experiment called for users to run a specific query against all three selected search engines and compare the results to gain a better understanding of the associated overlap. The query was Q1="Shakespeare's metaphor theme" and the tasks to be executed included identifying the exact as well as the partial matches of the documents returned in the corresponding answer sets.

An exact match is defined herein as the identical document found at the same location, indicated by the same Internet address written as the absolute path of the Uniform Resource Locator. A partial match is defined herein as a similar document located at a similar Internet address, which includes at least the same domain name and may have a different relative path to the document. The granularity of how much the address and the document differ is beyond the scope of this study and can be the subject of future research, for example by including fuzzy indicator controls to attribute a similarity percentage to documents (70%, 80%, 90%). Each answer set processed contained one hundred documents, for a total of 300 documents. The overlap in the corresponding answer sets of various search engines may be made easily detectable by presenting to the user an easy to understand exploratory search visual comparison.
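The distinction between exact and partial matches can be approximated by comparing URLs, as in the sketch below. It rests on several assumptions that the article does not state: the scheme (http or https) is ignored, host names are compared case-insensitively, a trailing slash is ignored, and a partial match is decided on the domain name alone without comparing document content. The function name `classify_match` and the example URLs are hypothetical.

```python
from urllib.parse import urlsplit


def _parts(url):
    """Split a URL into (host, path, query) for comparison."""
    pieces = urlsplit(url)
    host = (pieces.hostname or "").lower()
    path = pieces.path.rstrip("/")
    return host, path, pieces.query


def classify_match(url_a, url_b):
    """Return 'exact', 'partial' or 'none' for a pair of result URLs."""
    host_a, path_a, query_a = _parts(url_a)
    host_b, path_b, query_b = _parts(url_b)
    if not host_a or host_a != host_b:
        return "none"       # different domain names
    if (path_a, query_a) == (path_b, query_b):
        return "exact"      # same address
    return "partial"        # same domain name, different path


print(classify_match("http://example.org/plays/hamlet.html",
                     "http://example.org/plays/hamlet.html"))
print(classify_match("http://example.org/plays/hamlet.html",
                     "http://example.org/sonnets/18.html"))
print(classify_match("http://example.org/plays/hamlet.html",
                     "http://example.com/plays/hamlet.html"))
```

In the study the human participants made this judgment by inspecting the documents, so a URL comparison of this kind is at best a first automated filter.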

[FIGURE 5 OMITTED]

In Figure 5 the user views a small sample subset of the data from the first part of the information retrieval experiment in order to provide a visual proof of concept whilst averting feeling overwhelmed by the visual information overload of the very large ensuing data sets. The user selects three search engines and issues her query. The results are grouped by color for easy visual comparison. The green dots represent results by Google, Search Engine A, the blue dots represent results by Yahoo, Search Engine B and the magenta dots represent results by Live Search, Search Engine C. The overlap of the results in the answer sets is clearly identifiable in the diagram, green-blue dots correspond to overlap A-B, Google-Yahoo, green-magenta dots correspond to overlap A-C, Google-Live Search and blue-magenta dots correspond to overlap B-C, Yahoo-Live Search. The overall overlap is clearly identifiable by the ellipse in the middle which is formed by blue-magenta-green dots.

The next part of the IR experiment is to process the complete results which are comprised of three large data sets which contain a total of 300 documents. In Table 3 the associations of overlap are shown, G-Y are the Google-Yahoo co-locations, G-LS are the Google-Live Search co-locations, Y-LS are the Yahoo-Live Search co-locations.
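Counting the pairwise co-locations can be sketched with set intersections. The answer sets below are tiny hypothetical placeholders (the study used one hundred documents per engine), and the sketch counts exact address matches only.

```python
from itertools import combinations


def pairwise_overlap(answer_sets):
    """Map each engine pair, e.g. 'G-Y', to the URLs both engines returned."""
    overlaps = {}
    for (name_a, urls_a), (name_b, urls_b) in combinations(answer_sets.items(), 2):
        overlaps[f"{name_a}-{name_b}"] = set(urls_a) & set(urls_b)
    return overlaps


# Hypothetical answer sets keyed by the abbreviations used in the text.
answer_sets = {
    "G": ["u1", "u2", "u3", "u4"],
    "Y": ["u2", "u3", "u5"],
    "LS": ["u3", "u5", "u6"],
}

overlaps = pairwise_overlap(answer_sets)
# Documents returned by all three engines.
common_to_all = set.intersection(*(set(urls) for urls in answer_sets.values()))

for pair, urls in overlaps.items():
    print(pair, len(urls), sorted(urls))
print("all three", sorted(common_to_all))
```

The three keys produced are G-Y, G-LS and Y-LS, matching the pair labels above, and the final intersection corresponds to the overall overlap shown as the central ellipse in Figure 5.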

[FIGURE 6 OMITTED]

In Figure 6 all the exact Q1 overlap results are shown. Yahoo and Live Search exhibit a higher number of exact matches in the early IR stages of 1-20 documents whilst Google rises much slower, then accelerates and intersects Yahoo at the later IR stages of 60-80 documents. Live Search exhibits the largest number of exact matches overall compared to the other two IR systems.

In Table 4 the overlap by IR system is shown. Live Search exhibits the largest number of exact matches followed by Google and then Yahoo. As far as partial matches are concerned Yahoo comes first followed by Google and Live Search which are tied at eight partial matches.

[FIGURE 7 OMITTED]

In Figure 7 all the partial Q1 overlap results are shown. Yahoo and Live Search exhibit a higher number of partial matches in the early IR stages of 1-20 documents whilst Google rises much slower, then accelerates and intersects Live Search and Yahoo in the subsequent IR stages of 20-30 documents. Yahoo accelerates and exceeds the other two in the subsequent IR stages of 30-40 documents. Yahoo exhibits the largest number of partial matches overall compared to the other two IR systems.

Naturally when human subjects are involved in IR experimentation in order to examine, provide feedback or evaluate exploratory search systems the investigator should also consider the associated cost which includes the required time and user effort to conduct the experiments (Keskustalo, Jarvelin, & Pirkola, 2006). In Table 5 some of the possible indicators of cost evaluation of user experimentation are listed. User effort is a cost which may represent impediment to the experiments while user willingness is an advantage which may represent progress for the experiments.

The next part of the IR experiment called for a comparative relevance ranking by human subjects and IR systems. In Table 6 the top ten documents corresponding to query Q1 are ranked according to the expert and also according to Google, Yahoo and Live Search. Notice that there are no ties in the ranks. This no-ties case is simpler than the tie-corrected case which follows subsequently.

The Spearman rank correlation coefficient is computed and the individual correlations of the expert to Google, Yahoo and Live Search are listed in Table 7. The first is a strong positive correlation, the second is a positive correlation and the third is a weak positive correlation.
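For rankings without ties, the Spearman coefficient is commonly computed as rho = 1 - 6 * sum(d^2) / (n * (n^2 - 1)), where d is the difference between the two ranks of a document and n is the number of documents. The sketch below applies that formula to hypothetical ranks; the numbers are illustrative only and are not the values of Table 6 or Table 7.

```python
def spearman_no_ties(ranks_x, ranks_y):
    """Spearman rank correlation for two rankings that contain no ties."""
    n = len(ranks_x)
    if n != len(ranks_y) or n < 2:
        raise ValueError("need two rankings of equal length, n >= 2")
    sum_d_squared = sum((x - y) ** 2 for x, y in zip(ranks_x, ranks_y))
    return 1 - 6 * sum_d_squared / (n * (n * n - 1))


# Hypothetical ranks for ten documents: expert versus one search engine.
expert_ranks = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
engine_ranks = [2, 1, 3, 4, 6, 5, 7, 8, 10, 9]

rho = spearman_no_ties(expert_ranks, engine_ranks)
print(round(rho, 4))
```

Three adjacent pairs are swapped in this example, so the sum of squared differences is 6 and rho stays close to 1. Identical rankings give exactly 1 and fully reversed rankings give -1.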

[FIGURE 8 OMITTED]

The results are graphed and in Figure 8 we see a strong positive correlation between the two result sets of the expert and Google SE_A.

[FIGURE 9 OMITTED]

In Figure 9 we see a positive correlation between the two result sets of the expert and Yahoo SE_B. Notice that the data points now are more scattered than before.

[FIGURE 10 OMITTED]

Additionally, in Figure 10 we see a weak positive correlation between the two result sets of the expert and Live Search SE_C. Notice that the data points now are even more scattered than the two previous cases.

The next stage is to take into account the more complex tie-corrected case. Ties can arise under two conditions. The first condition is that multiple experts assign the same weight to two or more documents. The second condition is that the computed document weights which are used to estimate the document rankings coincide for two different documents, resulting in a tie of ranks. In Table 8 the cells colored blue, green and red correspond to tied ranks 9 and 10.

The Spearman tie-corrected rank correlation coefficient is computed and the individual correlations of the expert to Google, Yahoo and Live Search are listed in Table 9. The first is a strong positive correlation, the second is a positive correlation and the third is a weak positive correlation.
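One common way to handle ties is to give tied documents the mean of the ranks they would otherwise occupy and then compute the Pearson correlation of the two rank lists. The sketch below follows that approach as an assumption; the article does not state which tie-correction formula was used, and the scores and ranks are hypothetical rather than taken from Table 8 or Table 9.

```python
from math import sqrt


def average_ranks(scores):
    """Rank scores in descending order; tied scores share their mean rank."""
    order = sorted(range(len(scores)), key=lambda i: -scores[i])
    ranks = [0.0] * len(scores)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of documents tied with the one at position i.
        while j + 1 < len(order) and scores[order[j + 1]] == scores[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman_with_ties(ranks_x, ranks_y):
    """Pearson correlation of two rank lists, valid when ranks are tied."""
    n = len(ranks_x)
    mean_x = sum(ranks_x) / n
    mean_y = sum(ranks_y) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(ranks_x, ranks_y))
    var_x = sum((x - mean_x) ** 2 for x in ranks_x)
    var_y = sum((y - mean_y) ** 2 for y in ranks_y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant ranks")
    return cov / sqrt(var_x * var_y)


# Hypothetical case: the engine ties the documents the expert ranks 9 and 10.
expert_ranks = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
engine_ranks = [1, 2, 3, 4, 5, 6, 7, 8, 9.5, 9.5]
rho_tied = spearman_with_ties(expert_ranks, engine_ranks)
print(average_ranks([0.9, 0.8, 0.8, 0.5]), rho_tied)
```

With a single tie at ranks 9 and 10 the coefficient drops only slightly below 1, which is in line with the small changes between the uncorrected and tie-corrected correlations described in the text.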

Notice that the correlations have slightly changed now with the tie-correction compared to before without it, see Table 7. The tie-correction change is significant in the weak positive correlation.

[FIGURE 11 OMITTED]

The results are graphed and in Figure 11 we see a strong positive correlation between the two result sets of the expert and Google SE_A.

[FIGURE 12 OMITTED]

In Figure 12 we see a positive correlation between the two result sets of the expert and Yahoo SE_B. Notice that the data points now are more scattered than before.

[FIGURE 13 OMITTED]

Figure 13. Scatter plots for tie-corrected ranking results of Expert/Live Search SE_C.

Finally, in Figure 13 we see a weak positive correlation between the two result sets of the expert and Live Search SE_C. Notice that the data points now are even more scattered than the two previous cases.

In Table 10 the overlap of exact matches is listed. Yahoo and Live Search exhibit a higher number of exact matches in the early IR stages of 1-20 documents whilst Google rises much slower, then accelerates and intersects Yahoo at the later IR stages of 60-80 documents. Live Search exhibits the largest number of exact matches overall compared to the other two IR systems.

In Table 11 the overlap of partial matches is listed. Yahoo and Live Search exhibit a higher number of partial matches in the early IR stages of 1-20 documents whilst Google rises much slower, then accelerates and intersects Live Search and Yahoo in the subsequent IR stages of 20-30 documents. Yahoo accelerates and exceeds the other two in the subsequent IR stages of 30-40 documents. Yahoo exhibits the largest number of partial matches overall compared to the other two IR systems.

The purpose of this work is to offer new insights into the information retrieval process and illuminate the issues which affect the principal client for whom the information is intended.

Informing through exploratory search utilizes alternative information retrieval strategies which involve and engage the human user who is the primary client of informing. Various methods of eliciting and utilizing user relevance feedback have been presented along with evaluation methods more suitable for exploratory search interfaces.

In synopsis, the last five decades have been very fruitful for computing research generating great advances in the field of computing science. The current computer automation and computing power possible are orders of magnitude greater than what they were a few years ago. Still this work shows that human subjects can be very valuable especially for the enhancement of search quality of IR systems.

Conclusion

In conclusion, an IR experimental framework which supports the involvement and participation of human subjects in the IR experiments has been presented and tested. Human computer interaction plays an important role in the examination of the documents, as well as providing the gold standard for comparison of IR rankings.

The comparative statistical results given by the Spearman correlations for distinct as well as tie-corrected IR rankings, favor Google, whilst Yahoo and Live Search follow in that order. From the results of the IR experiments it is clear that overlap exists among leading IR systems.

In the case of exact overlap Yahoo and Live Search exhibit a higher number of exact matches in the early IR stages of 1-20 documents whilst Google rises much slower, then accelerates and intersects Yahoo at the later IR stages of 60-80 documents. Live Search exhibits the largest number of exact matches overall compared to the other two IR systems.

As far as partial overlap is concerned Yahoo and Live Search exhibit a higher number of partial matches in the early IR stages of 1-20 documents whilst Google rises much slower, then accelerates and intersects Live Search and Yahoo in the subsequent IR stages of 20-30 documents. Yahoo then accelerates and exceeds the other two in the IR stages of 30-40 documents. Yahoo exhibits the largest number of partial matches overall compared to the other two IR systems.

Future trends in these areas include work on similarities of corpora, e.g. web sites, and the effects of human factors involvement on multiple established IR similarity statistics such as the cosine, the overlap, Dice and Jaccard. Furthermore the topic-focused approach can be applied to multiple genres, themes, publications and authors, as well as multiple information sources, e.g. literary works, email, blogs, scientific journals, etc.
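For readers unfamiliar with the four similarity statistics named above, the following sketch gives their usual set-based definitions, treating each document or answer set as a set of items. The term sets are hypothetical, and the binary (set) form of the cosine is assumed here; weighted vector forms also exist.

```python
from math import sqrt


def similarity_statistics(set_a, set_b):
    """Return the cosine, overlap, Dice and Jaccard coefficients of two sets."""
    a, b = set(set_a), set(set_b)
    if not a or not b:
        raise ValueError("both sets must be non-empty")
    shared = len(a & b)
    return {
        "cosine": shared / sqrt(len(a) * len(b)),
        "overlap": shared / min(len(a), len(b)),
        "dice": 2 * shared / (len(a) + len(b)),
        "jaccard": shared / len(a | b),
    }


# Hypothetical term sets: two items are shared out of five distinct items.
stats = similarity_statistics({"a", "b", "c", "d"}, {"c", "d", "e"})
for name, value in stats.items():
    print(name, round(value, 4))
```

All four coefficients equal 1 for identical sets and 0 for disjoint sets; they differ in how strongly they penalize a difference in set sizes.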

From the three tested IR systems, Google, Yahoo and Live Search the IR system that exhibits the highest number of exact overlap is Live Search whereas the IR system that exhibits the highest number of partial overlap is Yahoo.

Material published as part of this publication, either on-line or in print, is copyrighted by the Informing Science Institute. Permission to make digital or paper copy of part or all of these works for personal or classroom use is granted without fee provided that the copies are not made or distributed for profit or commercial advantage AND that copies 1) bear this notice in full and 2) give the full citation on the first page. It is permissible to abstract these works so long as credit is given. To copy in all other cases or to republish or to post on a server or to redistribute to lists requires specific permission and payment of a fee. Contact Publisher@InformingScience.org to request redistribution permission.

References

Cohen, E. (1999). Reconceptualizing information systems as a field of the transdiscipline informing science: From ugly duckling to swan. Journal of Computing and Information Technology, 7(3), 213-219.

Hearst, M. (2006). Design recommendations for hierarchical faceted search interfaces. Proceedings of the ACM SIGIR 2006 Workshop on Faceted Search, ACM, New York.

Joachims, T., Granka, L., Pan, B., Hembrooke, H., & Gay, G. (2005). Accurately interpreting click-through data as implicit feedback. Proceedings of the ACM SIGIR 2005 International Conference, ACM, New York, 154-161.

Kelly, D., & Belkin, N. (2001). Reading time, scrolling and interaction: Exploring implicit sources of user preferences for relevance feedback. Proceedings of the ACM SIGIR 2001 International Conference on Research and Development in Information Retrieval, ACM, New York, 408-409.

Kelly, D., & Teevan, J. (2003). Implicit feedback for inferring user preference: A bibliography. ACM SIGIR Forum, 37(2), 18-28.

Keskustalo, H., Jarvelin, K., & Pirkola, A. (2006). The effects of relevance feedback quality and quantity in interactive relevance feedback: A simulation based on user modeling. Lecture Notes in Computer Science, 3936, 191-204. Berlin, Heidelberg: Springer.

Norvig, P. (2008). The evolution of web search. MIT Technology Review, 111(1), 32-33.

Petratos, P. (2006). Information retrieval systems: A perspective on human computer interaction. Journal of Issues in Informing Science and Information Technology, 3(1), 511-518. Retrieved from http://informingscience.org/proceedings/InSITE2006/IISITPetr231.pdf

Petratos, P. (2007). Information retrieval systems: A human centered approach. Interdisciplinary Journal of Information, Knowledge, and Management, 2(1), 17-32. Retrieved from http://ijikm.org/Volume2/IJIKMv2p017-032Petratos331.pdf

Rocchio, J. (1971). Relevance feedback in information retrieval. In G. Salton (Ed.), The SMART retrieval system: Experiments in automatic document processing (pp. 313-323). Englewood Cliffs, New Jersey: Prentice Hall.

Rodden, K., Ruthven, I., & White, R. (Eds.). (2007). Web information seeking and interaction. Proceedings of the ACM SIGIR 2007 Workshop on Web Information Seeking and Interaction, ACM, New York.

Salojarvi, J., Puolamaki, K., & Kaski, S. (2005). Implicit relevance feedback from eye movements. Proceedings of ICANN 2005 Artificial Neural Networks: Biological Inspirations, Lecture Notes in Computer Science, 3696, 513-518. Berlin, Heidelberg: Springer.

White, R., Drucker, S., Marchionini, G., Hearst, M., & Schraefel, M. (Eds.). (2007). Exploratory search and HCI: Designing and evaluating interfaces to support exploratory search interaction. Proceedings of the ACM SIGCHI 2007 Workshop on Exploratory Search and HCI, ACM, New York.

White, R., Muresan, G., & Marchionini, G. (Eds.). (2006). Evaluating exploratory search systems. Proceedings of the ACM SIGIR 2006 Workshop on Evaluating Exploratory Search Systems, ACM, New York.

White, R., Ruthven, I., & Jose, J. (2002). The use of implicit evidence for relevance feedback in web retrieval. Proceedings of the 24th BCS-IRSG European Colloquium on IR Research ECIR 2002 Lecture Notes in Computer Science, 2291, 93-109. Berlin, Heidelberg: Springer.

Panagiotis Petratos

California State University, Stanislaus, CA, USA

ppetratos@csustan.edu

Biography

Panagiotis Petratos is Assistant Professor of Computer Information Systems at California State University, Stanislaus. His research interests include information retrieval systems, human-computer interaction, networking, computer security and biometrics-enabled computer systems.

 
Table 1: Relevance Feedback Methods. 
 
Implicit Feedback                    Explicit Feedback 
 
Time taken to Read, View, or         Select documents 
  Listen 
Unprompted Selecting                 Specify keywords 
Unprompted Marking                   Mark sentences, paragraphs 
Creating, saving, or deleting        Answer questions about user's 
  a file                               interests 
Reading text, or Viewing video,      Answer questions to refine the 
  images, or other non-readable        initial search 
  objects, Eye tracking 
Listening to audio books, music,     Select a mutually exclusive state
  or other acoustic files              by clicking a radio button, 
                                       a spin button, or a choice 
                                       from a combo box 
Find a word or phrase in a page,     Select multiple states by 
  document, book, issue query          clicking on check boxes, list 
                                       boxes, or data grids 
Bookmarking, Scrolling               Select a grade of a Likert scale 
                                       by moving a slider bar 
Key-strokes, type, edit, copy,       Rate books, documents, synopses, 
  paste, link, email, publish          images, or other non-readable 
                                       objects 
Printing                             Rank information retrieval results 
 
Table 2: User Population 
 
User ID    Gender     Age > 30   Native English 
 
1          0          0          0 
2          0          1          1 
3          1          0          0 
4          1          1          1 
 
Table 3: Associations of Q1 Overlap 
 
Overlap                  G-Y        G-LS       Y-LS 
 
Exact Co-location:       8          15         13 
Partial Co-location:     6          4          7 
 
Table 4: Q1 Overlap by IR system 
 
Overlap          Google SE_A     Yahoo SE_B     Live Search SE_C 
 
Exact Match:     22              20             27 
Partial Match:   8               12             8 
 
Table 5: Possible Indicators of User Experimentation Cost and Ease of Participation.
 
User Effort                User Will 
 
Navigating Effort (1)      Willingness to Explore (1) 
Browsing Effort (2)        Willingness to Browse (2) 
Feedback Effort (1)        Willingness to provide Feedback (1) 
Cognitive Effort, time     Willingness to Learn more 
 
(1.) multiple categories, see Table 1 
(2.) Read, View, Listen, see Table 1 
 
Table 6: Comparative Ranking on Q1 by Experts and IR systems 
 
Expert Rank   Google SE_A Rank   Yahoo SE_B Rank   Live Search SE_C Rank   Documents

1             1                  1                 1                       D1
2             2                  4                 7                       D2
3             7                  2                 5                       D3
4             4                  3                 3                       D4
5             5                  9                 10                      D5
6             6                  10                6                       D6
7             3                  8                 9                       D7
8             8                  5                 8                       D8
9             9                  7                 2                       D9
10            10                 6                 4                       D10
 
Table 7: Spearman rank correlation coefficient 
 
Expert/Google SE_A     Expert/Yahoo SE_B     Expert/Live Search SE_C
 
0.806060606            0.587878788           0.127272727 
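
The coefficients in Table 7 can be checked against the rankings in Table 6 with the standard Spearman formula for untied ranks, rho = 1 - 6 * sum(d^2) / (n * (n^2 - 1)), where d is the difference between the two ranks of a document. The following is a minimal sketch of that check; the function and variable names are illustrative assumptions and do not come from the article.

```python
def spearman_rho(rank_a, rank_b):
    """Spearman's rho for two rankings of the same n items, assuming no ties."""
    n = len(rank_a)
    d_squared = sum((a - b) ** 2 for a, b in zip(rank_a, rank_b))
    return 1 - 6 * d_squared / (n * (n ** 2 - 1))


# Ranks of documents D1..D10, copied from Table 6.
expert = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
engines = {
    "Google SE_A": [1, 2, 7, 4, 5, 6, 3, 8, 9, 10],
    "Yahoo SE_B": [1, 4, 2, 3, 9, 10, 8, 5, 7, 6],
    "Live Search SE_C": [1, 7, 5, 3, 10, 6, 9, 8, 2, 4],
}

rho = {name: spearman_rho(expert, ranks) for name, ranks in engines.items()}

for name, value in rho.items():
    print(f"Expert/{name}: {value:.9f}")
```

Run on the Table 6 ranks, this yields the three values printed in Table 7 to nine decimal places.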
 
Table 8: Comparative Ranking on Q1 with Document Weights 
 
Google SE_A Rank;    Yahoo SE_B Rank;     Live Search SE_C Rank;
Document Weight      Document Weight      Document Weight           Documents

1; 0.9               1; 0.9               1; 0.9                    D1
2; 0.8               4; 0.6               7; 0.3                    D2
7; 0.3               2; 0.8               5; 0.5                    D3
4; 0.6               3; 0.7               3; 0.7                    D4
5; 0.5               9; 0.1               10; 0.1                   D5
6; 0.4               10; 0.1              6; 0.4                    D6
3; 0.7               8; 0.2               9; 0.1                    D7
8; 0.2               5; 0.5               8; 0.2                    D8
9; 0.1               7; 0.3               2; 0.8                    D9
10; 0.1              6; 0.4               4; 0.6                    D10
 
Table 9: Spearman tie-corrected rank correlation coefficient 
 
Expert/Google SE_A     Expert/Yahoo SE_B    Expert/Live Search SE_C 
 
0.803030303            0.584848485          0.142424242 
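
The tables do not state which tie correction produced the values in Table 9. One reading that reproduces them, offered here as an assumption rather than as the author's documented procedure, is this: keep the expert ranks 1 to 10, convert each engine's document weights from Table 8 into ranks, let documents with equal weight share the average of their positions (the two documents weighted 0.1 both receive rank 9.5), and then apply the usual formula rho = 1 - 6 * sum(d^2) / (n * (n^2 - 1)). The names below are illustrative.

```python
def average_ranks(weights):
    """Rank items by descending weight; tied weights share the mean position."""
    order = sorted(range(len(weights)), key=lambda i: -weights[i])
    ranks = [0.0] * len(weights)
    start = 0
    while start < len(order):
        end = start
        # Extend the run while the next item has the same weight.
        while end + 1 < len(order) and weights[order[end + 1]] == weights[order[start]]:
            end += 1
        mean_rank = (start + 1 + end + 1) / 2  # positions are 1-based
        for pos in range(start, end + 1):
            ranks[order[pos]] = mean_rank
        start = end + 1
    return ranks


def spearman_rho(rank_a, rank_b):
    """Standard d-squared Spearman formula."""
    n = len(rank_a)
    d_squared = sum((a - b) ** 2 for a, b in zip(rank_a, rank_b))
    return 1 - 6 * d_squared / (n * (n ** 2 - 1))


# Document weights for D1..D10, copied from Table 8.
expert = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
weights = {
    "Google SE_A": [0.9, 0.8, 0.3, 0.6, 0.5, 0.4, 0.7, 0.2, 0.1, 0.1],
    "Yahoo SE_B": [0.9, 0.6, 0.8, 0.7, 0.1, 0.1, 0.2, 0.5, 0.3, 0.4],
    "Live Search SE_C": [0.9, 0.3, 0.5, 0.7, 0.1, 0.4, 0.1, 0.2, 0.8, 0.6],
}

rho_tied = {
    name: spearman_rho(expert, average_ranks(w)) for name, w in weights.items()
}

for name, value in rho_tied.items():
    print(f"Expert/{name}: {value:.9f}")
```

Under this assumption the sums of squared differences are 32.5, 68.5 and 141.5, which give the three Table 9 coefficients. A textbook tie-corrected Spearman coefficient (the Pearson correlation of the average ranks) would give slightly different numbers.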
 
Table 10: Q1 Overlap of Exact Matches 
 
User ID    Documents    Google SE_A    Yahoo SE_B    Live Search SE_C
 
1          1            0              0             0 
1          2            0              1             0 
1          3            1              0             1 
1          4            0              2             0 
1          5            0              0             0 
1          6            0              3             2 
1          7            0              4             0 
1          8            0              0             0 
1          9            0              0             3 
1          10           0              5             4 
1          11           0              6             0 
1          12           0              0             5 
1          13           0              7             6 
1          14           2              0             0 
1          15           0              0             7 
1          16           0              0             8 
1          17           0              0             9 
1          18           3              0             0 
1          19           0              8             10 
1          20           0              0             0 
1          21           0              9             0 
1          22           4              10            11 
1          23           0              0             12 
1          24           0              0             0 
1          25           0              11            0 
2          26           5              0             0 
2          27           0              12            0 
2          28           0              0             0 
2          29           0              0             0 
2          30           0              0             0 
2          31           0              0             0 
2          32           0              0             0 
2          33           0              0             0 
2          34           0              13            13 
2          35           6              0             0 
2          36           0              14            0 
2          37           0              15            0 
2          38           0              0             0 
2          39           0              0             0 
2          40           0              0             0 
2          41           0              0             0 
2          42           0              0             0 
2          43           7              0             0 
2          44           0              0             0 
2          45           0              0             0 
2          46           0              0             0 
2          47           8              0             14 
2          48           9              16            15 
2          49           0              0             0 
2          50           0              0             0 
3          51           0              0             16 
3          52           0              0             0 
3          53           0              0             17 
3          54           10             0             0 
3          55           0              0             0 
3          56           0              0             0 
3          57           11             0             0 
3          58           0              0             18 
3          59           0              0             0 
3          60           12             0             19 
3          61           0              0             20 
3          62           13             0             0 
3          63           14             0             0 
3          64           0              0             0 
3          65           0              0             0 
3          66           15             0             0 
3          67           16             0             0 
3          68           0              0             21 
3          69           0              17            0 
3          70           0              0             0 
3          71           0              0             22 
3          72           0              0             0 
3          73           0              0             23 
3          74           0              0             0 
3          75           17             0             0 
4          76           0              18            0 
4          77           18             0             0 
4          78           0              0             24 
4          79           0              19            0 
4          80           0              0             0 
4          81           0              0             25 
4          82           0              0             0 
4          83           0              0             0 
4          84           0              0             0 
4          85           19             0             0 
4          86           0              0             0 
4          87           0              0             0 
4          88           0              0             26 
4          89           20             0             0 
4          90           21             20            0 
4          91           0              0             27 
4          92           0              0             0 
4          93           0              0             0 
4          94           22             0             0 
4          95           0              0             0 
4          96           0              0             0 
4          97           0              0             0 
4          98           0              0             0 
4          99           0              0             0 
4          100          0              0             0 
 
Table 11: Q1 Overlap of Partial Matches 
 
User ID    Documents   Google SE_A    Yahoo SE_B    Live Search SE_C
 
1          1           0              1             0 
1          2           0              0             1 
1          3           0              2             0 
1          4           0              0             0 
1          5           0              3             0 
1          6           0              0             0 
1          7           1              0             2 
1          8           0              0             0 
1          9           0              0             0 
1          10          0              0             0 
1          11          0              0             0 
1          12          0              0             0 
1          13          2              0             0 
1          14          0              0             0 
1          15          0              0             0 
1          16          0              0             0 
1          17          0              0             0 
1          18          0              4             3 
1          19          3              0             0 
1          20          4              0             0 
1          21          0              0             0 
1          22          0              0             0 
1          23          0              0             0 
1          24          5              0             0 
1          25          0              0             0 
2          26          0              5             0 
2          27          0              0             0 
2          28          0              0             0 
2          29          0              0             0 
2          30          0              6             0 
2          31          0              0             4 
2          32          0              0             0 
2          33          0              0             0 
2          34          0              0             0 
2          35          0              7             0 
2          36          0              0             0 
2          37          0              0             0 
2          38          0              0             0 
2          39          0              0             0 
2          40          0              8             0 
2          41          0              0             0 
2          42          0              0             0 
2          43          0              0             0 
2          44          0              0             0 
2          45          0              0             0 
2          46          0              0             0 
2          47          0              0             0 
2          48          0              0             0 
2          49          0              0             0 
2          50          0              0             0 
3          51          0              0             0 
3          52          0              0             0 
3          53          0              0             0 
3          54          0              0             0 
3          55          0              0             0 
3          56          0              0             0 
3          57          0              0             0 
3          58          0              0             0 
3          59          0              0             5 
3          60          0              0             0 
3          61          0              0             0 
3          62          0              0             6 
3          63          0              0             0 
3          64          6              0             0 
3          65          0              0             0 
3          66          0              0             0 
3          67          0              9             0 
3          68          7              0             0 
3          69          0              0             0 
3          70          0              0             0 
3          71          0              0             0 
3          72          0              0             0 
3          73          0              0             0 
3          74          0              0             0 
3          75          0              0             0 
4          76          0              0             7 
4          77          0              0             0 
4          78          0              0             0 
4          79          0              0             0 
4          80          0              0             0 
4          81          0              0             0 
4          82          0              0             0 
4          83          0              0             0 
4          84          0              10            0 
4          85          0              0             0 
4          86          0              0             0 
4          87          8              0             0 
4          88          0              0             0 
4          89          0              0             8 
4          90          0              0             0 
4          91          0              0             0 
4          92          0              0             0 
4          93          0              0             0 
4          94          0              0             0 
4          95          0              11            0 
4          96          0              0             0 
4          97          0              0             0 
4          98          0              12            0 
4          99          0              0             0 
4          100         0              0             0 
