Optimization of Boolean Queries in Information Retrieval Systems Using Genetic Algorithms-Genetic Programming and Fuzzy Logic

By Owais, Suhail S. J. | Journal of Digital Information Management, December 2006


ABSTRACT: This paper proposes two information retrieval (IR) system models: a Boolean IR model and an extended Boolean (fuzzy) IR model. The models differ in that the first uses Boolean queries and the second uses fuzzy weighted queries. The paper also proposes a way to optimize the user query in both models using genetic programming and fuzzy logic. Finally, it proposes using a larger set of Boolean operators (AND, OR, XOR, OF, and NOT) instead of the standard ones (AND, OR, and NOT), and using weights for Boolean operators and for terms in the fuzzy model.

Categories and Subject Descriptors

H.3.3 [Information Search and Retrieval]; J.3 [Life and Medical Sciences]: Biology and genetics

General Terms

Boolean operators, Fuzzy model, Information retrieval

Keywords: Boolean Query, Information Retrieval, Genetic Algorithms, Genetic Programming, Fuzzy Logic, Term Weights, and Boolean Operator Weights.

1. Introduction

One of the most pressing issues raised by today's explosive growth of the Internet is the so-called resource discovery problem [3]: how to find information of interest among the vast and growing amount of information available. One of the most important uses of the public network is to find suitable information for a user's query. In this paper, we discuss the use of weights for Boolean operators and the use of a larger set of Boolean operators to optimize the user query. We work with two IR models (Boolean and fuzzy) and optimize the user query using one of the evolutionary algorithms, genetic programming, together with fuzzy logic.

A genetic algorithm was implemented in both models, but fuzzy logic was used only in the fuzzy (extended Boolean) model. For both models, the harmonic mean was used to measure IR performance. The harmonic mean combines the precision and recall measures into a single value, so that both are taken into account at once when improving IR performance.
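As a minimal sketch of the performance measure described above: the excerpt does not print the formula, so the code below assumes the standard harmonic mean of precision P and recall R, 2PR / (P + R). The function names and the toy document identifiers are hypothetical and are not taken from the paper.

```python
def precision_recall(retrieved, relevant):
    """Return (precision, recall) for a set of retrieved document ids."""
    retrieved, relevant = set(retrieved), set(relevant)
    hits = len(retrieved & relevant)  # relevant documents that were retrieved
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


def harmonic_mean(precision, recall):
    """Standard harmonic mean of precision and recall (assumed formula)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Toy example (made-up ids): 4 documents retrieved, 3 relevant, 2 in common.
retrieved = {"d1", "d2", "d3", "d4"}
relevant = {"d2", "d4", "d5"}
p, r = precision_recall(retrieved, relevant)
score = harmonic_mean(p, r)
print(f"precision={p:.3f} recall={r:.3f} harmonic_mean={score:.3f}")
```

A score of this kind could serve as the fitness of a candidate query, since it is high only when precision and recall are both high; how the paper weights or applies it in detail is not stated in this excerpt.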

2. Motivation

Because of the widespread use of web search techniques, particularly in academia, search processes need to be understood. Many web users are not well trained in Boolean algebra and face "the problem of learning the correct interpretations of Boolean operators and their rules of precedence" [1]. Hence, the motivation of the current work is to produce two IR models that optimize the user query. The optimized query should retrieve the most relevant documents, with fewer non-relevant documents, for the user's search. The deployment of the Boolean operators (AND, OR, XOR, OF, and NOT) together with the harmonic mean measure improves IR performance.

3. Related work

The information retrieval literature is extensive, and within it genetic algorithms offer more promise than other approaches. Masaharu et al. [17] propose an IR interface that employs a small number of query terms and concept categories with Boolean expressions; they use only the words that exist in the original query to reformulate the Boolean query, and their work is confined to two Boolean operators. Cordon et al. [18] represent the query as a parse tree with a maximum of 20 nodes; they use only the AND, OR, and NOT Boolean operators, and their testing is limited to a small set of 400 documents. Kraft et al. [16] apply genetic programming to optimize user search queries, investigate whether precision or recall is the more efficient objective function, and present experiments over a non-fuzzy collection of documents. Cordon et al. [19] propose the use of multi-objective evolutionary algorithms (EAs) and compare several EA-oriented approaches for the optimization of persistent search queries. Among these studies, the works of Kraft and Cordon are the most promising.
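To make the parse-tree representation of a Boolean query concrete, here is a small illustrative sketch. It is not the implementation from any of the cited works: the class names, the node-counting helper, and the toy documents are all hypothetical. It covers AND, OR, NOT, and XOR only; OF is omitted because this excerpt does not define its semantics.

```python
from dataclasses import dataclass
from typing import Tuple, Union


@dataclass(frozen=True)
class Term:
    word: str  # leaf node: a single query term


@dataclass(frozen=True)
class Op:
    name: str  # "AND", "OR", "XOR", or "NOT"
    children: Tuple["Node", ...]


Node = Union[Term, Op]


def matches(node, doc_terms):
    """Evaluate a query tree against the set of terms in one document."""
    if isinstance(node, Term):
        return node.word in doc_terms
    values = [matches(child, doc_terms) for child in node.children]
    if node.name == "AND":
        return all(values)
    if node.name == "OR":
        return any(values)
    if node.name == "XOR" and len(values) == 2:
        return values[0] != values[1]  # exactly one side is true
    if node.name == "NOT" and len(values) == 1:
        return not values[0]
    raise ValueError(f"unsupported operator or arity: {node.name}")


def count_nodes(node):
    """Tree size, useful for enforcing a cap such as 20 nodes."""
    if isinstance(node, Term):
        return 1
    return 1 + sum(count_nodes(child) for child in node.children)


# Query: (genetic XOR fuzzy) AND (NOT biology)
query = Op("AND", (
    Op("XOR", (Term("genetic"), Term("fuzzy"))),
    Op("NOT", (Term("biology"),)),
))

# Made-up documents, each reduced to its set of index terms.
docs = {
    "d1": {"genetic", "query"},
    "d2": {"genetic", "fuzzy"},
    "d3": {"fuzzy", "biology"},
}
hits = sorted(d for d, terms in docs.items() if matches(query, terms))
print(hits, count_nodes(query))
```

A tree of this shape is convenient for genetic programming because subtrees can be swapped between candidate queries or replaced at random, while a node count keeps the resulting queries within a fixed size limit.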

4. Information Retrieval

IR is the process of extracting useful information from databases of text documents (document collections) via word or term searches and other techniques. …
