Academic journal article Journal of Digital Information Management

On the Use of Ontologies for an Optimal Representation and Exploration of the Web

Article excerpt

ABSTRACT: The use and definition of ontology for the representation and the exploration of knowledge are critical issues for approaches dealing with information retrieval. In this paper, we propose a new ontology-based approach for improving the quality, in terms of relevance, of the results obtained when searching documents on the Internet. This is done by a coherent integration of ontologies, Web data and query languages. We propose new data structures built upon ontologies: the WPGraph and the W³Graph which allow Web data to be modelled. We also discuss the use of ontologies for an efficient exploration of the knowledge contained in our conceptual structures using ASK, a specific query language introduced in this paper. An experimental validation of our approach is proposed through a prototype supporting our innovative framework.

Categories and Subject Descriptors

D.3.3 [Language Constructs and Features]; H.2.3 [Data Description Languages]

General Terms

Ontology, Web graph, Web data

Keywords: Web search, Query Language, Ontology, Conceptual Structures, Graphs

1. Introduction

Since the introduction of the World Wide Web (WWW) by Tim Berners-Lee in the early Nineties, and more recently the definition of the Semantic Web [1] and its promising results, interest in concepts and tools that can improve the representation and retrieval of Web information has been increasing. With the popularity of the WWW and an ever increasing number of Web pages, the need for tools and concepts that allow the retrieval of Web data in a powerful and relevant way is of utmost importance. Only Web search engines like Google (1) or Yahoo (2) allow users to search the Web and find these pages. As search engines index a huge amount of data, it becomes obvious that during a search many unwanted pages slip in among, and pollute, the most relevant results proposed by the engines. This point is easily illustrated by entering a common query in Google. For instance, if a user is looking for information about books and journals on the subject of trees in graph theory, the query "publications on trees" delivers results from totally different areas of interest. Among the results one can find documents related to botany, but also information about genealogy or even graph theory, and this forces the user to spend too much time skimming the results before reaching the relevant information. This can be avoided if the initial query is built properly.

The formula for a good query, that is, a query whose results satisfy the user, requires a rigorous selection of keywords [11]. The role of keywords in existing Web search engines is to characterize the context of a query, called the domain. The domain is simply the area of interest. Ongoing research (3) in the design of Web search engines attempts to integrate users' behaviour by analyzing the pages they have already visited in order to establish their research domain. There is a need for a new search method that will effectively describe the search domain. We believe that the use of tools such as ontologies [5] could partly solve this problem. Ontologies would be used to model the domain of research that the user has in mind, which in turn would facilitate the elaboration of future queries. Ultimately, search engines would give optimal results by integrating the vocabulary contained in the ontology to filter Web pages before displaying them to end-users. Such an approach would drastically reduce the number of keywords to enter.
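The filtering idea described above can be illustrated with a minimal sketch. It is not the method of the paper: the ontology is reduced here to a flat set of domain terms, and the names GRAPH_THEORY_VOCABULARY, domain_score and filter_pages, the sample pages and the threshold value are all assumptions made for illustration.

```python
import re

# Hypothetical stand-in for an ontology: a flat vocabulary for the
# "graph theory" domain. A real ontology would also carry relations.
GRAPH_THEORY_VOCABULARY = {"graph", "vertex", "edge", "spanning", "acyclic", "node"}


def domain_score(text, vocabulary):
    """Return the fraction of vocabulary terms that occur in the text."""
    words = set(re.findall(r"[a-z]+", text.lower()))
    return len(words & vocabulary) / len(vocabulary)


def filter_pages(pages, vocabulary, threshold=0.3):
    """Keep the titles of pages scoring at least `threshold`, best first."""
    scored = [(domain_score(text, vocabulary), title) for title, text in pages]
    return [title for score, title in sorted(scored, reverse=True) if score >= threshold]


# Invented sample results for the ambiguous query "publications on trees".
pages = [
    ("Spanning trees", "A spanning tree of a graph is an acyclic subgraph "
                       "that touches every vertex and uses each edge at most once."),
    ("Garden trees", "Publications on trees for the garden: pruning, planting and soil."),
    ("Family trees", "Trace your genealogy and build family trees online."),
]

relevant = filter_pages(pages, GRAPH_THEORY_VOCABULARY)
print(relevant)
```

With these sample pages, only the graph theory page shares terms with the vocabulary, so the botany and genealogy pages are dropped even though all three match the keyword "trees".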

Users must have a good knowledge of what information exists on the Web and how it is stored. For instance, assume that you are looking for the technical specifications of the Samsung TV with the reference WS32Z308. The query "+Samsung +WS32Z308" would typically be entered in Google. However, among the most relevant results, the engine returns only Web sites that allow the purchase of the requested product. …
