Exploring Lexicographic Ontologies for Hierarchically Organizing the Greek Wikipedia Articles

By Niarou, Maria; Stamou, Sofia | Journal of Digital Information Management, June 2012

1. Introduction

Wikipedia is one of the most successful worldwide collaborative efforts to assemble user-generated content that can serve as a reference source for the online population. Currently, Wikipedia hosts millions of articles on a variety of topics and in many languages, and it has been incorporated into several computer-based applications. A crucial factor in its success is its open nature, which enables everyone to edit, revise and/or question (via talk pages) the article contents. Considering the remarkable growth and extensive use of Wikipedia, a question arises naturally: how can we assess the quality of Wikipedia or, put differently, how can we ensure that the content it provides is useful to its readers? To shed light on this issue, several researchers have proposed methods for assessing the quality of Wikipedia articles and for helping Wikipedia editors provide high-quality, well-organized information. Most existing methods concentrate on the English Wikipedia (because it is the richest and the most widely used), although there have been successful attempts at assessing and/or improving Wikipedia in other natural languages.

In this article, we study the structural quality of the Greek Wikipedia; specifically, we investigate how to organize its contents so that users can navigate them successfully. Our study is therefore situated in the area of information management and combines tools and techniques from the field of natural language processing. In particular, we introduce a model that exploits the WordNet [8] semantic network to organize the Greek Wikipedia articles hierarchically [I]. The motivation for our study is to turn the Greek Wikipedia corpus into a structured data source. We selected WordNet as our reference for data structuring because it organizes the concepts it contains hierarchically, based on their underlying semantic relations. The goal of our work is to demonstrate experimentally the contribution of semantic networks to the hierarchical organization of unstructured online content. To this end, we have designed and implemented a model that automatically captures the semantic relationships that hold between the Wikipedia categories and, based on the identified semantic links, organizes them into a thematic hierarchy.

To unravel the semantics of the Wikipedia categories and to derive evidence about their relations, our model explores the information encoded in WordNet, a rich source of highly structured semantic information. The contribution of WordNet is most pronounced in disambiguating the terms used to name the Wikipedia categories, as we discuss later in the paper. In brief, our model follows a three-step approach. First, it matches the Wikipedia category names to their corresponding WordNet nodes in order to extract their senses. Then, it disambiguates the categories that match several WordNet nodes, based on their estimated semantic similarity to other categories with which they co-occur in the Wikipedia articles. Finally, having detected the semantics of each Wikipedia category, it borrows the hierarchical structure of the category names from WordNet and applies it to organize the categories into thematic hierarchies. In this way, our model automatically assigns the Wikipedia categories to hierarchical structures and thereby facilitates the organization of the Wikipedia articles that have been classified under those categories. The experimental evaluation of our model indicates that WordNet is a valuable source for semantically organizing unstructured thematic data.
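The three steps above can be illustrated with a minimal sketch. It is not the implementation evaluated in the paper: the hypernym network, the sense inventory, the English category names and the path-based similarity measure are all hypothetical stand-ins, chosen only so that the example runs without any WordNet data installed.

```python
from collections import defaultdict

# Hypothetical toy hypernym network (child -> parent); not real WordNet data.
HYPERNYM = {
    "entity": None,
    "organism": "entity",
    "animal": "organism",
    "feline": "animal",
    "jaguar.animal": "feline",
    "artifact": "entity",
    "vehicle": "artifact",
    "car": "vehicle",
    "jaguar.car": "car",
}

# Step 1 (assumed lookup table): category name -> candidate nodes (senses).
SENSES = {
    "Jaguar": ["jaguar.animal", "jaguar.car"],
    "Felines": ["feline"],
    "Animals": ["animal"],
    "Vehicles": ["vehicle"],
}


def path_to_root(node):
    """Return the chain of nodes from `node` up to the root."""
    path = []
    while node is not None:
        path.append(node)
        node = HYPERNYM[node]
    return path


def similarity(a, b):
    """Assumed measure: 1 / (1 + number of edges on the path between a and b)."""
    path_a, path_b = path_to_root(a), path_to_root(b)
    depth_in_b = {n: i for i, n in enumerate(path_b)}
    for i, n in enumerate(path_a):
        if n in depth_in_b:  # first common ancestor
            return 1.0 / (1 + i + depth_in_b[n])
    return 0.0


def disambiguate(category, cooccurring):
    """Step 2: keep the sense most similar to the co-occurring categories."""
    def score(sense):
        return sum(
            max(similarity(sense, s) for s in SENSES[c]) for c in cooccurring
        )
    return max(SENSES[category], key=score)


def build_hierarchy(articles):
    """Step 3: link each category to its nearest ancestor that is also a category."""
    context = defaultdict(set)
    for categories in articles:
        for c in categories:
            context[c].update(x for x in categories if x != c)
    chosen = {c: disambiguate(c, sorted(ctx)) for c, ctx in context.items()}
    node_to_category = {node: c for c, node in chosen.items()}
    parent = {}
    for c, node in chosen.items():
        ancestor = HYPERNYM[node]
        while ancestor is not None and ancestor not in node_to_category:
            ancestor = HYPERNYM[ancestor]
        parent[c] = node_to_category.get(ancestor)  # None marks a top-level category
    return chosen, parent


# Each inner list holds the categories assigned to one (hypothetical) article.
articles = [["Jaguar", "Felines"], ["Felines", "Animals"], ["Vehicles"]]
chosen, parent = build_hierarchy(articles)
print(chosen["Jaguar"])
print(parent)
```

In this toy setting the ambiguous category "Jaguar" co-occurs with "Felines", so the animal sense is retained and the category is placed under "Felines"; had it co-occurred with "Vehicles" instead, the same scoring would favour the car sense.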

The remainder of the paper is organized as follows. We begin our discussion with a brief overview of relevant works. …
