User Interactions with Multimedia Repositories Using Natural Language Interfaces-OntoNL: An Architectural Framework and Its Implementation

Article excerpt

ABSTRACT: We present a generalized architectural framework for constructing and using natural language interfaces for interactions with multimedia repositories. The system allows users to specify natural language queries with rich semantics about the multimedia content. It uses an extensive set of methodologies and tools for linguistic processing, and it uses MPEG-7 and domain ontologies to reduce ambiguities in the natural language and to rank the results. We describe the implementation of this framework for supporting interactions with a multimedia repository described with the MPEG-7 MDS (Multimedia Description Schemes) structures. The implementation also uses User Profile information for better ranking of the query results.

Categories and Subject Descriptors

H.5.1 [Multimedia Information Systems]; H.5.2 [User Interfaces]: Natural Language

General Terms

Multimedia Content Retrieval, Ontology, MPEG-7, Semantics

Keywords: Ontology, MPEG-7, Natural language interface, Semantics

1. Introduction

The advantages of natural language interfaces are well understood. They can be seen in user interaction with information repositories, in the use of modern interaction devices such as mobile phones and PDAs, and in general wherever the information is complex. The extended use of the web has created a need for services that help inexperienced users find the information they need quickly and at no cost by using question-answering systems. One such system is AQUA [15], which combines Natural Language Processing (NLP), Logic, Ontologies and Information Retrieval techniques to provide answers to queries in a specific domain in real time. AQUA translates English questions into logical queries that are then used to generate proofs. AQUA is coupled with the AKT reference ontology for the academic domain. This ontology (written in OCML) currently contains people, organizations, research areas, projects, publications, technologies and events. The system works by pattern matching against it, which means that it tries to find an exact match with names in the ontology.

Until recently, natural language interfaces (NLIs) between humans and machines were either specific to a particular application, with limited expectations, or linguistic-based, with possibly many ambiguities that led to lengthy disambiguation dialogues. A more generalized approach to the construction of an NLI, in the domain of digital TV, was presented by Karanastasi et al. [2, 3], with good results when dealing with ambiguities. Clarification dialogues were not needed for disambiguation, because of the well-structured domain of digital TV, the TV-Anytime standard [] and the TV-Anytime User Profile information. A limitation of such a system is that it is not reusable. This limitation stems from the use of a domain grammar with domain-specific grammar rules. In addition, the grammar rules are defined by the syntax of the repository they refer to, which can be limiting when searching more than one ontology that describes the same domain.

The proposed architecture uses the OWL Web Ontology Language and word ontologies to disambiguate the user's query, with a preprocessing phase that builds a linguistic representation of the ontology for better matching. Disambiguation is the assignment of the correct sense that a word can take in a particular domain. The language model is as complete as possible on the linguistic side (syntactic and semantic, based on a word ontology). The aim is to retrieve concept instances (OWL individuals) that are strongly related to a word from the user's request even if they do not appear in the request. We also use User Profile information for better clustering based on the context of the ontologies and for better ranking of the query results.
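
To make the notion of domain-driven disambiguation concrete, the following Python sketch picks, among the candidate senses of a word, the one whose gloss shares the most terms with the labels of a domain ontology. It is only an illustration under simplifying assumptions and not the OntoNL algorithm: the names Sense and disambiguate, the glosses, and the domain terms are all hypothetical, and a real system would take senses from a word ontology and labels from an OWL ontology.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Sense:
    """One candidate meaning of a word (hypothetical structure)."""
    sense_id: str
    gloss: str


def tokenize(text):
    """Lowercase the text and split it into a set of bare words."""
    words = (w.strip(".,;:()").lower() for w in text.split())
    return {w for w in words if w}


def disambiguate(senses, domain_terms):
    """Return the sense whose gloss overlaps most with the domain terms.

    Ties are resolved in favor of the earliest sense in the list.
    """
    domain = {t.lower() for t in domain_terms}

    def overlap(sense):
        return len(tokenize(sense.gloss) & domain)

    return max(senses, key=overlap)


# Toy data (assumed for illustration only): two senses of "pitch"
# and a handful of concept labels from an imagined sports ontology.
senses = [
    Sense("pitch.sound", "the property of sound determined by frequency"),
    Sense("pitch.field", "the field on which a football match is played"),
]
domain_terms = ["match", "player", "goal", "field", "team"]

best = disambiguate(senses, domain_terms)
print(best.sense_id)
```

With these toy inputs, the gloss of the second sense shares "field" and "match" with the domain terms while the first shares none, so the second sense is selected. Gloss overlap is chosen here only because it is short to state; it stands in for whatever sense-ranking method an actual system applies.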

2. The OntoNL System

The goal of the OntoNL system is to address the knowledge engineering bottleneck for natural language processing systems. …