By Gunning, David; Chaudhri, Vinay K.; Clark, Peter; Barker, Ken; Chaw, Shaw-Yi; Greaves, Mark; Grosof, Benjamin; Leung, Alice; McDonald, David; Mishra, Sunil; Pacheco, John; Porter, Bruce; Spaulding, Aaron; Tecuci, Dan; Tien, Jing
AI Magazine, Vol. 31, No. 3
In the winter 2004 issue of AI Magazine, we reported Vulcan Inc.'s first step toward creating a question-answering system called Digital Aristotle. The goal of that first step was to assess the state of the art in applied knowledge representation and reasoning (KRR) by asking AI experts to represent 70 pages from the advanced placement (AP) chemistry syllabus and to deliver knowledge-based systems capable of answering questions from that syllabus. This article reports the next step toward realizing a Digital Aristotle: we present the design and evaluation results for a system called AURA, which enables domain experts in physics, chemistry, and biology to author a knowledge base and then allows a different set of users to ask novel questions against that knowledge base. These results represent a substantial advance over what we reported in 2004, both in the breadth of covered subjects and in the provision of sophisticated technologies in knowledge representation and reasoning, natural language processing, and question answering to domain experts and novice users.
Project Halo is a long-range research effort sponsored by Vulcan Inc., pursuing the vision of the "Digital Aristotle" - an application containing large volumes of scientific knowledge and capable of applying sophisticated problem-solving methods to answer novel questions. As this capability develops, the project focuses on two primary applications: a tutor capable of instructing and assessing students and a research assistant with the broad, interdisciplinary skills needed to help scientists in their work. Clearly, this goal is an ambitious, long-term vision, with Digital Aristotle serving as a distant target for steering the project's near-term research and development.
Making the full range of scientific knowledge accessible and intelligible to a user might involve anything from the simple retrieval of facts to answering a complex set of interdependent questions and providing user-appropriate justifications for those answers. Retrieval of simple facts might be achieved by information-extraction systems searching and extracting information from a large corpus of text. But going beyond this, to systems capable of generating answers and explanations that are not explicitly written in the texts, requires the computer to acquire, represent, and reason with knowledge of the domain (that is, to have genuine, internal "understanding" of the domain).
Reaching this ambitious goal requires research breakthroughs in knowledge representation and reasoning, knowledge acquisition, natural language understanding, question answering, and explanation generation. Vulcan decided to approach this ambitious effort by first developing a system capable of representing and reasoning about introductory, college-level science textbooks, specifically, a system to answer questions on advanced placement (AP) exams.1
Question answering has long challenged the AI field, and several researchers have proposed question answering against college-level textbooks as a grand challenge for AI (Feigenbaum 2003, Reddy 2003). Project Halo, described in this article, provides an essential component to meet that challenge - a tool for representing and using textbook knowledge for answering questions by reasoning.
As an initial, exploratory step toward this vision, Vulcan initiated the Halo Pilot in 2002 - a six-month effort to investigate the feasibility of creating a scientific knowledge base capable of answering novel questions from an AP (first-year, college-level) chemistry test. Three teams - SRI International, Cycorp, and Ontoprise - developed knowledge bases for a limited section of an AP chemistry syllabus. The knowledge bases could correctly answer between 30 and 50 percent of the associated questions from the AP test (Friedland et al. 2004a, 2004b).
While encouraging, these results had limitations. Only a small subset of knowledge, from one domain, was tested - leaving open the question of how well the techniques would generalize to other material and other domains. …