Proceedings of the Nineteenth Annual Conference of the Cognitive Science Society

By Michael G. Shafto; Pat Langley

A Computational Theory of Vocabulary Expansion

Karen Ehrlich and William J. Rapaport

Department of Computer Science and Center for Cognitive Science, State University of New York at Buffalo, Buffalo, NY 14260 {ehrlich | rapaport}@cs.buffalo.edu http://www.cs.buffalo.edu/~snwiz


Abstract

As part of an interdisciplinary project to develop a computational cognitive model of a reader of narrative text, we are developing a computational theory of how natural-language-understanding systems can automatically expand their vocabulary by determining from context the meaning of words that are unknown, misunderstood, or used in a new sense. 'Context' includes surrounding text, grammatical information, and background knowledge, but no external sources. Our thesis is that the meaning of such a word can be determined from context, can be revised upon further encounters with the word, "converges" to a dictionary-like definition if enough context has been provided and there have been enough exposures to the word, and eventually "settles down" to a "steady state" that is always subject to revision upon further encounters with the word. The system is being implemented in the SNePS knowledge-representation and reasoning system.


The Project and Its Significance

We are developing a computational theory of how NLU systems (including humans) can automatically expand their vocabulary by determining from context the meaning of words that are unknown to the system, familiar but misunderstood, or used in a new sense (Ehrlich 1995). 'Context' includes surrounding text, grammatical information, and background knowledge, but no access to a dictionary (Zadrozny & Jensen 1991) or other external sources of information (including a human).

We take the meaning of a word (as understood by a cognitive agent) to be its position in a network of words, propositions, and other concepts (Quillian 1968, 1969). In this (idiolectic) sense, the meaning of a word for a cognitive agent is determined by idiosyncratic experience with it. The contextual meaning described above includes a word's relation to every concept in the agent's mind. Thus, the extreme interpretation of "meaning as context" defines every word in terms of every other word an agent knows. This is circular and too unwieldy for use. In another sense, the meaning of a word is its dictionary definition, which usually contains less information. Thus, we limit the connections used for the definition by selecting particular kinds of information. Not all concepts within a given subnetwork are equally salient to a dictionary-style definition of a word. People abstract certain conventional information about words to use as a definition.
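The contrast between the two senses of meaning can be illustrated with a minimal sketch. It is not the SNePS representation: the class `ToyNetwork`, the relation names (`class`, `property`, `function`, `encountered_in`), and the sample facts are all hypothetical, chosen only to show how restricting attention to selected kinds of links yields a much smaller, definition-like subset of a word's full network position.

```python
from collections import defaultdict


class ToyNetwork:
    """Hypothetical toy network of (source, relation, target) links.

    This is an illustration only; it does not reproduce SNePS structures.
    """

    def __init__(self):
        self.edges = defaultdict(set)

    def add(self, source, relation, target):
        self.edges[source].add((relation, target))

    def full_meaning(self, word):
        """Every link reachable from the word: 'meaning as context' taken to the extreme."""
        seen, stack, triples = set(), [word], set()
        while stack:
            node = stack.pop()
            if node in seen:
                continue
            seen.add(node)
            for relation, target in self.edges.get(node, ()):
                triples.add((node, relation, target))
                stack.append(target)
        return triples

    def definition(self, word, salient=("class", "property", "function")):
        """Only the word's direct links of selected kinds: a dictionary-like subset."""
        return sorted((r, t) for r, t in self.edges.get(word, ()) if r in salient)


# Illustrative facts (assumptions, apart from 'a brachet is a kind of dog').
net = ToyNetwork()
net.add("brachet", "class", "dog")
net.add("brachet", "encountered_in", "passage_1")  # idiosyncratic experience
net.add("dog", "class", "animal")
net.add("dog", "property", "barks")

print(len(net.full_meaning("brachet")))  # all reachable links
print(net.definition("brachet"))         # only the salient direct links
```

In this sketch the choice of `salient` relation kinds stands in for the selection of "particular kinds of information"; which kinds actually matter is a substantive question that the sketch does not answer.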

We claim that a meaning for a word can be determined from any context, can be revised and refined upon further encounters with it, and "converges" to a dictionary-like definition given enough context and exposures to it. Each encounter with it yields a definition--a hypothesis about meaning. Subsequent encounters provide opportunities for unsupervised revision of this hypothesis, with no (human) "trainers" or "error-correction" techniques. The hypothesized definitions are not guaranteed to converge to a "correct" meaning (if such exists) but to one stable with respect to further encounters. Finally, no domain-specific background information is required for developing the definition.
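The idea of a definition as a hypothesis that is revised on each encounter and eventually becomes stable can be sketched as follows. This is a deliberately simplified illustration, not the algorithm described in this paper: the class `WordHypothesis`, the `patience` threshold, and the sample contexts are hypothetical, and the sketch only accumulates features, so it does not model retracting or replacing an earlier hypothesis.

```python
class WordHypothesis:
    """Hypothetical running hypothesis about a word's meaning."""

    def __init__(self, word):
        self.word = word
        self.features = set()       # (relation, value) pairs hypothesized so far
        self.stable_encounters = 0  # consecutive encounters that changed nothing

    def encounter(self, observed):
        """Revise the hypothesis from one context; return True if it changed."""
        new = set(observed) - self.features
        self.features |= new
        self.stable_encounters = 0 if new else self.stable_encounters + 1
        return bool(new)

    def settled(self, patience=2):
        """'Steady state': no change over the last `patience` encounters."""
        return self.stable_encounters >= patience


# Made-up contexts, each reduced to the features it would support.
h = WordHypothesis("brachet")
contexts = [
    {("class", "animal")},
    {("class", "dog")},
    {("class", "dog")},
    {("class", "dog"), ("class", "animal")},
]
changes = [h.encounter(c) for c in contexts]
print(changes, h.settled())

# A settled hypothesis remains open to revision by a later encounter.
h.encounter({("function", "hunting")})
print(h.settled())
```

Note that no trainer or error signal appears anywhere in the loop: the only input is the sequence of contexts, and "settled" means stable with respect to further encounters, not correct.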

Evidence for this can be seen in the psychological literature (below) and in informal protocols taken from subjects who reasoned out loud about their definition-forming and revision procedures when shown passages containing unknown words (Ehrlich 1995). These same passages served as input to a computational system that develops and revises definitions in ways similar to the human subjects.

The vocabulary-expansion system is part of an interdisciplinary project developing a computational cognitive model of a reader of narrative text (Duchan et al. 1995). To fully model a reader, it is important to model the ability to learn from reading, in particular, to expand one's vocabulary in a natural way while reading, without having to stop to ask someone or to consult a dictionary. A complete lexicon cannot be manually encoded, nor could it contain new words or new meanings (Zernik & Dyer 1987). Text-understanding, message-processing, and information-extraction systems need to be robust in the presence of unknown expressions, especially systems using unconstrained input text and operating independently of human intervention, such as "intelligent agents". E.g., a system designed to locate "interesting" news items from an online information server should not be limited to keyword searches--if the user is interested in news items about dogs, and the filter detects items about "brachets" (a term not in its lexicon), it should deliver those items as soon as it figures out that a brachet is a kind of dog.
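The news-filter example can be made concrete with a small sketch. It assumes that the learned lexicon is available as a simple "is a kind of" mapping; the functions `is_kind_of` and `filter_items`, the item titles, and the term lists are hypothetical and serve only to show why a filter that consults learned class membership delivers items that a plain keyword match would miss.

```python
def is_kind_of(term, target, isa):
    """Follow 'is a kind of' links upward from term; True if target is reached."""
    seen = set()
    while term is not None and term not in seen:
        if term == target:
            return True
        seen.add(term)
        term = isa.get(term)
    return False


def filter_items(items, interest, isa):
    """Return titles of items mentioning the interest or any known kind of it."""
    return [
        title
        for title, terms in items
        if any(is_kind_of(t, interest, isa) for t in terms)
    ]


# Made-up news items, each with the terms detected in it.
items = [
    ("Item A", ["brachet", "hunt"]),
    ("Item B", ["stocks"]),
    ("Item C", ["dog"]),
]

isa = {}  # lexicon before 'brachet' has been learned
before = filter_items(items, "dog", isa)

isa["brachet"] = "dog"  # the system has figured out that a brachet is a kind of dog
after = filter_items(items, "dog", isa)

print(before)
print(after)
```

Before the new link is added, only the item that literally mentions "dog" is delivered; afterwards the "brachet" item is delivered as well, with no change to the filter itself.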

Two features of our system mesh nicely with these desiderata, summarized as the advantages of learning over being told: (1) Being told requires human intervention. Our system operates independently of a human teacher or trainer (with one eliminable exception). (2) Learning is necessary, since one can't predict all information needed to understand unconstrained, domain-independent text. Our system does not constrain the subject matter ("domain") of the text. Although we are primarily concerned with narrative text, our techniques are general. Given an appropriate grammar, our algorithms produce domain-independent definitions, albeit ones dependent on the system's background knowledge: The more background knowledge it has, the better its definitions will be, and

