Peter Wiemer-Hastings (PWMRHSTN@MEMPHIS.EDU) Arthur C. Graesser (A-GRAESSER@MEMPHIS.EDU) Katja Wiemer-Hastings (KWIEMER@CC.MEMPHIS.EDU) The University of Memphis, Department of Psychology, Memphis TN 38152-6400
Children tend to acquire verb meanings later than noun meanings. Intuitively, this is not surprising. We associate with verbs much of the relational information which binds a sentence together. Without the information that the verb provides, a learner should have difficulty putting the sentence together, much less figuring out what the verb means. Earlier work by the first author on the Camille (Contextual Acquisition Mechanism for Incremental Lexeme LEarning) system (Hastings, 1994) demonstrated that by using semantic constraints on hierarchically structured action concepts, along with example sentences containing unknown (undefined) verbs, the meanings of the verbs can be inferred, in the form of mappings between the verbs and the action concepts. The system performed fairly well, reaching approximately 40% precision and 20% recall, by taking an extreme inductive approach: guessing the most specific (and therefore falsifiable) meaning for a given example, and relying on later examples and its incremental approach to correct errant hypotheses.
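The inductive strategy described above can be illustrated with a minimal sketch. This is not Camille's implementation: the concept names, the constraint table, and the flat matching of semantic types are all hypothetical simplifications, chosen only to show how a most-specific guess can be revised by a later example.

```python
# Toy action-concept hierarchy (child -> parent). All names are hypothetical.
PARENT = {
    "Action": None,
    "Attack": "Action",
    "Bombing": "Attack",
    "Kidnapping": "Attack",
}

# Hypothetical selectional constraints: slot -> required semantic type.
CONSTRAINTS = {
    "Action": {},
    "Attack": {"actor": "human"},
    "Bombing": {"actor": "human", "object": "building"},
    "Kidnapping": {"actor": "human", "object": "human"},
}


def depth(concept):
    """Number of steps from the concept up to the root of the hierarchy."""
    steps = 0
    while PARENT[concept] is not None:
        concept = PARENT[concept]
        steps += 1
    return steps


def consistent(concept, example):
    """A concept is ruled out only by a slot that is present with the wrong type."""
    return all(
        slot not in example or example[slot] == required
        for slot, required in CONSTRAINTS[concept].items()
    )


def most_specific(examples):
    """Return the deepest concepts consistent with every example seen so far."""
    candidates = [c for c in PARENT if all(consistent(c, ex) for ex in examples)]
    best = max(depth(c) for c in candidates)
    return sorted(c for c in candidates if depth(c) == best)


# One example yields a very specific guess; a second example falsifies it
# and forces a retreat to the more general parent concept.
first = {"actor": "human", "object": "human"}
second = {"actor": "human", "object": "building"}
print(most_specific([first]))
print(most_specific([first, second]))
```

In this sketch the root concept carries no constraints, so some hypothesis is always available; a specific guess is cheap to make because any later conflicting example removes it.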
Camille operated in conjunction with a parser (Lytinen's Link; Lytinen & Roberts, 1989), but used only the semantic argument structure that the parse provided, for example: (OBJECT *Civilian*) (ACTOR *Terrorist*). Camille did not use any knowledge about syntactic features of the sentence. Our hypothesis is that syntactic features should provide additional informative content which would allow more efficient lexical acquisition. The current work explores this hypothesis in two ways: by evaluating how well human subjects can infer missing verbs from context, and by performing statistical analyses of the syntactic and semantic features of the example sentences in the corpus to see how predictive they are of the verb.
We tested human abilities to infer word "meanings" using the Cloze procedure, in which the target word is replaced by a blank in an example sentence and the subject is asked to fill in the most appropriate word. Fourteen subjects were each given 17 sentences (for the 17 verbs that Camille was evaluated on). The sentences were randomly selected from the corpus and presented in random order. Scored with the same method that was used for testing Camille, the subjects' results were comparable: 38% recall and 38% precision. Camille's task was somewhat easier, however, because it had only 30 action concepts to choose from and had multiple examples of a word to help refine its inferences. When the guesses of the human subjects were evaluated by a very liberal measure which accepted what amounted to sibling and parent nodes in the concept hierarchy, they scored 58% recall and precision. (Camille's performance has not yet been evaluated on this scale.) In future experiments, we hope to assess how well humans do with multiple examples of a word in context. We also hope to perform a qualitative analysis of the relationship between Camille's hypotheses and those of humans.
To evaluate the information content of the context of example sentences in isolation, we performed a statistical analysis of the relationships between syntactic and semantic features of the 259 sentences from the corpus which contained the target verbs. The syntactic features were taken from the sentence frames provided by Wordnet (Miller, 1990), for example, "Somebody _____s that CLAUSE" for the word "report". To remove the semantic aspects of the frame, and to separate the different syntactic features, the frames were converted into existential features like "syntactic subject" and "clausal modifier". The semantic features were taken from the selectional constraints on verbs that the Link parser used to perform the parse, for example, "actor is human".
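The conversion of sentence frames into existential features might look roughly like the sketch below. The mapping rules are assumptions made for illustration; only the feature names "syntactic subject" and "clausal modifier" come from the text, and "direct object" is a hypothetical addition.

```python
def frame_features(frame):
    """Map a sentence frame to a set of existential syntactic features.

    The frame is assumed to mark the verb slot with a run of underscores
    or hyphens, as in "Somebody _____s that CLAUSE".
    """
    tokens = frame.split()
    slot = next(i for i, tok in enumerate(tokens) if "__" in tok or "--" in tok)
    before, after = tokens[:slot], tokens[slot + 1:]

    features = set()
    # Whether the subject is "Somebody" or "Something" is a semantic
    # distinction, so only the presence of a subject is recorded.
    if before:
        features.add("syntactic subject")
    if "CLAUSE" in after:
        features.add("clausal modifier")
    # Hypothetical feature name, not taken from the source text.
    if after and after[0].lower() in {"somebody", "something"}:
        features.add("direct object")
    return features


print(sorted(frame_features("Somebody _____s that CLAUSE")))
```

Because each feature records only presence or absence, every sentence can be represented as a binary vector, which suits correlation and regression analyses.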
Bivariate correlations and multiple regression analyses were performed between the verbs and the various syntactic and semantic features (41 in all) in order to assess the diagnosticity of the features for the verbs. Example findings are: "attack" was predicted when the object was *tangible* but not *human*; "accused" occurred exclusively in the corpus with an "of VERBing" clause; and "reported" (the most frequent and varied example in the corpus) was significantly predicted by 3 semantic and 2 syntactic features. These diagnosticity findings should allow domain-specific augmentations of automatic lexical acquisition mechanisms like Camille.
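As a rough illustration of a bivariate analysis of this kind, the sketch below computes the phi coefficient, which is the Pearson correlation for two binary variables. The source does not state which correlation statistic was used, so this choice is an assumption, and the data are invented toy values rather than corpus results.

```python
from math import sqrt


def phi(xs, ys):
    """Phi coefficient between two equal-length sequences of 0/1 values."""
    n11 = sum(1 for x, y in zip(xs, ys) if x and y)
    n10 = sum(1 for x, y in zip(xs, ys) if x and not y)
    n01 = sum(1 for x, y in zip(xs, ys) if not x and y)
    n00 = sum(1 for x, y in zip(xs, ys) if not x and not y)
    denom = sqrt((n11 + n10) * (n01 + n00) * (n11 + n01) * (n10 + n00))
    # A constant variable has no variance; report zero association.
    return (n11 * n00 - n10 * n01) / denom if denom else 0.0


# Toy data (not from the corpus): one entry per sentence.
is_accused = [1, 1, 0, 0, 0, 0]       # sentence contains the verb "accused"
has_of_verbing = [1, 1, 0, 0, 0, 0]   # sentence has an "of VERBing" clause
print(phi(is_accused, has_of_verbing))
```

A feature that occurs only with one verb, and always with it, gives a coefficient of 1, which is the kind of pattern reported for "accused" and the "of VERBing" clause.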
Hastings, P. (1994). Automatic Acquisition of Word Meaning from Context. Doctoral dissertation, Computer Science Department, University of Michigan, Ann Arbor.
Lytinen, S. & Roberts, S. (1989). Unifying linguistic knowledge. Technical report, AI Laboratory, University of Michigan, Ann Arbor, MI, 48109.
Miller, G. (1990). Wordnet: An on-line lexical database. International Journal of Lexicography, 3(4).
Publication information: Book title: Proceedings of the Nineteenth Annual Conference of the Cognitive Science Society. Contributors: Michael G. Shafto - Editor, Pat Langley - Editor. Publisher: Lawrence Erlbaum Associates. Place of publication: Mahwah, NJ. Publication year: 1997. Page number: 1086.