Achieving human-level intelligence in cognitive systems requires a number of core capabilities, including planning, belief representation, communication ability, emotional reasoning, and, most importantly, a way to integrate these capabilities. Yet many researchers regard software integration as a kind of necessary evil: something that ensures that all the research components of a large system fit together and interoperate properly, but not something likely to contribute new research insights or suggest new solutions. We have found, on the contrary, that this conventional wisdom does not hold. As we describe in this article, the integration process has raised new research issues and, at the same time, has suggested new approaches to long-standing ones. We begin with a brief description of the background behind our work in training and the approach we have taken to improving training. We then describe the technology components we have developed and the system architecture we use, and we conclude with some of the insights we have gained from the integration process.
Virtual Humans for Training
We have been constructing virtual humans to explore research issues in achieving cognitive systems with human-level performance. These issues, which we describe in detail below, span a number of technical areas in artificial intelligence including speech recognition, natural language understanding and generation, dialogue modeling, nonverbal communication, task modeling, social reasoning, and emotion modeling.
Virtual humans are software artifacts that look like, act like, and interact with humans but exist in virtual environments. We have been exploring the use of virtual humans to create social training environments, environments where a learner can explore stressful social situations in the safety of a virtual world.
We designed the Mission Rehearsal Exercise (MRE) system to demonstrate the use of virtual human technology to teach leadership skills in high-stakes social situations. MRE places the trainee in an environment populated with virtual humans. The training scenario we are currently using is situated in a small town in Bosnia. It opens with a lieutenant (the trainee) in his Humvee. Over the radio, he gets orders to proceed to a rendezvous point to meet up with his soldiers to plan a mission to assist in quelling a civil disturbance. When he arrives at the rendezvous point, he discovers a surprise (see figure 1). One of his platoon's Humvees has been involved in an accident with a civilian car. There is a small boy on the ground with serious injuries and a frantic mother, and a crowd is starting to form. A TV camera crew shows up and starts taping. What should the lieutenant do? Should he stop and render aid? Or should he continue on with his mission? Depending on the decisions he makes, different outcomes will occur.
[FIGURE 1 OMITTED]
Our virtual humans build on prior work in the areas of embodied conversational agents (Cassell et al. 2000) and animated pedagogical agents (Johnson, Rickel, and Lester 2000), but they integrate a broader set of capabilities than any prior work. For the types of training scenarios we are targeting, the virtual humans must integrate three broad influences on their behavior: they must perceive and act in a three-dimensional virtual world, they must engage in face-to-face spoken dialogues with people and other virtual humans in such worlds, and they must exhibit humanlike emotions. Classic work on virtual humans in the computer graphics community focused on perception and action in three-dimensional worlds (Badler, Phillips, and Webber 1993; Thalmann 1993) but largely ignored dialogue and emotions. Several systems have carefully modeled the interplay between speech and nonverbal behavior in face-to-face dialogue (Cassell et al. 2000; Pelachaud, Badler, and Steedman 1996), but these virtual humans did not include emotions and could not participate in physical tasks in three-dimensional worlds. …