What properties of a user interface would make you want to call it intelligent? For us, any interface that is called intelligent should at least be able to answer the six types of questions from users shown in figure 1. Being able to ask and answer these kinds of questions implies a flexible and adaptable division of labor between the human and the computer in the interaction process. Unlike most current interfaces, an intelligent user interface should be able to guide and support you when you make a mistake or don't know how to use the system well.
What we are suggesting here is a paradigm shift. As an analogy, consider the introduction of the undo button. This one button fundamentally changed the experience of using interactive systems by removing the fear of making mistakes. Users today expect every interactive system to have an undo button and are justifiably annoyed when they can't find it. By analogy, to focus on just one of the question types in figure 1, what we are saying is that every user interface should have a "What should I do next?" button.
Note that we are not saying that each of the questions in figure 1 must literally be a separate button. The mechanisms for asking and answering these questions could be spoken or typed using natural (or artificial) language, adaptive menus, simple buttons, or some combination of these. We have experimented with all these mechanisms in the various prototype systems described later.
Finally, some readers might object that answering the question types in figure 1 should be thought of as a function of the application rather than the interface. Rather than getting into an unproductive semantic argument about the boundary between these two terms, we prefer instead to focus on what we believe is the real issue, namely, whether this characterization of intelligent user interfaces can lead to the development of a reusable middleware layer that makes it easy to incorporate these capabilities into diverse systems.
Again, there is a relevant historical analogy. A key to the success of so-called WIMP (windows, icons, menus, and pointers) interfaces has been the development of widely used middleware packages, such as MOTIF and SWING. These middleware packages embody generally useful graphic presentation and interaction conventions, such as toolbars, scroll bars, and check boxes. We believe that the next goal in user interface middleware should be to codify techniques for supporting communication about users' task structure and process, as suggested by the question types in figure 1. This article describes a system, called COLLAGEN, that is a first step in this direction.
Figure 1. Six Questions for an Intelligent Interface.
Who should/can/will do -- ?
What should I/we do next?
Where am/was I?
When did I/you/we do -- ?
Why did you/we (not) do -- ?
How do/did I/we/you do -- ?
Adapted from the news reporter's "five Ws." The blanks are filled in with application-specific terms, ranging from high-level goals, such as "prepare a market survey" or "shut down the power plant," to primitive actions, such as "underline this word" or "close valve 17."
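To make the middleware idea concrete, here is a minimal Python sketch of how a reusable layer might answer some of the figure 1 questions from a shared task model. All names here (`Step`, `TaskMediator`, `what_next`) are hypothetical illustrations for this article's argument, not COLLAGEN's actual API; the point is only that the answers come from a generic task structure rather than from application-specific code.

```python
from dataclasses import dataclass, field
from typing import Optional, List

@dataclass
class Step:
    """One node in the shared task model: a goal or a primitive action."""
    name: str
    actor: str = "user"                # "Who should do --?": user or agent
    done: bool = False
    purpose: Optional[str] = None      # "Why did we do --?": link to parent goal
    substeps: List["Step"] = field(default_factory=list)

class TaskMediator:
    """Answers figure 1-style questions against the current task model."""

    def __init__(self, root: Step):
        self.root = root

    def what_next(self) -> str:
        """'What should I do next?' -> first unfinished primitive step."""
        step = self._first_pending(self.root)
        return step.name if step else "nothing; the task is complete"

    def who_does(self, name: str) -> str:
        """'Who should do --?' -> the actor assigned to that step."""
        step = self._find(self.root, name)
        return step.actor if step else "unknown step"

    def why(self, name: str) -> str:
        """'Why did you/we do --?' -> the recorded purpose, if any."""
        step = self._find(self.root, name)
        if step is None:
            return "unknown step"
        return step.purpose or "no recorded purpose"

    def _first_pending(self, step: Step) -> Optional[Step]:
        if step.substeps:
            for sub in step.substeps:
                found = self._first_pending(sub)
                if found:
                    return found
            return None
        return None if step.done else step

    def _find(self, step: Step, name: str) -> Optional[Step]:
        if step.name == name:
            return step
        for sub in step.substeps:
            found = self._find(sub, name)
            if found:
                return found
        return None

# Demo, using the example terms from the figure 1 caption:
plan = Step("shut down the power plant", substeps=[
    Step("close valve 17", actor="user", done=True,
         purpose="isolate the coolant loop"),
    Step("power off pump A", actor="agent"),
])
mediator = TaskMediator(plan)
print(mediator.what_next())              # power off pump A
print(mediator.who_does("close valve 17"))  # user
```

Note that nothing in `TaskMediator` is specific to power plants; any application that populates the task model gets the question-answering behavior for free, which is exactly the reusable-middleware claim being argued here.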
What does all this have to do with the "collaborative discourse theory" in the title of this article? The goal of developing generic support for communicating about the user's task structure cannot, we feel, be achieved by taking an engineering approach focused directly on the questions in figure 1. We therefore started this research by looking for an appropriate theoretical foundation, which we found in the concept of collaboration.
Collaboration is a process in which two or more participants coordinate their actions toward achieving shared goals. Most collaboration between humans involves communication. Discourse is a technical term for an extended communication between two or more participants in a shared context, such as a collaboration. …