THE VISIONS IMAGE-UNDERSTANDING SYSTEM
A. HANSON and E. RISEMAN, University of Massachusetts
In this chapter we consider some of the problems confronting the development of general integrated computer-vision systems and the status of the VISIONS project, which has become an experimental testbed for the construction of knowledge-based image interpretation systems. The goal is the construction of a symbolic representation of the three-dimensional world depicted in a two-dimensional image, including the labeling of objects, the determination of their location in space, and, to the degree possible, the construction of a surface representation of the environment.
Our system involves three levels of processing for static image interpretation. Low-level processes manipulate pixel data and produce intermediate symbolic events such as regions and lines with their attributes. High-level processes focus attention on aggregates of these events via rule-based object hypotheses in order to selectively invoke schemas, which contain more complex knowledge-based interpretation strategies. Intermediate-level processes carry out grouping and reorganization of the error-prone symbolic representation extracted from the sensory data, using both "top-down" control of the processing by the schema interpretation strategies and "bottom-up" data-directed organization of interesting perceptual events. Our design is being extended to integrate the results of motion and stereo processing throughout the three levels of processing, with depth arrays at the lowest level, partially correct surfaces at the intermediate level, and two-dimensional and three-dimensional motion attributes where appropriate.
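The flow of data through the three levels can be illustrated with a deliberately small sketch. It is not the VISIONS implementation: the names `Region`, `low_level`, `intermediate_level`, and `high_level`, the intensity thresholds, and the "sky"/"ground" rules are all hypothetical, chosen only to show pixels becoming regions, regions being regrouped, and simple rules proposing object hypotheses. The sketch covers only the bottom-up direction; schemas and their top-down control of grouping are omitted.

```python
from dataclasses import dataclass

NEIGHBORS = ((1, 0), (-1, 0), (0, 1), (0, -1))  # 4-connectivity


@dataclass
class Region:
    """Hypothetical intermediate symbolic event: pixels plus attributes."""
    pixels: list   # (row, col) coordinates
    mean: float    # mean intensity

    @property
    def centroid_row(self):
        return sum(r for r, _ in self.pixels) / len(self.pixels)


def low_level(image, tol=5):
    """Low level: flood-fill 4-connected pixels within `tol` of a seed value."""
    rows, cols = len(image), len(image[0])
    seen, regions = set(), []
    for r in range(rows):
        for c in range(cols):
            if (r, c) in seen:
                continue
            seed, stack, pixels = image[r][c], [(r, c)], []
            seen.add((r, c))
            while stack:
                y, x = stack.pop()
                pixels.append((y, x))
                for dy, dx in NEIGHBORS:
                    ny, nx = y + dy, x + dx
                    if (0 <= ny < rows and 0 <= nx < cols
                            and (ny, nx) not in seen
                            and abs(image[ny][nx] - seed) <= tol):
                        seen.add((ny, nx))
                        stack.append((ny, nx))
            mean = sum(image[y][x] for y, x in pixels) / len(pixels)
            regions.append(Region(pixels, mean))
    return regions


def adjacent(a, b):
    """True if some pixel of `a` is a 4-neighbor of some pixel of `b`."""
    b_pixels = set(b.pixels)
    return any((y + dy, x + dx) in b_pixels
               for y, x in a.pixels for dy, dx in NEIGHBORS)


def intermediate_level(regions, merge_tol=25):
    """Intermediate level: merge adjacent regions with similar mean intensity."""
    merged = list(regions)
    changed = True
    while changed:
        changed = False
        for i in range(len(merged)):
            for j in range(i + 1, len(merged)):
                a, b = merged[i], merged[j]
                if abs(a.mean - b.mean) <= merge_tol and adjacent(a, b):
                    pixels = a.pixels + b.pixels
                    mean = (a.mean * len(a.pixels)
                            + b.mean * len(b.pixels)) / len(pixels)
                    merged[i] = Region(pixels, mean)
                    del merged[j]
                    changed = True
                    break
            if changed:
                break
    return merged


def high_level(regions, image_rows):
    """High level: toy rules that propose an object label for each region."""
    labels = {}
    for idx, reg in enumerate(regions):
        upper = reg.centroid_row < image_rows / 2
        if reg.mean > 150 and upper:
            labels[idx] = "sky"       # bright region in the upper half
        elif reg.mean < 100 and not upper:
            labels[idx] = "ground"    # dark region in the lower half
        else:
            labels[idx] = "unknown"
    return labels


image = [
    [200, 200, 210, 210],
    [200, 200, 210, 210],
    [50, 50, 50, 50],
    [50, 50, 50, 50],
]
regions = intermediate_level(low_level(image))
print(high_level(regions, len(image)))
```

On this toy image the low-level pass over-segments the bright upper half into two regions, and the intermediate pass merges them before the rules are applied. This is meant to echo, in a very reduced form, the regrouping of an error-prone initial segmentation described above.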
In addition, a highly parallel three-level associative architecture is being developed to achieve real-time dynamic vision capabilities by al-