Dynamic Vision-Based Intelligence
Dickmanns, Ernst D., AI Magazine
A synthesis of methods from cybernetics and AI yields a concept of intelligence for autonomous mobile systems that integrates closed-loop visual perception and goal-oriented action cycles using spatiotemporal models. In a layered architecture, systems-dynamics methods with differential models prevail on the lower, data-intensive levels, while AI-type methods are used on the higher levels. Knowledge about the world is organized around classes of objects and subjects, where subjects are defined as objects with the additional capabilities of sensing, data processing, decision making, and control application. Specialist processes for the visual detection and efficient tracking of class members have been developed. On the upper levels, individual instantiations of these class members are analyzed jointly in the task context, yielding the situation for decision making. As an application, vertebrate-type vision for vehicle-guidance tasks in naturally perturbed environments was investigated with a distributed PC system. Experimental results with the test vehicle VaMoRs are discussed.
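The closed-loop perception cycle described above can be illustrated with a minimal sketch. This is not the article's actual implementation; it reduces the idea of prediction from a spatiotemporal (dynamic) model followed by correction from a new image measurement to a scalar Kalman-filter step, with all coefficient values chosen for illustration only (for example, tracking a single lateral-offset variable in a lane):

```python
def kalman_step(x, p, z, a=1.0, q=0.01, h=1.0, r=0.25):
    """One prediction-correction cycle.
    x, p : current state estimate and its variance
    z    : new measurement extracted from the image
    a    : state-transition coefficient (dynamic model)
    h    : measurement model (state -> measurement)
    q, r : process and measurement noise variances (illustrative values)
    """
    # Prediction step: propagate the state with the dynamic model
    x_pred = a * x
    p_pred = a * p * a + q
    # Correction step: blend in the measured image feature
    k = p_pred * h / (h * p_pred * h + r)   # Kalman gain
    x_new = x_pred + k * (z - h * x_pred)
    p_new = (1.0 - k * h) * p_pred
    return x_new, p_new

# Feed a sequence of noisy measurements of a quantity whose true value is 1.0
measurements = [1.3, 0.7, 1.1, 0.9, 1.2, 0.8, 1.05, 0.95]
x, p = 0.0, 1.0          # uninformed initial estimate
for z in measurements:
    x, p = kalman_step(x, p, z)
print(x, p)               # estimate converges toward 1.0, variance shrinks
```

The point of the sketch is that each image is interpreted against an expectation generated by the dynamic model, rather than from scratch; only the innovation (z minus the predicted measurement) drives the update, which is what makes tracking at video rate feasible on modest hardware.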
During and after World War II, the principle of feedback control became well understood in biological systems and was applied in many technical disciplines, both to relieve humans of tedious workloads in system control and to introduce automatic system behavior. Wiener (1948) considered it universally applicable as a basis for building intelligent systems and called the new discipline cybernetics (the science of systems control). After many early successes, these arguments were soon oversold by enthusiastic followers; in time, many people realized that high-level decision making could hardly be achieved on this basis. As a consequence, with the advent of sufficient digital computing power, computer scientists turned to descriptions of abstract knowledge and created the field of AI (Miller, Galanter, and Pribram 1960; Selfridge 1959). (1) With respect to results promised versus those realized, a situation similar to that with cybernetics developed in the last quarter of the twentieth century.
In the context of AI, the problem of computer vision has also been tackled (see, for example, Marr; Rosenfeld and Kak; Selfridge and Neisser). The main paradigm initially was to recover three-dimensional (3D) object shape and orientation from single images or from a few viewpoints. In aerial or satellite remote sensing, the task was to classify areas on the ground and to detect special objects. For these purposes, snapshot images taken under carefully controlled conditions sufficed. Computer vision was the proper name for these activities because humans took care of accommodating all the side constraints observed by the vehicle carrying the cameras.
When technical vision was first applied to vehicle guidance (Nilsson 1969), separate viewing and motion phases with static image evaluation (lasting many minutes on remote stationary computers) were initially adopted. Even stereo effects with a single camera moving laterally on the vehicle between two shots from the same vehicle position were investigated (Moravec 1983). In the early 1980s, digital microprocessors became sufficiently small and powerful that on-board image evaluation in near real time became possible. The Defense Advanced Research Projects Agency (DARPA) started its Strategic Computing program, in which vision architectures and image-sequence interpretation for ground-vehicle guidance were to be developed (the autonomous land vehicle [ALV]) (AW&ST 1986). These activities, too, were subsumed under the title computer vision. The term became generally accepted for a broad spectrum of applications, which makes sense as long as dynamic aspects do not play an important role in sensor signal interpretation.
For autonomous vehicles moving under unconstrained natural conditions at higher speeds on nonflat ground or in turbulent air, it is no longer the case that the computer "sees" on its own. …