The main fallacy in this conclusion, as the reader will recognize, is the generalization from peephole observation to ordinary observation: the assumption that because the perspective structure of an optic array does not specify the surface layout, nothing in the array can specify the layout. The hypothesis of an invariant structure that underlies the perspective structure, and that emerges clearly when there is a shift in the point of observation, goes unrecognized. The fact is that when an observer uses two eyes, and certainly when one looks from various points of view, the abnormal room and the abnormal window are perceived for what they are, and the anomalies cease.
The demonstrations do not prove, therefore, that the perception of layout cannot be direct and must be mediated by preconceptions, as Adelbert Ames and his followers wanted to believe (Ittelson, 1952). Neither do the many other demonstrations that, over the centuries, have purported to prove it.
The diagram of equivalent configurations illustrates one of the perplexities inherent to the retinal image theory of perception: if many different objects can give rise to the same stimulus, how do we ever perceive an object? The other half of the puzzle is this: if the same object can give rise to many different stimuli, how can we perceive the object? (Note that the second question implies a moving object but that neither question admits the fact of a moving observer.) Koffka was perplexed by this dual puzzle (1935, pp. 228 ff.) and many other experimenters have tried to resolve it, but without success (e.g., Beck and Gibson, 1955). The only way out, I now believe, is to abandon the dogma that a retinal stimulus exists in the form of a picture. What specifies an object are invariants that are themselves "formless."
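The geometry behind equivalent configurations can be sketched numerically. The example below is only an illustration under simplifying assumptions, not a model of any particular demonstration: it uses an idealized central projection, and the names `project`, `image`, `same_image`, the coordinates, and the depth scaling are all hypothetical. A frontal square and a slanted trapezoid are built so that their corners lie on the same rays from one station point; their projections coincide from that point and come apart as soon as the point of observation shifts.

```python
import math

def project(point, eye):
    """Central projection onto the plane one unit in front of `eye` (looking along +z)."""
    x, y, z = (p - e for p, e in zip(point, eye))
    return (x / z, y / z)

def image(points, eye):
    """Projected corners of a configuration as seen from `eye`."""
    return [project(p, eye) for p in points]

def same_image(a, b, tol=1e-9):
    """True when two projected configurations coincide corner for corner."""
    return all(
        math.isclose(u, v, abs_tol=tol)
        for p, q in zip(a, b)
        for u, v in zip(p, q)
    )

# A frontal square four units away (assumed coordinates).
square = [(-1.0, -1.0, 4.0), (1.0, -1.0, 4.0), (1.0, 1.0, 4.0), (-1.0, 1.0, 4.0)]

# Slide each corner along its own ray from the origin: the result is a
# slanted trapezoid whose corners lie on the same rays as the square's.
depth_scale = [0.5, 1.5, 1.5, 0.5]
trapezoid = [tuple(s * c for c in p) for p, s in zip(square, depth_scale)]

peephole = (0.0, 0.0, 0.0)   # the single fixed station point
shifted = (1.0, 0.0, 0.0)    # the same observer after a sideways step

print(same_image(image(square, peephole), image(trapezoid, peephole)))  # True
print(same_image(image(square, shifted), image(trapezoid, shifted)))    # False
```

The ambiguity holds only for the frozen station point. Once the station point moves, the two layouts yield different projections.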
Providing either structure or no structure in the light to an eye results, respectively, in the perception of a surface or of no surface. The difference is not between seeing in two dimensions and seeing in three dimensions, as earlier investigators supposed.
The closer together the discontinuities in an experimentally induced optic array, the greater is the "surfaciness" of the perception. This was true, at least, for a 30° array having seven contours at one extreme and thirty-six at the other.
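To give a rough sense of the two extremes, the mean angular spacing of the contours can be worked out directly. This is a back-of-the-envelope sketch that assumes the contours are spread across the full 30° extent; the function name `mean_spacing_deg` is hypothetical, and the actual spacing used in the experiment is not stated here.

```python
def mean_spacing_deg(extent_deg, contour_count):
    """Average angular extent per contour, assuming contours span the whole array."""
    return extent_deg / contour_count

sparse = mean_spacing_deg(30, 7)    # seven contours
dense = mean_spacing_deg(30, 36)    # thirty-six contours

print(f"{sparse:.2f} deg per contour vs {dense:.2f} deg per contour")
```

On this assumption the sparse array has a contour roughly every 4.3° and the dense array roughly every 0.8°, about a fivefold difference in density.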
Optical contact of one's body with the surface of support, as well as mechanical contact, seems to be necessary for some terrestrial animals if they are to stand and walk normally.
Perceiving the meaning of an edge in the surface of support, either a falling-off edge or a stepping-down edge, seems to be a capability that animals develop. This is not abstract depth perception but affordance perception.