We tested whether processes that evoke agency interpretations and mental state attributions also lead to adoption of the actor's visuospatial perspective by the observer. Agency and mental state interpretations were manipulated by showing different film clips involving two triangles (the Frith-Happé animations). Participants made speeded spatial decisions while watching these films. The responses in the spatial task could be either the same or different when given from the perspective of the participant versus the perspective of one of the triangles. Reaction times were longer when the perspectives of the participants and triangles differed than when they were the same. This effect increased as the need to invoke agency interpretations in order to understand the films increased, and it increased for those films that had previously been shown to evoke mental state attributions. This demonstrates that processing of an agent's behavior co-occurs with perspective adoption, even in the case in which triangles are the actors.
A crucial part of human life involves social interactions. To react adequately in these situations, it is important to take into account the representation of the world held by an interaction partner, for example, to understand what further information would be needed in a conversation or to predict actions on the basis of the assumed state of the other.
It is therefore not surprising that people are generally willing to represent the situations of others (Frith & Frith, 2006) and do so even if this involves representing painful stimulation (Jackson, Meltzoff, & Decety, 2005). The ability to correctly represent what someone else knows requires that the visuospatial perspective (VSP) of the other be taken into account in order to understand what the other can or cannot know (Aichhorn, Perner, Kronbichler, Staffen, & Ladurner, 2006). This representation can then be used as a starting state for predicting how the other person feels or will act (Apperly, 2008).
Tversky and Hard (2009) showed that VSP taking occurs spontaneously (independently of task requirements) in the presence of humans. They asked participants to describe the spatial relationship of two objects in a picture ("in relation to the bottle, where is the book?"). In one experimental condition, a human was seated behind the two objects and faced the observer. Therefore, the book was to the right of the bottle from the observer's perspective but to the left of the bottle when seen from the perspective of the depicted person. One picture was taken while the male actor was reaching for the book, and another while the actor was looking at the book but not reaching for it. The final picture showed the same situation without a human. When the pictures contained a human, observers often spontaneously described the location of the book from the point of view of the depicted person. This tendency increased further when the word "placed" was added to the question ("in relation to the bottle, where is the book placed?"), which, according to the authors, drew attention to the action. These results were interpreted as showing that the participants spontaneously took the perspective of the depicted person to make sense of the situation.
Another demonstration of VSP taking in the presence of humans can be found in the study of Thomas, Press, and Haggard (2006). In that experiment, participants faced either a human model or an object (a house). The participants' task was to report a tactile cue that could be in either the anatomically same or a different position with respect to a visual cue presented on the human model or object. For example, a tactile cue to the participant's right arm could follow a visual cue on the model's right arm (anatomically the same) or on the model's left arm (anatomically different, but on the same side as seen from the participant's perspective). In the human model condition, the participants were faster for anatomically same than for anatomically different tactile-visual conditions, demonstrating that the perspective of the model played a role when the visual stimuli were coded. …