The features of perceived objects are processed in distinct neural pathways, which calls for mechanisms that integrate the distributed information into coherent representations (the binding problem). Recent studies of sequential effects have demonstrated feature binding not only in perception, but also across (visual) perception and action planning. We investigated whether comparable effects can be obtained in and across auditory perception and action. The results from two experiments revealed effects indicative of spontaneous integration of auditory features (pitch and loudness, pitch and location), as well as evidence for audio-manual stimulus-response integration. Even though integration takes place spontaneously, features related to task-relevant stimulus or response dimensions are more likely to be integrated. Moreover, integration seems to follow a temporal overlap principle: features coded close in time are more likely to be bound together. Taken together, the findings are consistent with the idea of episodic event files integrating perception and action plans.
The perceived features of visual (Zeki & Bartels, 1999) and auditory (Kaas & Hackett, 1999; Lee & Winer, 2005; Wessinger et al., 2001) objects are processed in distinct neural pathways, which calls for processes that integrate this distributed information into coherent representations. This so-called binding problem and the mechanisms solving it have been studied extensively in recent years (e.g., Allport, Tipper, & Chmiel, 1985; Hall, Pastore, Acker, & Huang, 2000; Hommel, 2004; Treisman & Gelade, 1980). One of the leading theories in this field, Treisman's feature integration theory (FIT), holds that primary visual features are processed in parallel and represented in separate feature maps. Through spatial selection via a master map of locations, an episodic representation is created: an object file, which is updated as the object changes and can be addressed by location (Kahneman, Treisman, & Gibbs, 1992; Treisman, 1990; Treisman & Gelade, 1980).
Hommel (1998, 2004, 2005) extended Treisman's object file concept to include not only stimulus features, but also response-related feature information. A number of studies have provided evidence for this extension. In these studies, participants carried out two responses in a row. First, they were cued by a response cue signaling the first response, which, however, was carried out only after a visual trigger stimulus was presented. After 1 sec, another visual stimulus appeared, and the participants had to perform a binary-choice response to one of its features. As was expected, main effects of stimulus feature repetition were obtained. But more interestingly, stimulus and response repetition effects interacted: Repeating a stimulus feature sped up reaction time (RT) only if the response also repeated, whereas stimulus feature repetition slowed down RT if the response alternated. Apparently, stimulus features were bound to response features, so that repeating one retrieved the other. This created conflict in partial repetition trials, that is, when the retrieved stimulus or response feature did not match the present one. Hence, facing a particular combination of stimulus and response features seems to create a multimodal event file (Hommel, 1998, 2004), which is retrieved if at least one of the features it includes is encountered again.
Existing theories of feature integration have been based largely on experiments using visual information, but it is reasonable to assume that feature integration takes place in auditory perception as well. The auditory system allows us to perceive events through the sounds they produce. Yet an acoustic event is commonly made up of several features, among them pitch, timbre, loudness, and spatial position. Numerous studies have investigated how these features are perceived individually; in everyday life, however, we do not perceive features in isolation but, rather, perceive coherent, integrated acoustic events. …