Infants' Listening in Multitalker Environments: Effect of the Number of Background Talkers
Newman, Rochelle S., Attention, Perception and Psychophysics
Infants are often spoken to in the presence of background sounds, including speech from other talkers. In the present study, we compared 5- and 8.5-month-olds' abilities to recognize their own names in the context of three different types of background speech: that of a single talker, multitalker babble, and that of a single talker played backward. Infants recognized their names at a 10-dB signal-to-noise ratio in the multiple-voice condition but not in the single-voice (nonreversed) condition, a pattern opposite to that of typical adult performance. Infants similarly failed to recognize their names when the background talker's voice was reversed, that is, unintelligible but with speech-like acoustic properties. These data suggest that infants may have difficulty segregating the components of different speech streams when those streams are acoustically too similar. Alternatively, infants' attention may be drawn to the time-varying acoustic properties associated with a single talker's speech, causing difficulties when a single talker is the competing sound.
Infants often find themselves being spoken to in the context of background sound, including speech from one or more other talkers. For example, van de Weijer (1998) recorded all of the language input to which a single child was exposed over the course of 3 weeks and reported that there were multiple people speaking simultaneously during most of the time that the infant was outside of the house (e.g., in daycare or during shopping trips). Golden and Frank (2000) measured signal-to-noise ratios (SNRs) in five occupied toddler classrooms and found that the background sound, which often consisted of speech from other children, was typically within 15 dB of the teacher's voice; moreover, during book reading time, the SNRs for different teachers averaged only 5-6 dB. These findings suggest that young children frequently are spoken to in multitalker environments (see also Manlove, Frank, & Vernon-Feagans, 2001).
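For readers less familiar with decibel notation, the SNR values cited above can be translated into plain ratios. The sketch below assumes the conventional definition of SNR as 20 times the base-10 logarithm of the ratio of root-mean-square (RMS) amplitudes; the function names are illustrative and are not drawn from the studies cited.

```python
import math


def snr_db(signal_rms, noise_rms):
    """SNR in decibels, given RMS amplitudes of the target and the background."""
    return 20.0 * math.log10(signal_rms / noise_rms)


def amplitude_ratio(snr_in_db):
    """How many times larger the target's RMS amplitude is than the background's."""
    return 10.0 ** (snr_in_db / 20.0)


def power_ratio(snr_in_db):
    """How many times more acoustic power the target carries than the background."""
    return 10.0 ** (snr_in_db / 10.0)


if __name__ == "__main__":
    # SNR levels mentioned in the text: book reading (about 5 dB),
    # the present study (10 dB), and typical classroom levels (within 15 dB).
    for level in (5, 10, 15):
        print(f"{level} dB: amplitude x{amplitude_ratio(level):.2f}, "
              f"power x{power_ratio(level):.2f}")
```

Under this definition, a 10-dB SNR means that the target speech carries ten times the power of the background, and an SNR of 5 dB corresponds to roughly three times the power.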
In order to learn language in these multitalker settings, infants must be able to separate one stream of speech, such as their caregiver's voice, from others. Although a great deal of research has focused on how adult listeners separate speech from multiple talkers (Broadbent, 1952; Brokx & Nooteboom, 1982; Cherry, 1953; Hirsh, 1950; Pollack & Pickett, 1958; Poulton, 1953; Spieth, Curtis, & Webster, 1954), there has been much less research on infants' ability to do so. Furthermore, the aspects of the signal that might make the task of separating a talker's voice from the background easier for infants have not been well studied.
Understanding the factors that affect infant performance in a multitalker environment provides information about both the acoustic cues on which infants rely in their day-to-day listening and infants' processing abilities. In order to understand what one voice says despite the presence of background speech, infants must perform a number of tasks. Understanding the limitations of these processes forms an underpinning to many theories on infant language learning.
In order to separate streams of speech, infants must first cope with energetic masking, the masking of one sound by another at the auditory periphery. Because the frequency ranges of multiple streams of speech overlap, the competing signals mask one another. For very young infants, poor spectral resolution could make this form of masking even more difficult, although several studies suggest that spectral resolution is essentially adult-like by the time an infant is 6 months of age (see, e.g., Abdala & Folsom, 1995; Olsho, 1985; Schneider, Morrongiello, & Trehub, 1990; Spetner & Olsho, 1990; cf. Werner & Bargones, 1992, for a review).
In addition to coping with energetic masking and resolving different frequency components, infants must segregate the two sources of sound, which involves analyzing a complex sound into its components and grouping acoustic properties that "belong together" (those that originate from a single source) to distinguish them from those that do not. …