Speech is a complex acoustic signal that contains both linguistic and paralinguistic information. Linguistic information includes segmental and suprasegmental as well as lexical, grammatical, and semantic features. Paralinguistic information contains extralinguistic cues that serve to identify speaker characteristics such as age, gender, voice quality, emotional state, and physical state (Abercrombie, 1967). Both sources of information (linguistic and paralinguistic) play a critical role in the perception of speech by listeners. The present study addressed the effect of extralinguistic information (speaker's gender) and prosodic information (word length) on the perception of speech by children.
EFFECT OF AGE ON SPEECH PERCEPTION
Many previous studies have examined the effect of age on various aspects of speech perception (e.g., acoustic cues, phonemes, words). Many of these studies compared the performance of young children to that of adults and reported that children are less sensitive to perceptual cues than adults (Elliott, 1986; Elliott, Longinotti, Meyer, Raz, & Zucker, 1981; Elliott & Hammer, 1988; Sussman & Carney, 1989; Sussman, 1993; Morrongiello, Robson, Best, & Clifton, 1984; Nittrouer & Studdert-Kennedy, 1987). For example, Ryalls and Pisoni (1997) investigated the effect of talker variability on word recognition. They found that children between the ages of 3 and 5 years were less accurate than adults, and that words produced by multiple speakers were identified less accurately than those spoken by a single speaker, regardless of whether the words were produced in quiet or in noise. Other studies compared speech perception across children of different ages and adults. For example, Hazan and Markham (2004) compared the performance of 7- to 8-year-olds with that of 11- to 12-year-olds and a group of adults. They found that the 7- to 8-year-olds made significantly more errors than the 11- to 12-year-olds and the adults in a word recognition task with background noise. Drager, Clark-Serpentine, Johnson, and Roeser (2006) investigated the perception of words and sentences in background noise by children aged 3 to 5 years. They reported that the 3-year-olds performed more poorly than the 4- and 5-year-olds. These researchers used synthesized words and sentences that were digitized using the speech of an 11-year-old female speaker.
Some researchers claim that the poorer performance of children in comparison to adults is due to the children's immature sensory processing in either the peripheral or central auditory system, for both speech and nonspeech auditory stimuli (Elliott, 1986; Elliott et al., 1981; Hall & Grose, 1991; Sussman & Carney, 1989). Other investigators claim that it is the inability of younger children to attend selectively to the task at hand that limits their performance (Allen, Wightman, Kistler, & Dolan, 1989; Morrongiello et al., 1984; Wightman, Allen, Dolan, Kistler, & Jamieson, 1989). Allen et al. (1989) concluded that processing efficiency (i.e., the ability to filter interfering noise), frequency resolution, and listening performance are abilities that improve with age. On this view, better listening performance is due to maturation of the central nervous system. In other words, the ability to allocate attention improves with increasing age.
Some studies have suggested that the type of task used determines the age at which children demonstrate adult-like speech perception performance. For example, when the task requires discrimination of temporal and frequency cues, performance does not become adult-like until the age of 10 to 11 years (Allen & Wightman, 1992; Sussman & Carney, 1989). However, when the task involves identification, adult-like performance for speech sounds occurs earlier (i.e., at about 6 years of age) (Sussman & Carney, 1989; Walley & Carrell, 1983).
EFFECT OF GENDER ON SPEECH INTELLIGIBILITY AND SPEECH PERCEPTION
Male and female glottal characteristics differ considerably (Hanson, 1995; Klatt & Klatt, 1990), and listeners are generally able to distinguish male from female voices quite easily (Nygaard, Sommer, & Pisoni, 1994; Tielen, 1992). Gender differences are also noted in patterns of speech production. Based on an analysis of 1,680 phonetically transcribed utterances produced by 168 U.S. English speakers in the TIMIT (Texas Instruments in Conjunction with the Massachusetts Institute of Technology) database, Byrd (1994) reported that male speech is characterized by a greater prevalence of phonological reduction phenomena than speech produced by females. These phenomena include, for example, vowel centralization, alveolar flapping, and a reduced frequency of stop releases. Thus, there is some evidence that gender is a salient characteristic that could affect overall intelligibility. In fact, Bradlow, Torretta, and Pisoni (1996), as well as Markham and Hazan (2004), found that female talkers received significantly higher overall intelligibility scores than male talkers. The intelligibility scores were awarded by a group of listeners who were asked to identify sentences that they heard. These results raised the question of what specific acoustic-phonetic characteristics led to this gender-based intelligibility difference.
Fundamental frequency. Fundamental frequency (F0) is a global speaker characteristic that typically differs markedly between males and females. The mean F0 of a speaker is related to the average pitch of that person's voice. However, it is not clear that it is an acoustic attribute that directly affects speech intelligibility. Bradlow et al. (1996) found a significantly greater F0 range for a group of female talkers (mean = 175 Hz) than for a corresponding group of male talkers (mean = 103 Hz). They claimed, however, that the higher F0 mean of women might be only one of the female speech characteristics that contribute to the generally higher intelligibility of female speech relative to male speech. Other characteristics to consider are vowel production, speaking rate, and loudness.
Vowel production. Adult female vocal tracts tend to be shorter than those of adult males. Furthermore, the pharynx takes up a greater proportion of overall vocal tract length in adult males than in adult females. Consequently, female formants (i.e., amplitude peaks in the frequency spectrum) tend to be higher in frequency.
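The link between vocal tract length and formant frequency can be illustrated with the textbook uniform-tube approximation, in which a tube closed at the glottis and open at the lips resonates at odd multiples of c/4L. The sketch below is offered only as an illustration and is not taken from any of the studies cited here. The tract lengths (17.5 cm and 15 cm) and the speed of sound (350 m/s) are assumed round values, and the function name `tube_formants` is hypothetical.

```python
def tube_formants(length_cm, n_formants=3, speed_of_sound_cm_s=35000.0):
    """Resonances (Hz) of a uniform tube closed at one end and open at the other.

    Uses the quarter-wavelength formula F_n = (2n - 1) * c / (4 * L).
    All default values are illustrative assumptions, not measured data.
    """
    return [
        (2 * n - 1) * speed_of_sound_cm_s / (4.0 * length_cm)
        for n in range(1, n_formants + 1)
    ]


# Assumed, illustrative tract lengths: a longer and a shorter vocal tract.
longer_tract = tube_formants(17.5)
shorter_tract = tube_formants(15.0)

for label, formants in (("17.5 cm", longer_tract), ("15.0 cm", shorter_tract)):
    print(label, [round(f) for f in formants])
```

Under these assumptions every resonance of the shorter tube lies above the corresponding resonance of the longer tube, which is the direction of the difference described above. The uniform tube ignores the difference in pharynx proportion, so it approximates only the overall length effect.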
Additionally, women typically produce more distinct vowels than men (Labov, 1972). This gender-based difference in the production of vowels has been demonstrated for English, Swedish, French, and Dutch speakers (Henton, 1995), as well as for Korean speakers (Yang, 1996). Furthermore, women lead vowel change by producing longer and clearer vowel variants (Jacewicz, Fox, & Salmons, 2006). Men, however, lead sound changes that further reduce the distance between vowels (Heffernan, 2007). Research on the relation between vowel space and speech intelligibility shows that talkers with larger vowel spaces were generally more intelligible than talkers with reduced spaces (Bradlow et al., 1996). The findings of Diehl, Lindblom, Hoemeke, and Fahey (1996) also …