Academic journal article: Attention, Perception, & Psychophysics

Language Identification from Visual-Only Speech Signals


Our goal in the present study was to examine how observers identify English and Spanish from visual-only displays of speech. First, we replicated the recent findings of Soto-Faraco et al. (2007) with Spanish and English bilingual and monolingual observers using different languages and a different experimental paradigm (identification). We found that prior linguistic experience affected response bias but not sensitivity (Experiment 1). In two additional experiments, we investigated the visual cues that observers use to complete the language-identification task. The results of Experiment 2 indicate that some lexical information is available in the visual signal but that it is limited. Acoustic analyses confirmed that our Spanish and English stimuli differed acoustically with respect to linguistic rhythmic categories. In Experiment 3, we tested whether this rhythmic difference could be used by observers to identify the language when the visual stimuli are temporally reversed, thereby eliminating lexical information but retaining rhythmic differences. The participants performed above chance even in the backward condition, suggesting that the rhythmic differences between the two languages may aid language identification in visual-only speech signals. The results of Experiments 3A and 3B also confirm previous findings that increased stimulus length facilitates language identification. Taken together, the results of these three experiments replicate earlier findings and also show that prior linguistic experience, lexical information, rhythmic structure, and utterance length influence visual-only language identification.


A large body of research has demonstrated that speech perception is multimodal in nature. In addition to the auditory properties of speech, the visual signal carries important information about the phonetic structure of the message that affects the perception and comprehension of the speech signal (see, e.g., Massaro, 1987; Sumby & Pollack, 1954; Summerfield, 1987). The visual aspects of speech have been shown to both enhance and alter the perception of the auditory speech signal for listeners with hearing impairment, as well as for normal-hearing listeners (see, e.g., R. Campbell & Dodd, 1980; Hamilton, Shenton, & Coslett, 2006; Kaiser, Kirk, Lachs, & Pisoni, 2003; Lachs, 1999; Lachs, Weiss, & Pisoni, 2002; Summerfield, 1987).

In their seminal study of audiovisual speech perception, Sumby and Pollack (1954) showed that the addition of visual information dramatically improved speech intelligibility at less favorable signal-to-noise ratios in normal-hearing listeners. When presented with degraded auditory signals, observers experienced large gains in speech intelligibility in the auditory-visual conditions relative to the auditory-only conditions. The contribution of visual information to speech perception is also illustrated by the McGurk effect (McGurk & MacDonald, 1976), in which conflicting auditory and visual information alters perception. McGurk and MacDonald found that when observers were presented with mismatched auditory and visual information, they perceived a sound that was not present in either sensory modality. For example, a visual velar stop (/g/) paired with an auditory bilabial stop (/b/) was perceived as /d/. Thus, the information carried by the visual signal not only enhances speech perception, as was found by Sumby and Pollack, but can also alter the perception of auditory information, yielding a novel percept, as in the McGurk effect (see also Hamilton et al., 2006).

More recently, studies in the field of second language (L2) acquisition have shown that the addition of visual information aids in the acquisition and perception of nonnative contrasts. For example, Hardison (2003) examined the acquisition of the English /l/-/r/ contrast by native Japanese and Korean speakers who were trained using either auditory-only or auditory-visual signals. …
