Academic journal article Journal of Visual Impairment & Blindness

Discrimination and Comprehension of Synthetic Speech by Students with Visual Impairments: The Case of Similar Acoustic Patterns

Article excerpt

Abstract: This study examined the perceptions held by sighted students and students with visual impairments of the intelligibility and comprehensibility of similar acoustic patterns produced by synthetic speech. It determined the types of errors the students made and compared the performance of the two groups on auditory discrimination and comprehension.


To communicate effectively, human speech must be intelligible enough to be comprehended by listeners (Beukelman & Yorkston, 1979). Today, advances in digital signal processing and computer science make it possible to produce high-quality synthesized speech from digitally encoded information in a systematic way. Text-to-speech systems, which convert text into speech (Dutoit, 1997), can render any text in an auditory form, and those trained on a corpus of natural recordings (corpus-based techniques) have the potential to achieve optimal speech quality (Fellbaum & Kouroupetroglou, 2008). During the past decade, research on enhancing the naturalness of synthetic speech has focused mainly on the efficient modeling of intonational contours in text-to-speech synthesis (Xydas & Kouroupetroglou, 2006).

The perception of synthetic speech is usually discussed in the literature in relation to intelligibility and comprehension. Intelligibility refers to the ability of listeners to identify individual words or smaller linguistic units correctly (Moody, Joost, & Rodman, 1987, cited in Reynolds & Jefferson, 1999). According to Ralston, Pisoni, Lively, Greene, and Mullennix (1989, cited in Koul & Clapsaddle, 2006), intelligibility is the listener's ability to recognize phonemes and words when they are presented in isolation. In contrast, comprehensibility involves the extraction of the underlying meaning from the acoustic signals of speech (Duffy & Pisoni, 1992). Kintsch and van Dijk (1978, cited in Koul, 2003) referred to comprehension as a process in which the listener constructs a coherent mental representation of the meaningful information contained in a linguistic message and relates this representation to previously or currently available information in memory. Comprehension of synthetic speech thus involves recognizing the stimuli presented and then performing higher-level processing to obtain meaning.
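As an aside, the single-word intelligibility measure described above — the proportion of isolated stimulus words a listener identifies correctly — can be sketched in code. This is only an illustrative scoring function under simple assumptions (exact, case-insensitive word matching); it is not the instrument or scoring protocol used in the studies cited here.

```python
def intelligibility_score(presented, responses):
    """Percentage of stimulus words the listener identified correctly.

    `presented` and `responses` are parallel lists of words; a response
    counts as correct only if it matches the stimulus word exactly,
    ignoring case. (Illustrative assumption: real studies may also
    accept phonetically equivalent spellings.)
    """
    if not presented:
        raise ValueError("no stimulus words were presented")
    correct = sum(p.lower() == r.lower() for p, r in zip(presented, responses))
    return 100.0 * correct / len(presented)
```

For example, a listener who identifies three of four presented words would score 75.0 on this measure; a closed response format would constrain `responses` to a fixed set of alternatives, while an open format would not.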

Several studies have investigated the intelligibility and comprehension of synthetic speech systems by people with no disabilities (Koul & Hanners, 1997; Mirenda & Beukelman, 1987, 1990). However, limited research is available on the perception of synthetic speech by individuals with visual impairments (see, for example, Hensil & Whittaker, 2000).

Koul and Hanners's (1997) study of single-word intelligibility found that the intelligibility of synthetic speech (produced by the well-known prototype synthesizer DECTalk) was lower than that of natural speech in both a closed and an open response format. A review of research on the perception of synthetic sentences revealed a pattern of results similar to that observed on single-word tasks (Koul, 2003).

The comprehension of sentences and narratives has been found to be faster and more accurate when materials are presented in natural speech instead of synthetic speech (Higginbotham, Drazek, Kowarsky, Scally, & Segal, 1994). Another type of research on the comprehension of synthetic speech has used online measures, such as response latencies, to assess the cognitive load placed on individuals by synthetic speech. In such studies, significant differences have been found in listeners' abilities to comprehend synthetic compared to natural speech (Reynolds & Jefferson, 1999). According to Reynolds and Jefferson (1999), all the children who participated in their study had significantly fewer difficulties comprehending natural speech than synthetic speech. It is likely that the information-processing systems of both younger and older children were adversely affected by the impoverished acoustic-phonetic signals provided by synthetic speech. …
