Academic journal article Attention, Perception and Psychophysics

The Impact of Attention Load on the Use of Statistical Information and Coarticulation as Speech Segmentation Cues

Article excerpt

In two artificial language learning experiments, we investigated the impact of attention load on segmenting speech through two sublexical cues: transitional probabilities (TPs) and coarticulation. In Experiment 1, we observed that coarticulation processing was resilient to high attention load, whereas TP computation was penalized in a graded manner. In Experiment 2, we showed that encouraging participants to actively search for "word" candidates enhanced overall performance but was not sufficient to preclude the impairment of statistically driven segmentation by attention load. As long as attentional resources were depleted, independently of their intention to find these "words," participants segmented only TP words with the highest TPs, not TP words with lower TPs. Attention load thus has a graded and differential impact on the relative weighting of the cues in speech segmentation, even when only sublexical cues are available in the signal.

Speech is a continuous stream, with few reliable cues about word boundaries (see, e.g., Klatt, 1980; Liberman & Studdert-Kennedy, 1978). Two major mechanisms may help solve the speech segmentation problem. First, multiple word candidates are activated by the input and compete with each other (e.g., McQueen, Norris, & Cutler, 1994; Norris, McQueen, & Cutler, 1995). Second, listeners exploit multiple sublexical cues probabilistically associated with word boundaries, such as subsegmental information (e.g., degree of coarticulation: Fernandes, Ventura, & Kolinsky, 2007; Mattys, 2004; Mattys, White, & Melhorn, 2005), segmental information such as transitional probabilities (TPs) between adjacent syllables, where dips in TPs are treated as likely word boundaries (e.g., Saffran, Aslin, & Newport, 1996; Saffran, Newport, & Aslin, 1996), and suprasegmental information (e.g., lexical stress; Mattys, 2004). These sublexical cues, first reported as potential sources of noise (Christiansen & Allen, 1997), were later shown to assist language acquisition (e.g., Johnson & Jusczyk, 2001; Thiessen & Saffran, 2003) and to modulate lexical activation in adulthood (e.g., Davis, Marslen-Wilson, & Gaskell, 2002; Gow & Gordon, 1995; Salverda et al., 2007).
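To make the TP cue concrete, the following is a minimal sketch of how dips in forward transitional probability could mark word boundaries in a syllable stream. It assumes the common definition TP(X→Y) = frequency(XY) / frequency(X). The three nonsense words, the word order, and the local-minimum rule are illustrative assumptions, not the materials or the procedure used in the experiments reported here.

```python
from collections import Counter

def transitional_probabilities(syllables):
    """Forward TP for each adjacent pair: count(x then y) / count(x as first member)."""
    pair_counts = Counter(zip(syllables, syllables[1:]))
    first_counts = Counter(syllables[:-1])  # the final syllable has no successor
    return {pair: n / first_counts[pair[0]] for pair, n in pair_counts.items()}

def segment_at_dips(syllables, tps):
    """Place a boundary wherever a TP is strictly lower than both neighboring TPs."""
    seq = [tps[(a, b)] for a, b in zip(syllables, syllables[1:])]
    words, current = [], [syllables[0]]
    for i, tp in enumerate(seq):
        left = seq[i - 1] if i > 0 else float("inf")
        right = seq[i + 1] if i + 1 < len(seq) else float("inf")
        if tp < left and tp < right:  # assumed rule: local minimum = boundary
            words.append("".join(current))
            current = []
        current.append(syllables[i + 1])
    words.append("".join(current))
    return words

# Hypothetical toy language: three trisyllabic nonsense words.
lexicon = {"A": ["mo", "ke", "ta"], "B": ["ri", "su", "no"], "C": ["la", "fi", "du"]}
order = "ABCACBABCACB"  # hypothetical word order with no immediate repetitions
stream = [syl for w in order for syl in lexicon[w]]

tps = transitional_probabilities(stream)
words = segment_at_dips(stream, tps)
```

In this toy stream every within-word transition has a TP of 1.0, whereas transitions across word boundaries have lower TPs, so the local-minimum rule recovers the word sequence. Real streams are noisier, and a graded rule such as a threshold on TP may be more appropriate.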

Mattys and colleagues (Mattys, 2004; Mattys et al., 2005) have proposed a hierarchical framework of the organization of sublexical and lexical sources of information that assist speech segmentation, in which the several types of cues are represented at three tiers. The first, top tier consists of high-level (e.g., lexical) information, which seems the most reliable information in intact listening conditions. The second, middle tier consists of segmental and subsegmental information, and the lowest tier corresponds to metrical prosody, which seems reliable in noisy listening conditions. Under this view, the involvement of any segmentation cue, either lexically driven or signal derived, is graded, since the differential weighting of the types of information available in the signal is modulated by listening conditions. This framework thus offers a much more ecological view of speech processing, because in daily communication the speech signal is often experienced under a processing load of some sort.

Recent work by Mattys, Brooks, and Cooke (2009) has shown, in addition, that the relative weighting of the available segmentation cues depends on the type of processing load, with these types being broadly categorized as perceptual load and cognitive load. Perceptual load is defined as any alteration to the signal leading to reduced acoustic integrity. Cognitive load is any load with effects arising from the recruitment of central, domain-general processing resources due to concurrent (attentional or mnemonic) processing, rather than from a distortion of the signal (see, e.g., Cooke, Lecumberri, & Barker, 2008). Mattys et al. (2009) proposed relating these concepts to prevalent constructs in psychophysics, specifically to the distinction between energetic masking and informational masking. …
