Speech Interface Technology
Alex W. Stedmon* & Dr. Chris Baber
*Centre for Human Sciences, Defence Evaluation & Research Agency, UK.
This paper offers, by way of an introduction, an overview of some of the problems in defining stress and its effects on the production of speech in the development of Automatic Speech Recognition (ASR) systems. The main focus is the findings of a recent experiment which examined the cognitive basis of speech production using a paired-associates learning paradigm.
The impetus for research in this area comes from a belief that speech exists as an untapped medium for user interfaces. This is, in itself, a highly contentious issue, for, as Linde and Shively argue, speech is already an active or semi-active mechanism (Baber & Noyes, 1996). Taylor, on the other hand, argues for using "the potentially spare speech modality" (Baber & Noyes, 1996). Whatever the outcome of this debate, speech, like other aspects of human performance, is prone to alter in response to additional demands and stressors.
The concept of stress is highly problematic to define. A recent ESCA-NATO Workshop failed to provide a single standard definition; that it arrived at six definitions instead illustrates, as Cox states, how "elusive ... [and] ... poorly defined" stress can be (Murray et al., 1996). One of the Workshop definitions, which shall be used for the purposes of this paper, is "an effect on the production of speech (manifested along a range of dimensions), caused by exposure to a stressor" (Murray et al., 1996). As such, stress may incorporate such diverse aspects as acceleration, vibration, temperature, noise, workload, mental load, and even emotional aspects of operator performance. Each may arise in a variety of ways, with differing manifestations and effects, and, more generally, have a negative effect on