PROSICE: A Spoken English Database for Prosody Research
MARK HUCKVALE and ALEX CHENGYU FANG
Prosody--the study of the intonation, stress, and rhythm of speech--is now assuming greater importance in phonetics, phonology, and speech technology than ever before. Once regarded as subservient to studies of segmental structure, it is now seen as providing the 'framework' which holds different levels of phonetic description together. The recent past has seen novel views of the phonology of intonation (e.g. Pierrehumbert, 1980), a new interest in prosodic phrase structure and prominence (e.g. Liberman and Prince, 1977), and the rise of autosegmental or non-linear accounts of phonetic description which integrate metrical structure with phonetic substance (e.g. Clements and Keyser, 1983). The role of prosody is also changing in speech synthesis and recognition. In speech synthesis, the success of concatenative systems--whereby recorded segments of speech are glued together to make novel utterances--has meant that the key issues have shifted from segmental to supra-segmental quality (Klatt, 1987). In speech recognition, the increasing emphasis on dialogue systems has meant that more research is taking place into the automatic determination of prosodic structure for the purposes of utterance disambiguation (e.g. Wightman and Ostendorf, 1995).
Contemporaneous with the development of prosody research has been the increasing influence of corpus-based research throughout speech technology and experimental phonetics. This has been driven by the huge appetite of current speech recognition research for large quantities of controlled recordings. As an example of this trend in prosody, segment durations in speech synthesis are now commonly predicted by means of a multiple regression analysis performed upon a database of transcribed speech (van Santen, 1993).
The combination of these two trends has created a demand for publicly available corpora of spoken recordings, both for scientific research into prosody and for its technological application. In this chapter we look at the requirements for such corpora and at existing corpora, and describe a new spoken English database with novel characteristics. Our database is derived from ICE-GB, the British component of the International Corpus of English (ICE).