The New Zealand Spoken Component of ICE: Some Methodological Challenges 1
New Zealand linguists have been involved over the last eight years in planning and collecting data for a number of different written and spoken corpora of New Zealand English. These include the Wellington Corpus of New Zealand English (WCNZE) with its one million word written and one million word spoken components, and the New Zealand contributions to the International Corpus of English (ICE) Project, which involved a total of one million words composed of representative extracts of written and spoken New Zealand English. 2 This paper describes some of the methodological problems encountered in collecting material for a spoken corpus of New Zealand English, including the issue of who counts as a speaker of New Zealand English, the problems of collecting data in particular categories, and the procedures put in place to process collected data.
The idea of collecting a Corpus of New Zealand English had been discussed by New Zealand linguists since the mid-1980s. A number of New Zealand linguists had been using corpora in their research into vocabulary (Kennedy, 1991; Bauer and Nation, 1993), and the expression of speech functions such as quantity (Kennedy, 1987), causation (Kennedy and Fang, 1992) and certainty (Holmes, 1982, 1983). They were very aware of the valuable resources which had been made available by the Brown Corpus of American English in the early 1960s, the LOB Corpus of British written English in 1978, and the LUND Corpus of British spoken English in 1980. In 1987, after much debate about design and methodology, linguists at Victoria University began collecting data for the Wellington Corpus of Written and Spoken New Zealand English. Hence, when Sidney Greenbaum proposed that an International Corpus of English should be gathered (1988), it seemed sensible to ensure that New Zealand linguists also collected material suitable for inclusion in that corpus.
The parameters of the International Corpus were debated and finally decided at international gatherings where it was not always possible for New Zealand linguists