Over the past twenty or so years, an approach to the study of language referred to as corpus linguistics has become widely accepted as an important and useful mode of linguistic inquiry. While corpora (large collections of computerised texts, usually carefully sampled in order to be representative of a particular language variety) were at first used mainly as aids to lexicography and pedagogy, they have more recently been deployed for a wider range of purposes. To illustrate, a sample of recent publications in linguistics includes Words and Phrases: Corpus Studies of Lexical Semantics (Stubbs 2001), Corpora in Applied Linguistics (Hunston 2002), Corpus Stylistics (Semino and Short 2004), Introducing Corpora in Translation Studies (Olohan 2004), Using Corpora in Discourse Analysis (Baker 2006), Corpora in Cognitive Linguistics: Corpus-Based Approaches to Syntax and Lexis (Gries 2006), Corpus-Based Approaches to Metaphor and Metonymy (Stefanowitsch and Gries 2006) and Corpus Linguistics Beyond the Word: Corpus Research from Phrase to Discourse (Fitzpatrick 2007). What readers might note from this list is the absence, to date, of a book which details a corpus-based approach to sociolinguistics. Such a pairing has not been completely ignored. In their early overview of the field, McEnery and Wilson (1996) have a short section on corpora and sociolinguistics, which mainly discusses what is possible rather than what has been done (at that point there was little to report), while Hunston (2002: 159–61) discusses how corpora can be used to describe sociolinguistic, diachronic and register variation. Additionally, Beeching (2006) has a short chapter on the ‘how’ and ‘why’ of sociolinguistic corpora in an edited collection by Wilson et al. These sections of books show that some form of ‘corpus sociolinguistics’ is possible, although it might appear that corpus linguistics has made only a relatively small impact on sociolinguistics.
The main question that this book seeks to answer is: how can corpus linguistics methods be used gainfully to aid sociolinguistic research? This book is therefore written for the sociolinguist who would like to know more about corpus techniques, and for the corpus linguist who wants to investigate sociolinguistic problems. Somewhere between these two imaginary researchers are readers who may have little experience of either corpora or sociolinguistics, or readers who may know quite a bit about both. The challenge when writing a book that combines two fields is to try to keep a potentially diverse audience interested without making too many