While the previous chapter examined the ways in which corpora of texts from different time periods can be compared in order to identify linguistic (and social) change, this chapter focuses on comparisons of corpus texts from the same time period (or thereabouts). I have tried to limit this chapter to studies which address the sorts of questions and topics that sociolinguists are likely to be interested in, although it should be noted that there is a wide range of corpus-based studies of synchronic variation which are not covered here, despite having some sort of sociolinguistic aspect to them. For example, I do not address studies of learner corpora or ‘translationese’ in this chapter. The former involve corpora which contain texts (normally essays) that have been written by learners of a particular language (often English). Such texts can usually be compared according to variables such as ‘years of learning’ or ‘first language’ (see Granger 1998), and could even be viewed as having a diachronic aspect to them. The latter involve studies of parallel corpora, normally where texts have been translated from one language into another (see Olohan 2004).
Instead, this chapter focuses on comparisons between corpora of different varieties of English, as well as looking at how such studies could be combined with analysis of diachronic or cultural change. Additionally, I examine a number of statistical techniques that have been suggested for comparing corpora of different varieties. This chapter focuses on differences between national varieties rather than regional varieties of English because, at the time of writing, most corpus building and research has focused on comparable corpora of national varieties. It is expected that in future, more comparable corpora of regional varieties will be available. The varieties discussed in this chapter therefore represent standard (or standardising) varieties of English as used in the particular country in which they were collected.
As noted in Chapter 3, the Brown family is not restricted to British and American corpora. The sampling model has been used (with some adaptations) to create English corpora for other nationalities: the Kolhapur Corpus of Indian English, the