Languages and Genes Attest Different Histories in Island Southeast Asia
Donohue, Mark, Denham, Tim, Oceanic Linguistics
Much work on "the Austronesians" has, either explicitly or implicitly, taken the view that the history reflected in the linguistic data largely corresponds to the histories revealed by the archaeological record and by the human genetic record, respectively (Bellwood 1995, 1997; Diamond 2000, 2001; see Donohue and Denham 2010 for review). Given that the linguistic record clearly indicates an out-of-Taiwan history, the archaeological and biological records are similarly assumed to favor an out-of-Taiwan model. (1)
Recent work by the HUGO Pan-Asian single-nucleotide polymorphism (SNP) consortium (2009) reveals a fascinating human genetic history for large parts of mainland and Island Southeast Asia, including modern Taiwan, the Philippines, Indonesia, and Malaysia. Autosomal DNA variation indicates primary directions of human movement from west to east and from south to north, movements that reflect the original colonization of Southeast Asia by modern humans during the Late Pleistocene (see, for example, Macaulay et al. 2005). (2) The genetics of human populations have undergone "substantial recent admixture" at a local level (HUGO 2009), but the overall pattern has seemingly remained largely undisrupted, despite major cultural and linguistic changes across Southeast Asia during the Holocene (cf., for example, Bellwood 1997). The mapping of other genetic traits for populations in Island Southeast Asia has produced variable results in terms of understanding the history of population movements during the Holocene (Oppenheimer and Richards 2001; Hill et al. 2007; Kayser et al. 2008; Soares et al. 2008, 2011), and also in terms of correspondences between genes and languages (Lansing et al. 2007; Donohue and Denham 2010).
Significantly, the directionality of language spread inferred from historical linguistics is almost consistently opposite to that inferred from the human genetic phylogeography (after Avise 2000) for Southeast Asia. Figure 1 shows a geographic reconstruction of the HUGO phylogeny (black), and the inferred spread of three major language families: Austronesian (red, Blust 2009); Tai-Kadai (Diller, Edmonson, and Luo 2005) and Sino-Tibetan (Matisoff 2003), both blue; and Austroasiatic (yellow, Sidwell 2009). (3) The genetic composition of human populations in Southeast Asia is largely unchanged since the Pleistocene despite the large-scale--and in Island Southeast Asia nearly wholesale--replacement of previous languages during the mid-late Holocene. In sum, the spread of contemporary language families was not associated with a significant transformation in the genetic composition of human populations across Island Southeast Asia, as has often been claimed (for example, Bellwood 1995; Diamond 2000, 2001). The generally contiguous distribution of linguistic subgroups in space correlates with the prehistoric population divergence of genetic markers (HUGO 2009:1544), and does not match the subgrouping structures predicted by the known language families in the region (for example, Blust 2009 on Austronesian and Sidwell 2009 on Austroasiatic).
[FIGURE 1 OMITTED]
In the HUGO results (summarized in figure 2 from HUGO 2009:1542), speakers of Austronesian languages on Taiwan are, in terms of human genetics, subgroups within a larger clade that contains some (but not all) speakers of other Austronesian languages. (4) Significantly, the Austronesian language speakers closest to the top of the "Austronesian" clade in figure 2 are in Malaysia, where they are close (in human genetic terms) to the neighboring "Austroasiatic" speaking samples. Elsewhere, we see Tai-Kadai not forming a distinct clade separate from other northern mainland groups (Chinese, Hmong, Japanese, Korean), and linguistically Dravidian genetic samples subgrouping with those from (Indo-European) Indic populations, but not with Indo-European populations from outside South Asia.
[FIGURE 2 OMITTED]
The HUGO results are noteworthy because, although they are based on specific scans, they were undertaken on large and robust sample populations. …