Academic journal article Genetics

A Hierarchical Bayesian Model for Next-Generation Population Genomics

Article excerpt

ABSTRACT

The demography of populations and natural selection shape genetic variation across the genome, and understanding the genomic consequences of these evolutionary processes is a fundamental aim of population genetics. We have developed a hierarchical Bayesian model to quantify genome-wide population structure and identify candidate genetic regions affected by selection. This model improves on existing methods by accounting for stochastic sampling of sequences inherent in next-generation sequencing (with pooled or indexed individual samples) and by incorporating genetic distances among haplotypes in measures of genetic differentiation. Using simulations, we demonstrate that this model has a low false-positive rate for classifying neutral genetic regions as selected genes (i.e., φ_ST outliers), but can detect recent selective sweeps, particularly when genetic regions in multiple populations are affected by selection. Nonetheless, selection affecting just a single population was difficult to detect and resulted in a high false-negative rate under certain conditions. We applied the Bayesian model to two large sets of human population genetic data. We found evidence of widespread positive and balancing selection among worldwide human populations, including many genetic regions previously thought to be under selection. Additionally, we identified novel candidate genes for selection, several of which have been linked to human diseases. This model will facilitate the population genetic analysis of a wide range of organisms on the basis of next-generation sequence data.

(ProQuest: ... denotes formulae omitted.)

The distribution of genetic variants among populations is a fundamental attribute of evolutionary lineages. Population genetic diversity shapes contemporary functional diversity and future evolutionary dynamics and provides a record of past evolutionary and demographic processes. Methods to quantify genetic diversity among populations have a long history (Wright 1951; Holsinger and Weir 2009) and provide a basis to distinguish neutral and adaptive evolutionary histories, to reconstruct migration histories, and to identify genes underlying diseases and other significant traits (Bamshad and Wooding 2003; Tishkoff and Verrelli 2003; Voight et al. 2006; Barreiro et al. 2008; Lohmueller et al. 2008; Novembre et al. 2008; Tishkoff et al. 2009; Hohenlohe et al. 2010). For example, population genetic analyses in humans have resolved a history of natural selection and independent origins of lactase persistence in adults in Europe and East Africa (Tishkoff et al. 2007). Similarly, allelic diversity at the Duffy blood group locus is consistent with the action of natural selection, and one allele that has gone to fixation in sub-Saharan populations confers resistance to malaria (Hamblin and Di Rienzo 2000; Hamblin et al. 2002). These studies of natural selection, and many others involving a diversity of organisms, utilize contrasts between genetic differentiation at putatively selected loci and the remainder of the genome.

Genomic diversity within and among populations is determined primarily by mutation and neutral demographic factors, such as effective population size and rates of migration among populations (Wright 1951; Slatkin 1987). Specifically, these demographic factors determine the rates of genetic drift and population differentiation across the genome. In contrast, selection affects variation in specific regions of the genome, including the direct targets of selection and to a lesser extent genetic regions in linkage disequilibrium with these targets (Maynard-Smith and Haigh 1974; Slatkin and Wiehe 1998; Gillespie 2000; Stajich and Hahn 2005). Thus, the genomic consequences of selection are superimposed on the genomic outcomes of neutral genetic differentiation and the two must be disentangled to identify regions of the genome affected by selection.
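
The contrast between locus-specific and genome-wide differentiation can be illustrated with a minimal sketch. It is not the hierarchical Bayesian model described in this article: it swaps in the classical heterozygosity-based F_ST for a single biallelic locus in two equally weighted populations, treats allele frequencies as known without sampling error, and flags loci in the tails of the empirical distribution. The function names, the locus labels, and the allele frequencies are all hypothetical.

```python
def fst_two_pops(p1: float, p2: float) -> float:
    """Classical F_ST = (H_T - H_S) / H_T for one biallelic locus.

    p1 and p2 are the frequencies of the same allele in two populations,
    assumed here to be known exactly and weighted equally.
    """
    p_bar = (p1 + p2) / 2
    h_t = 2 * p_bar * (1 - p_bar)  # expected heterozygosity, pooled
    if h_t == 0:
        return 0.0  # locus is monomorphic overall, so no differentiation
    h_s = (2 * p1 * (1 - p1) + 2 * p2 * (1 - p2)) / 2  # mean within-population
    return (h_t - h_s) / h_t


def empirical_outliers(fst_by_locus: dict, tail: float = 0.05) -> dict:
    """Return the loci in the lower and upper tails of the F_ST ranking."""
    ranked = sorted(fst_by_locus, key=fst_by_locus.get)
    k = max(1, int(len(ranked) * tail))  # always flag at least one locus per tail
    return {"low": ranked[:k], "high": ranked[-k:]}


# Hypothetical allele frequencies (population 1, population 2) per locus.
freqs = {
    "locus_a": (0.50, 0.52),
    "locus_b": (0.30, 0.35),
    "locus_c": (0.10, 0.90),  # strongly differentiated
    "locus_d": (0.40, 0.40),  # identical frequencies
}
fst = {name: fst_two_pops(p1, p2) for name, (p1, p2) in freqs.items()}
flagged = empirical_outliers(fst)
```

Unusually high differentiation relative to the rest of the genome is the usual signature sought for divergent selection, and unusually low differentiation for balancing selection. A ranking like this ignores the stochastic sampling of sequences that the article's model is designed to account for, so it should be read only as an illustration of the outlier logic.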

A variety of models and methods have been proposed to identify genetic regions that have been affected by selection (Nielsen 2005), and these population genetic methods can be divided into within-population and among-population analyses. …
