Academic journal article Genetics

Likelihood-Free Inference of Population Structure and Local Adaptation in a Bayesian Hierarchical Model

Article excerpt

ABSTRACT

We address the problem of finding evidence of natural selection from genetic data, accounting for the confounding effects of demographic history. In the absence of natural selection, gene genealogies should all be sampled from the same underlying distribution, often approximated by a coalescent model. Selection at a particular locus will lead to a modified genealogy, and this motivates a number of recent approaches for detecting the effects of natural selection in the genome as "outliers" under some models. The demographic history of a population affects the sampling distribution of genealogies, and therefore the observed genotypes and the classification of outliers. Since we cannot see genealogies directly, we have to infer them from the observed data under some model of mutation and demography. Thus the accuracy of an outlier-based approach depends to a greater or a lesser extent on the uncertainty about the demographic and mutational model. A natural modeling framework for this type of problem is provided by Bayesian hierarchical models, in which parameters, such as mutation rates and selection coefficients, are allowed to vary across loci. It has proved quite difficult computationally to implement fully probabilistic genealogical models with complex demographies, and this has motivated the development of approximations such as approximate Bayesian computation (ABC). In ABC the data are compressed into summary statistics, and computation of the likelihood function is replaced by simulation of data under the model. In a hierarchical setting one may be interested both in hyperparameters and parameters, and there may be very many of the latter; for example, in a genetic model, these may be parameters describing each of many loci or populations. This poses a problem for ABC in that one then requires summary statistics for each locus, which, if used naively, leads to a consequent difficulty in conditional density estimation.
We develop a general method for applying ABC to Bayesian hierarchical models, and we apply it to detect microsatellite loci influenced by local selection. We demonstrate using receiver operating characteristic (ROC) analysis that this approach has comparable performance to a full-likelihood method and outperforms it when mutation rates are variable across loci.
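
The basic ABC idea described above (compress the data into summary statistics and replace likelihood evaluation by simulation) can be illustrated with a minimal rejection-sampling sketch. This is not the hierarchical method developed in the article. The toy Normal model, the Uniform(-5, 5) prior, the sample-mean summary statistic, the tolerance, and all function names are assumptions chosen purely for illustration.

```python
import random
import statistics


def simulate(theta, n, rng):
    """Toy model (assumed for illustration): n draws from Normal(theta, 1)."""
    return [rng.gauss(theta, 1.0) for _ in range(n)]


def summary(data):
    """Compress a data set into one summary statistic, the sample mean."""
    return statistics.fmean(data)


def abc_rejection(observed, n_draws, tolerance, rng):
    """Keep prior draws whose simulated summary is within `tolerance` of the observed one."""
    s_obs = summary(observed)
    accepted = []
    for _ in range(n_draws):
        theta = rng.uniform(-5.0, 5.0)  # draw a candidate from the assumed prior
        s_sim = summary(simulate(theta, len(observed), rng))
        if abs(s_sim - s_obs) <= tolerance:  # no likelihood is ever evaluated
            accepted.append(theta)
    return accepted


rng = random.Random(1)
observed = simulate(2.0, 50, rng)  # stand-in for real data
posterior = abc_rejection(observed, n_draws=5000, tolerance=0.1, rng=rng)
print(len(posterior), statistics.fmean(posterior))
```

The accepted values of `theta` approximate a sample from the posterior given the summary statistic. With one parameter and one statistic this works well; with a separate parameter and statistic for each of many loci, the chance that all simulated statistics fall close to the observed ones at once becomes very small, which is the difficulty the abstract refers to.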

(ProQuest: ... denotes formulae omitted.)

THE study of the effects of natural selection at the genomic level has the potential to uncover hidden aspects of the causal pathways that relate genotype to phenotype and the environment (Sabeti et al. 2007). A challenge for any such research program is to distinguish signals of selection from those of a myriad other processes (McVean and Spencer 2006), particularly those related to the demographic history of the population. The study of individual candidate loci or regions of the genome, in isolation, and without regard to the (generally unknown) demographic history of the population is unlikely to be fruitful because selection can generally be mimicked by demographic processes (Teshima et al. 2006), and, indeed, this forms the basis of many methods of simulating loci under selection (Spencer and Coop 2004). As a consequence most recent studies concentrate on large-scale surveys of genomic regions, looking for genes that are discrepant (Teshima et al. 2006). Within this framework there are two broad strands. One set of approaches is based around the idea of a "selective sweep" in which an allele increases in frequency, as a result of either a single novel mutation or a change in environment, leading to reduced diversity at linked sites (Kaplan et al. 1989). Another modeling framework is centered around the idea of "local selection" (Charlesworth et al. 1997) in which alternative alleles are favored in different environments. Unlike the selective-sweep scenario where the time of onset of the sweep is an important parameter, the local selection framework is essentially ahistorical: the allele frequencies within a deme are typically modeled by assuming migration-selection-drift balance (Wright 1931; Petry 1983). …
