Academic journal article Genetics

A Bayesian Heterogeneous Analysis of Variance Approach to Inferring Recent Selective Sweeps

Article excerpt


The distribution of microsatellite allele sizes in populations aids in understanding the genetic diversity of species and the evolutionary history of recent selective sweeps. We propose a heterogeneous Bayesian analysis of variance model for inferring loci involved in recent selective sweeps by analyzing the distribution of allele sizes at multiple loci in multiple populations. Our model is shown to be consistent with a multilocus test statistic, ln RV, proposed for identifying microsatellite loci involved in recent selective sweeps. Our methodology differs in that it accepts original allele size data rather than summary statistics and allows the incorporation of prior knowledge about allele frequencies using a hierarchical prior distribution consisting of log normal and gamma probability distributions. Interesting features of the model are its ability to simultaneously analyze allele size data for any number of populations and to cope with the presence of any number of selected loci. The utility of the method is illustrated by application to two sets of microsatellite allele size data for a group of West African Anopheles gambiae populations. The results are consistent with the suppressed-recombination model of speciation, and additional candidate loci on chromosomes 2 (079 and 175) and 3 (088) are discovered that escaped former analysis.

(ProQuest Information and Learning: ... denotes formulae omitted.)

UNDERSTANDING which regions of the genome have been acted on by selection facilitates our understanding of the genetic basis of species-specific differences and allows us to identify genomic regions of functional and medical importance. Over the last few decades, various approaches for identifying genes as targets of selection have been proposed. Some of these approaches require prior knowledge of the location and function of candidate genes, while other methods, such as QTL mapping, require prior knowledge of the phenotypic trait of adaptive relevance and its pattern of heredity (LANGE 1997).

With the availability of completely sequenced genomes and the advent of genomewide scanning, prior knowledge of a genomic region is no longer needed to infer whether it has been the target of selection (LUIKART 2003). A number of tests of neutrality have been proposed that are based purely on allelic distributions and levels of variability (NIELSEN 2001). These are based on variability at a single locus (EWENS 1972; TAJIMA 1989), allelic variability at multiple loci (LEWONTIN and KRAKAUER 1973; HUDSON et al. 1987; SCHLÖTTERER 2001), and comparisons of variability or divergence between different classes of mutations within a locus (MCDONALD and KREITMAN 1991; GOLDMAN and YANG 1994).

Tests of neutrality based on a single locus, such as Tajima's D (TAJIMA 1989), run into difficulties because a reduction of variance in allele size due to selection is hard to distinguish from a reduction due to a population bottleneck (SIMONSEN et al. 1995). Such tests run the risk of becoming tests of the equilibrium neutral population model rather than tests of selective neutrality. Tests of neutrality based on multiple loci, such as the HKA test (HUDSON et al. 1987) and the ln RV test (SCHLÖTTERER 2001), avoid these concerns. This is because neutral loci are all similarly affected by demography and evolutionary history, whereas the distribution of alleles at selected loci is affected differently and hence displays outlier patterns.
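The outlier logic behind a multilocus test can be illustrated with a minimal sketch. It assumes the ln RV statistic in its commonly described form: for each locus, the natural log of the ratio of the variances in repeat number between two populations, standardized across loci so that a locus with an unusually reduced variance in one population stands out. The function names, locus names, and allele sizes below are hypothetical and serve only as illustration; this is not the Bayesian model proposed in the article.

```python
import math
from statistics import mean, stdev, variance


def ln_rv(pop_a, pop_b):
    """Log ratio of sample variances in repeat number for one locus.

    pop_a, pop_b: lists of allele sizes (repeat counts) observed in
    two populations. Both variances must be non-zero.
    """
    return math.log(variance(pop_a) / variance(pop_b))


def standardized_ln_rv(loci):
    """Standardize ln RV across loci (z-scores).

    loci: dict mapping a locus name to a pair (alleles_pop_a, alleles_pop_b).
    Neutral loci share the same demographic history, so their ln RV values
    should cluster; a locus far from the rest is a candidate for selection.
    """
    raw = {name: ln_rv(a, b) for name, (a, b) in loci.items()}
    mu = mean(raw.values())
    sd = stdev(raw.values())
    return {name: (value - mu) / sd for name, value in raw.items()}


# Hypothetical allele-size data: three loci with similar variances in both
# populations, and one locus with sharply reduced variance in population A.
example_loci = {
    "locus_1": ([10, 11, 12, 13, 14], [10, 11, 12, 13, 14]),
    "locus_2": ([10, 11, 12, 13, 14], [10, 11, 12, 13, 15]),
    "locus_3": ([10, 11, 12, 13, 15], [10, 11, 12, 13, 14]),
    "swept": ([12, 12, 12, 12, 13], [10, 11, 12, 13, 14]),
}

z_scores = standardized_ln_rv(example_loci)
candidate = min(z_scores, key=z_scores.get)  # most negative z-score
```

In this toy data set the locus with reduced variance in population A receives the most negative standardized value. In practice a cutoff on the standardized values would be chosen from an assumed neutral distribution; that choice is left out of this sketch.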

Selected loci can be sought using a variety of natural genetic markers. Two common families of markers used for detecting selective sweeps are microsatellites and SNPs. Most research to date has been conducted using microsatellites, which, while less abundant than SNPs, have the benefit of being multiallelic markers and hence are highly informative (SCHLÖTTERER and WIEHE 1999). Microsatellites are tandem repeats of short DNA segments that are typically between 1 and 5 bp in length, and their alleles are defined by the number of DNA segment repeats that are present at a particular locus. …
