Academic journal article Genetics

A Markov Chain Monte Carlo Approach for Joint Inference of Population Structure and Inbreeding Rates from Multilocus Genotype Data

Article excerpt

ABSTRACT

Nonrandom mating induces correlations in allelic states within and among loci that can be exploited to understand the genetic structure of natural populations (WRIGHT 1965). For many species, it is of considerable interest to quantify the contribution of two forms of nonrandom mating to patterns of standing genetic variation: inbreeding (mating among relatives) and population substructure (limited dispersal of gametes). Here, we extend the popular Bayesian clustering approach STRUCTURE (PRITCHARD et al. 2000) for simultaneous inference of inbreeding or selfing rates and population-of-origin classification using multilocus genetic markers. This is accomplished by eliminating the assumption of Hardy-Weinberg equilibrium within clusters and, instead, calculating expected genotype frequencies on the basis of inbreeding or selfing rates. We demonstrate the need for such an extension by showing that selfing leads to spurious signals of population substructure using the standard STRUCTURE algorithm with a bias toward spurious signals of admixture. We gauge the performance of our method using extensive coalescent simulations and demonstrate that our approach can correct for this bias. We also apply our approach to understanding the population structure of the wild relative of domesticated rice, Oryza rufipogon, an important partially selfing grass species. Using a sample of n = 16 individuals sequenced at 111 random loci, we find strong evidence for existence of two subpopulations, which correlates well with geographic location of sampling, and estimate selfing rates for both groups that are consistent with estimates from experimental data (s ≈ 0.48-0.70).

(ProQuest-CSA LLC: ... denotes formulae omitted.)

UNDERSTANDING the mating structure of natural populations is a major goal of population biology. Here we consider the problem of using genotype data from a sample of individuals to distinguish between two forms of nonrandom mating: inbreeding (mating among relatives) and population subdivision (limited dispersal of gametes). As Sewall Wright demonstrated, both of these evolutionary forces induce a correlation in allelic state among uniting gametes (i.e., autozygosity) (Wright 1931, 1965). Specifically, writing {A^sub i^, A^sub j^} to denote the outcome of inheriting alleles i and j at a particular locus of interest, Wright thought about the problem in terms of the correlation in state:

...

In a randomly mating population, the probability of inheriting a combination of alleles {A^sub i^, A^sub j^} is, by definition, given by the product of their marginal probabilities (i.e., p^sub ij^ = p^sub i^p^sub j^). Therefore, under random mating there is no correlation in allelic state among the genes inherited from the two parents.
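The product rule can be checked numerically with a minimal sketch. The function name `ordered_pair_probs`, the example allele frequencies, and the single inbreeding coefficient `F` are illustrative assumptions and do not come from the article. The sketch uses the standard textbook mixture in which the two alleles are identical by descent with probability F and drawn independently otherwise, so F = 0 recovers random mating.

```python
def ordered_pair_probs(p, F=0.0):
    """Probability of inheriting the ordered allele pair (A_i, A_j).

    p : list of allele frequencies that sum to 1 (hypothetical values)
    F : assumed inbreeding coefficient; F = 0 corresponds to random mating
    """
    k = len(p)
    probs = [[0.0] * k for _ in range(k)]
    for i in range(k):
        for j in range(k):
            # With probability 1 - F the two alleles are drawn independently.
            probs[i][j] = (1.0 - F) * p[i] * p[j]
            # With probability F they are identical by descent, which forces i == j.
            if i == j:
                probs[i][j] += F * p[i]
    return probs


p = [0.6, 0.3, 0.1]                      # hypothetical allele frequencies
random_mating = ordered_pair_probs(p)    # every entry equals p_i * p_j
inbred = ordered_pair_probs(p, F=0.25)   # homozygotes in excess of p_i * p_i
```

Under this assumed model the marginal allele frequencies are unchanged by F. Only the pairing of alleles within individuals shifts toward homozygotes.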

In a subdivided population with inbreeding, however, the correlation in allelic state, F^sub IT^, may be nonzero and is given by Wright's famous equation

F^sub IT^ = 1 - (1 - F^sub IS^)(1 - F^sub ST^), (1)

where F^sub IS^ is equivalent to the correlation in state conditional on subpopulation of origin, and F^sub ST^ is the correlation in state among randomly sampled alleles within subpopulations. The first is a measure of inbreeding and the second is a measure of population substructure. This equation demonstrates that the relative contributions of the two forces to deviations from random mating are of comparable magnitude and depend critically on the particular values of the parameters.
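As a worked illustration of Equation 1, the following sketch evaluates F^sub IT^ for a few parameter values. The function names and the numerical inputs are hypothetical. The conversion from a selfing rate s to an equilibrium inbreeding coefficient, F = s / (2 - s), is a standard textbook relationship for a partially selfing population and is assumed here; the excerpt does not state it.

```python
def f_it(f_is, f_st):
    """Total correlation in allelic state from Wright's equation (Equation 1)."""
    return 1.0 - (1.0 - f_is) * (1.0 - f_st)


def selfing_to_f_is(s):
    """Equilibrium inbreeding coefficient for selfing rate s.

    Assumption: the textbook relationship F = s / (2 - s), with the rest of
    the matings being random within the subpopulation.
    """
    return s / (2.0 - s)


# Hypothetical inputs: inbreeding alone, substructure alone, and both together.
only_inbreeding = f_it(0.3, 0.0)
only_structure = f_it(0.0, 0.2)
both = f_it(0.3, 0.2)

# A selfing rate inside the range quoted in the abstract, combined with the
# same hypothetical F_ST.
f_is_from_selfing = selfing_to_f_is(0.5)
combined = f_it(f_is_from_selfing, 0.2)
```

When either coefficient is zero, F^sub IT^ reduces to the other one, which is why a method that models only one of the two forces can misattribute a signal produced by the other.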

Although this phenomenon is appreciated by many population geneticists, many modern statistical approaches for analyzing genotype data ignore one of these two components. For example, methods for identifying population structure among a sample of individuals assume random mating within subpopulations (Pritchard et al. 2000; Dawson and Belkhir 2001; Corander et al. 2003; Falush et al. 2003). Likewise, methods for estimating self-fertilization rates from genotype data assume individuals are sampled from a single population (Ayres and Balding 1998; Enjalbert and David 2000) or require labor-intensive approaches such as progeny arrays (direct genotyping of offspring-mother pairs) (Ritland 2002). …
