Academic journal article Genetics

Combining the Meiosis Gibbs Sampler with the Random Walk Approach for Linkage and Association Studies with a General Complex Pedigree and Multimarker Loci

Article excerpt

ABSTRACT

A linkage analysis for finding inheritance states and haplotype configurations is an essential process for linkage and association mapping. The linkage analysis is routinely based upon observed pedigree information and marker genotypes for individuals in the pedigree. Exact methods cannot feasibly use all such information for a large, complex pedigree, especially when many genotypic data are missing. Proposed Markov chain Monte Carlo approaches such as a single-site Gibbs sampler or the meiosis Gibbs sampler are able to handle a complex pedigree with sparse genotypic data; however, they often have reducibility problems, causing biased estimates. We present a combined method that applies the random walk approach to the reducible sites in the meiosis sampler. With this method, one can efficiently obtain reliable estimates such as identity-by-descent coefficients between individuals based on inheritance states or haplotype configurations, and a wider range of data can be used for mapping of quantitative trait loci within a reasonable time.

A linkage analysis can find patterns of inheritance states, genotype configurations, or haplotype configurations. These latent variables are essential for linkage mapping and association mapping. In linkage mapping, for example, the coefficients of sharing founder genes through segregation in a recorded pedigree can be estimated on the basis of the inheritance states [i.e., pedigree-based identity-by-descent (IBD) probabilities]. In association mapping, the coefficients of sharing genes from a common ancestor beyond the recorded pedigree can be estimated on the basis of the haplotype configurations [i.e., linkage disequilibrium (LD)-based IBD probabilities].
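As a minimal sketch of how IBD sharing follows from inheritance states, the Python example below drops unique founder allele labels through a toy pedigree according to a set of segregation indicators and counts the alleles two relatives share. The pedigree, the function names, and the 0/1 coding of the indicators are assumptions made for illustration. The example averages over all inheritance vectors with equal weight, which corresponds to having no marker data; an MCMC analysis would instead average over inheritance states sampled conditional on the observed genotypes.

```python
from itertools import product

# Hypothetical toy pedigree: two founders and their two offspring (full sibs).
# Founders map to None; non-founders map to (father, mother).
# Parents must be listed before their offspring.
PEDIGREE = {
    "father": None,
    "mother": None,
    "sib1": ("father", "mother"),
    "sib2": ("father", "mother"),
}

def drop_founder_labels(pedigree, seg):
    """Give each founder two unique allele labels and pass them down.

    seg[ind] = (s_pat, s_mat); each indicator is 0 if the parent transmitted
    its paternal allele and 1 if it transmitted its maternal allele.
    """
    labels = {}
    next_label = 0
    for ind, parents in pedigree.items():
        if parents is None:
            labels[ind] = (next_label, next_label + 1)
            next_label += 2
        else:
            father, mother = parents
            s_pat, s_mat = seg[ind]
            labels[ind] = (labels[father][s_pat], labels[mother][s_mat])
    return labels

def n_alleles_ibd(labels, a, b):
    """Number of alleles (0, 1, or 2) shared IBD; assumes a and b are not inbred."""
    return len(set(labels[a]) & set(labels[b]))

def expected_ibd_sharing(pedigree, a, b):
    """Average IBD sharing over all equally weighted inheritance vectors."""
    nonfounders = [i for i, p in pedigree.items() if p is not None]
    total = 0
    count = 0
    for bits in product((0, 1), repeat=2 * len(nonfounders)):
        seg = {ind: (bits[2 * k], bits[2 * k + 1])
               for k, ind in enumerate(nonfounders)}
        total += n_alleles_ibd(drop_founder_labels(pedigree, seg), a, b)
        count += 1
    return total / count

print(expected_ibd_sharing(PEDIGREE, "sib1", "sib2"))  # prints 1.0
```

Full sibs share one allele IBD on average, so the printed value matches the familiar expectation for this relationship.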

The linkage analysis is routinely based upon observed pedigree information and marker genotypes for individuals in the pedigree. This can cause difficulties in general pedigrees, because genotype probabilities are hard to derive when pedigrees are complex, especially when many genotypic data are missing. Exact methods for linkage analysis such as pedigree peeling (ELSTON and STEWART 1971; CANNINGS et al. 1978) or chromosome peeling (LANDER and GREEN 1987) increase exponentially in computational complexity with the number of markers or the number of pedigree members, respectively. In addition, having a number of individuals with missing genotypic data severely affects the computational task in deriving such probabilities.
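To make the growth with pedigree size concrete for the chromosome-peeling case, the short sketch below counts the inheritance vectors at a single locus, assuming two meioses (one paternal, one maternal) per non-founder and no symmetry reductions. The function name is hypothetical and the count is only an illustration of scale, not a cost model for any particular program.

```python
def n_inheritance_vectors(n_nonfounders):
    """Number of binary inheritance vectors at one locus.

    Assumes each non-founder contributes two meioses, each with two
    possible outcomes, and applies no founder-symmetry reduction.
    """
    return 2 ** (2 * n_nonfounders)

for n in (2, 5, 10, 20):
    print(n, n_inheritance_vectors(n))
```

Each additional non-founder multiplies the count by four, which is why exhaustive enumeration quickly becomes impractical for large pedigrees.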

Markov chain Monte Carlo (MCMC) algorithms are an alternative and flexible method to estimate genotype probabilities. In early MCMC, genotypic configurations or segregation indicators as latent variables are updated at each site, which makes it possible to deal with a large proportion of missing genotypic data in a complex general pedigree (LANGE and MATTHYSSE 1989; SHEEHAN et al. 1989; THOMPSON 1994), although reducible sites often occur in complex pedigree structures and mixing problems also appear in using multiple marker loci (THOMPSON and HEATH 1999; CANNINGS and SHEEHAN 2002). By updating segregation indicators jointly for all marker loci in a single meiosis, the meiosis Gibbs sampler (THOMPSON and HEATH 1999) greatly improves mixing of the Markov chain. However, noncommunicating classes are generated when founder allelic types are determined by direct or indirect observations, which makes the chain reducible (THOMPSON and HEATH 1999; HEATH 2003). The random walk approach suggested by SOBEL and LANGE (1996) remedied the reducibility problems by taking multiple moves of the random walk that allow the chain to pass through illegal configurations of segregation indicators on its way between legal configurations of segregation indicators. However, illegal or less likely configurations are often proposed, which are mostly rejected by a Metropolis mechanism; therefore, the computational efficiency of the random walk approach is much less than that of the meiosis Gibbs sampler, where updated variables are always accepted. …
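The reducibility problem and the multiple-move remedy can be illustrated with a deliberately tiny toy model. The sketch below is not the sampler of SOBEL and LANGE (1996) or the combined method of this article; the two-bit state space, the set of legal configurations, and all function names are assumptions chosen so that the two legal states differ at two sites. With a uniform target on the legal states and a symmetric proposal, the Metropolis rule reduces to accepting a candidate only if it is legal.

```python
import random

# Toy assumption: only these two configurations are legal, and they differ
# at both sites, so any single-site change lands on an illegal configuration.
LEGAL = {(0, 0), (1, 1)}

def propose(state, n_flips, rng):
    """Flip n_flips randomly chosen sites in sequence (symmetric proposal).

    Intermediate configurations may be illegal; only the end point is checked.
    """
    s = list(state)
    for _ in range(n_flips):
        i = rng.randrange(len(s))
        s[i] = 1 - s[i]
    return tuple(s)

def run_chain(n_flips, n_steps=1000, seed=0):
    """Run a Metropolis chain; return the visited states and the move count."""
    rng = random.Random(seed)
    state = (0, 0)
    visited = {state}
    n_moves = 0
    for _ in range(n_steps):
        cand = propose(state, n_flips, rng)
        # Uniform target on LEGAL: accept if and only if the candidate is legal.
        if cand in LEGAL:
            if cand != state:
                n_moves += 1
            state = cand
        visited.add(state)
    return visited, n_moves

single_visited, single_moves = run_chain(n_flips=1)
double_visited, double_moves = run_chain(n_flips=2)
print(sorted(single_visited), single_moves)
print(sorted(double_visited), double_moves)
```

With one flip per step the chain never leaves its starting state, because every candidate is illegal and is rejected. With two flips per step the chain can pass through an illegal intermediate configuration and reach the other legal state, at the price of some proposals that return to the current state and accomplish nothing.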
