Academic journal article Genetics

Maximum-Likelihood Estimation of Allelic Dropout and False Allele Error Rates from Microsatellite Genotypes in the Absence of Reference Data

Article excerpt

ABSTRACT

The importance of quantifying and accounting for stochastic genotyping errors when analyzing microsatellite data is increasingly being recognized. This awareness is motivating the development of data analysis methods that not only take errors into consideration but also recognize the difference between two distinct classes of error, allelic dropout and false alleles. Currently, methods to estimate rates of allelic dropout and false alleles depend upon the availability of error-free reference genotypes or reliable pedigree data, which are often not available. We have developed a maximum-likelihood-based method for estimating these error rates from a single replication of a sample of genotypes. Simulations show it to be both accurate and robust to modest violations of its underlying assumptions. We have applied the method to estimating error rates in two microsatellite data sets. It is implemented in a computer program, Pedant, which estimates allelic dropout and false allele error rates with 95% confidence regions from microsatellite genotype data and performs power analysis. Pedant is freely available at http://www.stats.gla.ac.uk/~paulj/pedant.html.

(ProQuest-CSA LLC: ... denotes formulae omitted.)

THE importance of quantifying and accounting for stochastic genotyping errors in microsatellite-based studies is becoming ever more widely recognized. Undetected errors can impair inference across a range of fields, including forensics, genetic epidemiology, kinship analysis, and population genetics (POMPANON et al. 2005). All studies that have looked for genotyping errors have found them at appreciable levels (0.2-15% per locus; POMPANON et al. 2005). Even at low error rates, the frequency of erroneous genotypes increases rapidly with the number of marker loci assayed: from 1% in one locus, to 10% in 10 loci, to a potentially destructive 63% in 100 loci. Error-free microsatellite data sets must therefore be rare and will become rarer as improving laboratory methods allow increasing numbers of samples and markers to be assayed. Thus, although errors are most obviously harmful when genotyping highly error-prone noninvasive samples (GAGNEUX et al. 1997), they can frustrate analysis of the cleanest data, for example, in mapping genes that contribute to complex disease (FEAKES et al. 1999; WALTERS 2005). The consequences of undetected genotyping errors can be particularly adverse for parentage analysis, especially when using exclusion, where incompatibilities between candidate parents and offspring are used to exclude all but the true parent (GAGNEUX et al. 1997; JONES and ARDREN 2003). Even an error rate as low as 2% in a nine-locus data set can result in false exclusion of >20% of fathers (HOFFMAN and AMOS 2005).
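
The 1%, 10%, and 63% figures quoted above are what one obtains from a 1% per-locus error rate if errors are assumed to occur independently and at the same rate at every locus. The short Python sketch below makes that arithmetic explicit; the function name and the independence assumption are ours, for illustration only, and are not taken from the article.

```python
def prob_any_error(per_locus_rate: float, n_loci: int) -> float:
    """Probability that a multilocus genotype has at least one erroneous locus.

    Assumes (for illustration) that errors are independent across loci
    and that every locus has the same error rate.
    """
    # P(no error at any locus) = (1 - rate) ** n_loci; take the complement.
    return 1.0 - (1.0 - per_locus_rate) ** n_loci


if __name__ == "__main__":
    for n_loci in (1, 10, 100):
        share = prob_any_error(0.01, n_loci)
        print(f"{n_loci:>3} loci: {share:.1%} of genotypes contain an error")
```

Under these assumptions the share of erroneous multilocus genotypes is roughly 1% for one locus, 10% for 10 loci, and 63% for 100 loci, matching the values in the text.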

Given that genotyping errors cannot be eliminated with certainty, a more pragmatic approach is to minimize (PIGGOTT et al. 2004), quantify (BONIN et al. 2004; BROQUET and PETIT 2004; HOFFMAN and AMOS 2005), and integrate them in statistical analysis (MARSHALL et al. 1998; SOBEL et al. 2002; WANG 2004). Most studies quantify error rate as a single quantity, such as error rate per allele or per single-locus genotype. However, stochastic errors (as opposed to systematic errors, for example, null alleles) can be divided into two distinct classes: allelic dropout, where one allele of a heterozygote randomly fails to PCR amplify, and false alleles, where the true allele is misgenotyped because of factors such as PCR or electrophoresis artifacts or human errors in reading and recording data (BROQUET and PETIT 2004). These two classes of error can bias analyses in fundamentally different ways. For example, a high level of undetected allelic dropout could be misinterpreted as evidence for inbreeding, while false alleles can lead to substantial overestimation of census size (WAITS and LEBERG 2000; CREEL et al. 2003).
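
The distinction between the two error classes can be illustrated with a toy simulation. The sketch below is not the likelihood model implemented in Pedant (the formulae are omitted from this excerpt); the function `observe`, its parameters, and the choice to apply at most one false allele per genotype are hypothetical simplifications chosen only to show how the two errors act on a single-locus genotype.

```python
import random


def observe(genotype, dropout_rate, false_allele_rate, alleles, rng):
    """Return an observed genotype for a true genotype under a toy error model.

    Assumptions (illustrative only): dropout can affect heterozygotes only,
    and at most one allele copy per genotype is misread as a false allele.
    """
    a, b = genotype
    # Allelic dropout: one allele of a heterozygote fails to amplify,
    # so the genotype is read as homozygous for the surviving allele.
    if a != b and rng.random() < dropout_rate:
        a = b = rng.choice((a, b))
    # False allele: one randomly chosen allele copy is misread as a
    # different allele; homozygotes and heterozygotes are both exposed.
    if rng.random() < false_allele_rate:
        pair = [a, b]
        i = rng.randrange(2)
        pair[i] = rng.choice([x for x in alleles if x != pair[i]])
        a, b = pair
    return tuple(sorted((a, b)))


if __name__ == "__main__":
    rng = random.Random(1)
    alleles = [1, 2, 3, 4]
    # With dropout only, true heterozygotes are sometimes read as homozygotes,
    # while true homozygotes are never altered.
    hets = [observe((1, 2), 0.2, 0.0, alleles, rng) for _ in range(10_000)]
    homs = [observe((1, 1), 0.2, 0.0, alleles, rng) for _ in range(10_000)]
    print("heterozygotes read as homozygotes:", sum(g[0] == g[1] for g in hets))
    print("homozygotes altered by dropout:", sum(g != (1, 1) for g in homs))
```

In this sketch dropout inflates apparent homozygosity without introducing new alleles, whereas a false allele creates a genotype that may never have existed in the sample, which is consistent with the different biases described above.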

The essential difference between the effects of the two classes of error, as far as kinship inference is concerned, is that both homozygotes and heterozygotes potentially contain false alleles, but only homozygotes can be suspected of allelic dropout. …
