Academic journal article Genetics

Characterizing Uncertainty in High-Density Maps from Multiparental Populations

Article excerpt

ABSTRACT Multiparental populations are of considerable interest in high-density genetic mapping due to their increased levels of polymorphism and recombination relative to biparental populations. However, errors in map construction can have significant impact on QTL discovery in later stages of analysis, and few methods have been developed to quantify the uncertainty attached to the reported order of markers or intermarker distances. Current methods are computationally intensive or limited to assessing uncertainty only for order or distance, but not both simultaneously. We derive the asymptotic joint distribution of maximum composite likelihood estimators for intermarker distances. This approach allows us to construct hypothesis tests and confidence intervals for simultaneously assessing marker-order instability and distance uncertainty. We investigate the effects of marker density, population size, and founder distribution patterns on map confidence in multiparental populations through simulations. Using these data, we provide guidelines on sample sizes necessary to map markers at sub-centimorgan densities with high certainty. We apply these approaches to data from a bread wheat Multiparent Advanced Generation Inter-Cross (MAGIC) population genotyped using the Illumina 9K SNP chip to assess regions of uncertainty and validate them against the recently released pseudomolecule for the wheat chromosome 3B.

(ProQuest: ... denotes formulae omitted.)

LINKAGE maps have been fundamental to genetic analysis for many years, both for gaining a better understanding of genomic structure and for utilizing that structure to gain power in mapping gene-trait associations. For humans and many other species, high-density consensus maps have been published and used across multiple mapping studies (Murray et al. 1994; Dietrich et al. 1996; Chowdhary and Raudsepp 2006; Bult et al. 2008; Cox et al. 2009; Wong et al. 2010). However, efforts to increase the saturation of genetic maps with high-throughput genotyping are still being made in many plant species (Poland et al. 2012; Ward et al. 2013; Wang et al. 2014).

Many approaches to genetic map estimation have been proposed and are reviewed, along with common challenges, in Cheema and Dicks (2009). Perhaps the most challenging step in map construction is ordering markers within a linkage group. Methods for ordering markers in biparental populations have been well studied and include techniques such as seriation (Buetow and Chakravarti 1987), ant colony optimization (Iwata and Ninomiya 2006), minimum spanning trees (Wu et al. 2008), rapid chain delineation (Nascimento et al. 2010), and simulated annealing (Van Ooijen 2011). These methods in turn form the basis of numerous map-construction software packages. The packages can be roughly divided into those relying on multipoint approaches, which incorporate information across the genome to maximize the likelihood of the map (MAPMAKER, Lander et al. 1987; CRI-MAP, Green et al. 1990; JoinMap, Stam 1993; R/qtl, Broman et al. 2003; CARTHAGENE, deGivry et al. 2005), and those relying on two-point approaches, which achieve much greater speed by using only pairwise recombination estimates (RECORD, van Os et al. 2005; OneMap, Margarido et al. 2007; MSTmap, Wu et al. 2008; Lep-MAP, Rastas et al. 2013; HighMap, Liu et al. 2014). The gain in accuracy from multipoint approaches must therefore be balanced against the accompanying computational burden.
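To make the idea of a two-point approach concrete, the following is a minimal sketch of a pairwise recombination estimate. It is not the method of any package cited above and not the composite likelihood estimator developed in this article. It assumes the simplest possible setting: a backcross-like biparental population in which each individual carries genotype code 0 or 1 at each marker, so that the recombination fraction is estimated by the proportion of individuals whose codes differ between the two markers. The function names, the 0/1 coding, and the use of `None` for missing calls are illustrative assumptions; the conversion to centimorgans uses the standard Haldane mapping function.

```python
import math


def recombination_fraction(marker_a, marker_b):
    """Two-point estimate of the recombination fraction between two markers.

    Assumes a backcross-like design (hypothetical 0/1 genotype coding,
    None for a missing call), where the estimate is simply the share of
    individuals whose genotype differs between the two markers.
    """
    pairs = [(a, b) for a, b in zip(marker_a, marker_b)
             if a is not None and b is not None]
    if not pairs:
        raise ValueError("no individuals genotyped at both markers")
    recombinants = sum(1 for a, b in pairs if a != b)
    # Cap at 0.5, the value expected for unlinked markers.
    return min(recombinants / len(pairs), 0.5)


def haldane_cm(r):
    """Convert a recombination fraction to map distance in centimorgans
    using the Haldane mapping function (assumes no crossover interference)."""
    if r >= 0.5:
        return math.inf
    return -50.0 * math.log(1.0 - 2.0 * r)


def pairwise_matrix(genotypes):
    """All pairwise recombination fractions for a list of markers.

    genotypes: one list of genotype codes per marker, all of equal length.
    A two-point ordering method would work from this matrix alone.
    """
    n_markers = len(genotypes)
    return [[recombination_fraction(genotypes[i], genotypes[j])
             for j in range(n_markers)]
            for i in range(n_markers)]


if __name__ == "__main__":
    m1 = [0, 0, 1, 1, 0, 1, 0, 1, 0, 0]
    m2 = [0, 0, 1, 1, 0, 1, 0, 1, 0, 1]  # one recombinant out of ten
    r = recombination_fraction(m1, m2)
    print(r, haldane_cm(r))
```

Because each entry of the matrix depends on only two markers, the cost grows with the number of marker pairs rather than with the number of possible orders, which is the source of the speed advantage described above. The price is that information shared across neighbouring markers is ignored.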

The recent increases in genotyping throughput have made high-density genetic maps increasingly valuable, both in fine-mapping trait associations and as anchors for physical maps in progress toward full sequence assembly. However, this increase has also resulted in a number of problems for map construction and ensuing analyses. First, the resolution and coverage of the map are limited by the number of individuals and design of the population. Second, the computational burden of mapping thousands of markers per chromosome limits analysis to the fast two-point methods (Wu et al. …
