Academic journal article Genetics

Genomic-Assisted Prediction of Genetic Value with Semiparametric Procedures

Article excerpt

ABSTRACT

Semiparametric procedures for prediction of total genetic value for quantitative traits, which make use of phenotypic and genomic data simultaneously, are presented. The methods focus on the treatment of massive information provided by, e.g., single-nucleotide polymorphisms. It is argued that standard parametric methods for quantitative genetic analysis cannot handle the multiplicity of potential interactions arising in models with, e.g., hundreds of thousands of markers, and that most of the assumptions required for an orthogonal decomposition of variance are violated in artificial and natural populations. This makes nonparametric procedures attractive. Kernel regression and reproducing kernel Hilbert spaces regression procedures are embedded into standard mixed-effects linear models, retaining additive genetic effects under multivariate normality for operational reasons. Inferential procedures are presented, and some extensions are suggested. An example is presented, illustrating the potential of the methodology. Implementations can be carried out after modification of standard software developed by animal breeders for likelihood-based or Bayesian analysis.

(ProQuest Information and Learning: ... denotes formulae omitted.)

MASSIVE quantities of genomic data are now available, with potential for enhancing accuracy of prediction of genetic value of, e.g., candidates for selection in animal and plant breeding programs or for molecular classification of disease status in subjects (GOLUB et al. 1999). For instance, WONG et al. (2004) reported a genetic variation map of the chicken genome containing 2.8 million single-nucleotide polymorphisms (SNPs) and demonstrated how the information can be used for targeting specific genomic regions. Likewise, HAYES et al. (2004) found 2507 putative SNPs in the salmon genome that could be valuable for marker-assisted selection in this species.

The use of molecular markers as aids in genetic selection programs has been discussed extensively. Important early articles are SOLLER and BECKMANN (1982) and FERNANDO and GROSSMAN (1989), with the latter focusing on best linear unbiased prediction of genetic value when marker information is used. Most of the literature on marker-assisted selection deals with the problem of locating one or few quantitative trait loci (QTL) using flanking markers. However, in the light of current knowledge about genomics, the widely used single-QTL search approach is naive, since there is evidence of abundant QTL affecting complex traits, as discussed, e.g., by DEKKERS and HOSPITAL (2002). This would support the infinitesimal model of FISHER (1918) as a sensible statistical specification for many quantitative traits, with complications being the accommodation of nonadditivity and of feedbacks (GIANOLA and SORENSEN 2004). DEKKERS and HOSPITAL (2002) observe that existing statistical methods for marker-assisted selection do not deal well with complexity posed by quantitative traits. Some difficulties are: specification of "statistical significance" thresholds for multiple testing, strong dependence of inferences on model chosen (e.g., number of QTL fitted, distributional forms), inadequate handling of nonadditivity, and ambiguous interpretation of effects in multiple-marker analysis, due to collinearity.

Here, we discuss how large-scale molecular information, such as that conveyed by SNPs, can be employed for marker-assisted prediction of genetic value for quantitative traits in the sense of, e.g., MEUWISSEN et al. (2001), GIANOLA et al. (2003), and XU (2003). The focus is on inference of genetic value, rather than detection of quantitative trait loci. A main challenge is that of positing a functional form relating phenotypes to SNP genotypes (viewed as thousands of possibly highly collinear covariates), to polygenic additive genetic values, and to other nuisance effects, such as sex or age of an individual, simultaneously.
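As a rough illustration of predicting genetic value from SNP genotypes without positing an explicit functional form, the sketch below fits a reproducing kernel Hilbert spaces regression in its simplest (kernel ridge) form. It is not the semiparametric mixed model of the article: the polygenic additive effects and the nuisance effects are left out, and the Gaussian kernel, the bandwidth, the smoothing parameter `lam`, the 0/1/2 genotype coding, the simulated data, and all function names are assumptions made only for this example.

```python
import numpy as np

def gaussian_kernel(X1, X2, bandwidth):
    """Gaussian kernel on squared Euclidean distance between genotype rows."""
    sq = (X1**2).sum(axis=1)[:, None] + (X2**2).sum(axis=1)[None, :] - 2.0 * X1 @ X2.T
    return np.exp(-np.maximum(sq, 0.0) / bandwidth)

def fit_rkhs(X, y, bandwidth, lam):
    """Solve (K + lam * I) alpha = y - mean(y) for the kernel coefficients."""
    K = gaussian_kernel(X, X, bandwidth)
    mu = y.mean()
    alpha = np.linalg.solve(K + lam * np.eye(len(y)), y - mu)
    return mu, alpha

def predict_rkhs(X_new, X_train, mu, alpha, bandwidth):
    """Predicted value: overall mean plus a kernel-weighted sum over training individuals."""
    return mu + gaussian_kernel(X_new, X_train, bandwidth) @ alpha

# Simulated data (assumption): 200 individuals, 500 SNPs coded 0/1/2.
rng = np.random.default_rng(0)
n, p = 200, 500
X = rng.integers(0, 3, size=(n, p)).astype(float)

# Hypothetical signal: additive effects at 10 SNPs plus one pairwise interaction.
beta = rng.normal(size=10)
signal = X[:, :10] @ beta + 0.5 * X[:, 0] * X[:, 1]
y = signal + rng.normal(scale=1.0, size=n)

# Train on all but the last 5 individuals, then predict those 5.
bandwidth = float(p)  # assumption: scale the bandwidth with the number of markers
mu, alpha = fit_rkhs(X[:-5], y[:-5], bandwidth=bandwidth, lam=1.0)
y_hat = predict_rkhs(X[-5:], X[:-5], mu, alpha, bandwidth)
print(y_hat)
```

The number of unknowns in the linear system equals the number of individuals rather than the number of markers, which is what keeps the computation manageable when markers vastly outnumber phenotypic records. In practice the bandwidth and `lam` would have to be chosen from the data, for example by cross-validation.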

Standard quantitative genetics theory gives a mechanistic basis to the mixed-effects linear model, treated either from classical (SORENSEN and KENNEDY 1983; HENDERSON 1984) or from Bayesian (GIANOLA and FERNANDO 1986) perspectives. …
