Academic journal article Genetics

Simple Penalties on Maximum-Likelihood Estimates of Genetic Parameters to Reduce Sampling Variation

Article excerpt

(ProQuest: ... denotes formulae omitted.)

Estimation of genetic parameters, i.e., partitioning of phenotypic variation into its causal components, is one of the fundamental tasks in quantitative genetics. For multiple characteristics of interest, this involves estimation of covariance matrices due to genetic, residual, and possibly other random effects. It is well known that such estimates can be subject to substantial sampling variation. This holds especially for analyses comprising more than a few traits, as the number of parameters to be estimated increases quadratically with the number of traits considered, unless the covariance matrices of interest have a special structure and can be modeled more parsimoniously. Indeed, a sobering but realistic view is that "Few datasets, whether from livestock, laboratory or natural populations, are of sufficient size to obtain useful estimates of many genetic parameters" (Hill 2010, p. 75). This not only emphasizes the importance of appropriate data, but also implies that a judicious choice of methodology for estimation, one that makes the most of the limited and precious records available, is paramount.
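
To illustrate the quadratic growth mentioned above: an unstructured symmetric covariance matrix for q traits has q(q + 1)/2 distinct elements. The short sketch below counts parameters under the assumption of two such matrices (genetic and residual); the function name and the default of two matrices are illustrative choices, not taken from the article.

```python
def n_covariance_parameters(n_traits, n_matrices=2):
    """Distinct elements in n_matrices unstructured symmetric covariance matrices."""
    per_matrix = n_traits * (n_traits + 1) // 2
    return n_matrices * per_matrix


# Assumed example trait counts, chosen only for illustration
for q in (2, 5, 10):
    print(q, n_covariance_parameters(q))
```

Going from 2 to 10 traits raises the count from 6 to 110 under these assumptions, while the number of records per animal grows only linearly.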

A measure of the quality of an estimator is its "loss," i.e., the deviation of the estimate from the true value. This is an aggregate of bias and sampling variation. We speak of improving an estimator if we can modify it so that the expected loss is lessened. In most cases, this involves reducing sampling variance at the expense of some bias: if the additional bias is small and the reduction in variance sufficiently large, the loss is reduced. In statistical parlance, "regularization" refers to the use of some kind of additional information in an analysis. This is often used to solve ill-posed problems or to prevent overfitting through some form of penalty for model complexity; see Bickel and Li (2006) for a review. There has been longstanding interest, dating back to Stein (1975) and earlier (James and Stein 1961), in regularized estimation of covariance matrices to reduce their loss. Recently, as estimation of higher-dimensional matrices has become more common, there has been a resurgence of interest (e.g., Bickel and Levina 2008; Warton 2008; Witten and Tibshirani 2009; Ye and Wang 2009; Rothman et al. 2010; Fisher and Sun 2011; Ledoit and Wolf 2012; Deng and Tsui 2013; Won et al. 2013). In particular, estimation encouraging sparsity is an active field of research for estimation of covariance matrices (e.g., Pourahmadi 2013) and in related areas, such as graphical models and structural equations.
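
The trade-off between bias and sampling variance can be made concrete with a toy scalar case. The sketch below is not from the article: it assumes a quadratic loss, for which expected loss equals squared bias plus sampling variance, and it considers a sample mean multiplied by a hypothetical shrinkage factor. The loss functions used for covariance matrices may well differ.

```python
def expected_quadratic_loss(shrink, true_mean, variance, n_records):
    """Expected squared error of shrink * (sample mean) as an estimate of true_mean."""
    # Shrinking pulls the estimate toward zero, which introduces bias
    bias = (shrink - 1.0) * true_mean
    # ... but scales the sampling variance of the mean by shrink squared
    sampling_variance = shrink ** 2 * variance / n_records
    return bias ** 2 + sampling_variance


# Hypothetical values chosen only for illustration
unshrunk = expected_quadratic_loss(1.0, true_mean=1.0, variance=4.0, n_records=4)
shrunk = expected_quadratic_loss(0.9, true_mean=1.0, variance=4.0, n_records=4)
print(unshrunk, shrunk)
```

With these made-up numbers the unbiased estimator has an expected loss of 1, while the mildly shrunken one has an expected loss of roughly 0.82: a small squared bias is more than offset by the drop in variance.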

Improving Estimates of Genetic Parameters

As emphasized above, quantitative genetic analyses require at least two covariance matrices to be estimated, namely due to additive genetic and residual effects. The partitioning of the total variation into its components creates substantial sampling correlations between them and tends to exacerbate the effects of sampling variation inherent in estimation of covariance matrices. However, most studies on regularization of multivariate analyses considered a single covariance matrix only and the literature on regularized estimates of more than one covariance matrix is sparse. In a classic article, Hayes and Hill (1981) proposed to modify estimates of the genetic covariance matrix (Σ_G) by shrinking the canonical eigenvalues of Σ_G and the phenotypic covariance matrix (Σ_P) toward their mean, a procedure described as "bending" the estimate of Σ_G toward that of Σ_P. The underlying rationale was that Σ_P, the sum of all the causal components, is typically estimated much more accurately than any of its components, so that bending would "borrow strength" from the estimate of Σ_P, while shrinking estimated eigenvalues toward their mean would counteract their known, systematic overdispersion. The authors demonstrated by simulation that use of "bent" estimates in constructing selection indexes could increase the achieved response to selection markedly, as these were closer to the population values than unmodified estimates and thus provided more appropriate estimates of index weights. …
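
The idea of shrinking canonical eigenvalues toward their mean can be sketched in a few lines of NumPy. This is a simplified illustration under stated assumptions, not the exact procedure of Hayes and Hill (1981): the canonical eigenvalues are taken to be the eigenvalues of Σ_P⁻¹Σ_G, the shrinkage factor `omega` and the function name `bend` are hypothetical, and the two input matrices are made-up numbers.

```python
import numpy as np


def bend(sigma_g, sigma_p, omega):
    """Shrink the canonical eigenvalues of sigma_g relative to sigma_p toward their mean.

    omega is a hypothetical shrinkage factor in [0, 1]: 0 leaves sigma_g
    unchanged, 1 sets every canonical eigenvalue to the mean.
    """
    # Cholesky factor of the phenotypic matrix: sigma_p = chol @ chol.T
    chol = np.linalg.cholesky(sigma_p)
    chol_inv = np.linalg.inv(chol)
    # Symmetric matrix with the same eigenvalues as inv(sigma_p) @ sigma_g
    m = chol_inv @ sigma_g @ chol_inv.T
    eigvals, eigvecs = np.linalg.eigh(m)
    # Regress each canonical eigenvalue toward the mean of all of them
    bent_vals = (1.0 - omega) * eigvals + omega * eigvals.mean()
    # Transform back to the original scale
    back = chol @ eigvecs
    return back @ np.diag(bent_vals) @ back.T, eigvals, bent_vals


# Made-up 3-trait matrices; the numbers are illustrative only
sigma_p = np.array([[10.0, 3.0, 2.0],
                    [3.0, 8.0, 1.0],
                    [2.0, 1.0, 6.0]])
sigma_g = np.array([[4.0, 2.5, 1.0],
                    [2.5, 2.0, 0.2],
                    [1.0, 0.2, 3.0]])

bent_g, lam, lam_bent = bend(sigma_g, sigma_p, omega=0.3)
print("canonical eigenvalues:", lam)
print("after bending:        ", lam_bent)
```

In this formulation the bent matrix equals (1 - omega) times Σ_G plus omega times the mean canonical eigenvalue times Σ_P, which is one way to see why the result is described as bending the estimate of Σ_G toward that of Σ_P. The mean of the eigenvalues is unchanged; only their spread is reduced.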
