Academic journal article Genetics

Multivariate Analysis of Genotype-Phenotype Association

Article excerpt

Studies of genotype-phenotype association are central to several branches of contemporary biology and biomedicine, but they suffer from serious conceptual and statistical problems. Most of these studies consist of a vast number of pairwise comparisons between single genetic loci and single phenotypic variables, typically leading, among other reasons, to very low fractions of phenotypic variance explained by genetic effects ["missing heritability" (Manolio et al. 2009; Eichler et al. 2010)]. Post hoc corrections for multiple testing can lead to a dramatic loss of statistical power and in fact violate standard rules of statistical inference. Biologically more important, most phenotypes are not determined by single alleles, but by the joint effects, both additive and nonadditive, of a number of alleles at multiple loci. With the advent of modern imaging and measurement technology, complex phenotypes, such as the vertebrate brain or cranium, are often represented by large numbers of variables. This further complicates the study of genotype-phenotype association by tremendously increasing the number of pairwise comparisons between genetic loci and phenotypic variables, and these variables may not be meaningful traits per se [for instance, in geometric morphometrics, voxel-based image analysis, and many behavioral studies (Bookstein 1991; Ashburner and Friston 2000; Mitteroecker and Gunz 2009; Houle et al. 2010)]. The genotype-phenotype associations we actually seek are between certain allele combinations from multiple loci and certain combinations of phenotypic variables that bear biological interpretation. The number of such pairs of "latent" allele combinations and phenotypes that underlie the observed genotype-phenotype association depends on the genetic-developmental system under study, but is typically smaller than the number of assessed loci and phenotypic variables (Hallgrimsson and Lieberman 2008; Martinez-Abadias et al. 2012).

Several methods have been suggested for such a multivariate mapping, including multiple and multivariate regression (Haley and Knott 1992; Jansen 1993; Hackett et al. 2001; de Los Campos et al. 2013), principal component regression (Wang and Abbott 2008), low-rank regression models (Zhu et al. 2014), partial least-squares regression (Bjørnstad et al. 2004; Bowman 2013), and canonical correlation analysis (Leamy et al. 1999; Ferreira and Purcell 2009). We present a multivariate analytic strategy, which we term multivariate genotype-phenotype mapping (MGP), that embraces and relates all of these methods and that circumvents several of the problems resulting from pairwise univariate mapping and from the multivariate analysis of the loci separately from the phenotypes. Our approach does not primarily aim for the detection and location of single loci segregating with a given phenotypic trait. Instead, we present an approach that identifies patterns of allelic variation that are maximally associated, in terms of effect size, with patterns of phenotypic variation. In this way, we gain insight into the multivariate structure of genotype-phenotype association, including its dimensionality and the clustering of genetic and phenotypic variables within this association: the genetic-developmental properties determining the evolvability of organisms (Wagner and Altenberg 1996; Hendrikse et al. 2007; Mitteroecker 2009; Pavlicev and Hansen 2011).

The Principle of Multivariate Genotype-Phenotype Mapping

Let there be p genetic loci and q phenotypic measurements scored for n specimens. Instead of assessing each of the pq pairwise genotype-phenotype associations, we seek a genetic effect, composed of the additive and nonadditive effects of multiple alleles, onto a phenotypic trait that is a composite of multiple measured phenotypic variables. As these genetic and phenotypic features are not directly measured, but perhaps present in the data, we refer to them as genetic and phenotypic "latent variables," LVG and LVP (Figure 1). …
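As a rough illustration of what a pair of latent variables can look like in practice, the following sketch uses two-block partial least squares, one of the existing methods named above, and not necessarily the MGP estimator itself. It takes the leading singular vectors of the p × q cross-covariance matrix between loci and phenotypes as weight vectors, so that the resulting genetic and phenotypic scores have maximal covariance under unit-length weights. The 0/1/2 allele-count coding, the simulated data, the purely additive effects, and all names (`two_block_pls`, `G`, `P`, `lv_g`, `lv_p`) are assumptions made for this example only.

```python
import numpy as np


def two_block_pls(G, P):
    """Leading latent-variable pair from the genotype-phenotype cross-covariance.

    G: (n, p) array of genotype scores (assumed here: 0/1/2 allele counts).
    P: (n, q) array of phenotypic measurements.
    Returns weights a (p,), b (q,), scores lv_g, lv_p (n,), singular values s.
    """
    Gc = G - G.mean(axis=0)            # center each locus
    Pc = P - P.mean(axis=0)            # center each phenotypic variable
    n = G.shape[0]
    C = Gc.T @ Pc / (n - 1)            # p x q cross-covariance matrix
    U, s, Vt = np.linalg.svd(C, full_matrices=False)
    a, b = U[:, 0], Vt[0]              # unit-length weight vectors
    lv_g = Gc @ a                      # genetic latent variable (scores)
    lv_p = Pc @ b                      # phenotypic latent variable (scores)
    return a, b, lv_g, lv_p, s


# Hypothetical simulated data: two loci jointly affect a composite of
# four phenotypic variables; the remaining loci are unrelated noise.
rng = np.random.default_rng(0)
n, p, q = 200, 6, 4
G = rng.integers(0, 3, size=(n, p)).astype(float)
true_a = np.array([1.0, 1.0, 0.0, 0.0, 0.0, 0.0]) / np.sqrt(2.0)
true_b = np.array([0.5, 0.5, 0.5, 0.5])
P = np.outer(G @ true_a, true_b) + rng.normal(scale=0.5, size=(n, q))

a, b, lv_g, lv_p, s = two_block_pls(G, P)
print("genetic weights:   ", np.round(a, 2))
print("phenotypic weights:", np.round(b, 2))
print("covariance of the first latent pair:", s[0])
```

In this sketch a single pair of latent variables replaces the pq = 24 pairwise comparisons, and the number of clearly nonzero singular values in `s` gives a crude indication of the dimensionality of the association. Nonadditive effects would require additional genotype coding, which is omitted here.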
