Academic journal article Genetics

Tools for Predicting the Functional Impact of Nonsynonymous Genetic Variation

Article excerpt

Advances in sequencing technologies are rapidly making whole-genome sequencing of germline or somatic DNA routinely available for prognostic and diagnostic purposes. During the past decade and more, millions of single-nucleotide variants have been identified as the most common type of genetic difference, both among individuals (International HapMap Consortium 2005; Cotton et al. 2008; Abecasis et al. 2012) and between different somatic cells within an individual (Cancer Genome Atlas Research Network 2008; Campbell et al. 2015). A single-nucleotide variant is a substitution of one DNA base pair for another and may fall within genes (either protein-coding or functional RNA genes), in gene regulatory regions, or in intergenic regions. Substitutions in the coding sequence of protein-encoding genes can be either synonymous (i.e., they encode the same amino acid due to redundancy/degeneracy in the genetic code and so have no effect on the protein product of a gene) or nonsynonymous (i.e., they change a single amino acid in the protein). Here, we focus specifically on nonsynonymous genetic variants (NSVs), of which there are on average ~3000 per individual genome (Abecasis et al. 2012).
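The distinction between synonymous and nonsynonymous substitutions can be illustrated with a minimal sketch that looks up the reference and mutant codons in the standard genetic code. The function name `classify_substitution`, its arguments, and the separate `stop_change` label for substitutions that create or remove a stop codon are assumptions made for illustration; they are not taken from the article or from any specific prediction tool.

```python
from itertools import product

# Standard genetic code (DNA alphabet). Codons are enumerated in TCAG order,
# with the third base varying fastest, and paired with one-letter amino acid
# codes; "*" marks a stop codon.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def classify_substitution(codon: str, position: int, new_base: str) -> str:
    """Classify a single-base substitution within one codon.

    Hypothetical helper: `position` is 0-based within the codon.
    Returns "synonymous", "nonsynonymous", or "stop_change" (an
    illustrative label for gain or loss of a stop codon).
    """
    codon = codon.upper()
    new_base = new_base.upper()
    if codon not in CODON_TABLE:
        raise ValueError(f"not a valid DNA codon: {codon!r}")
    if position not in (0, 1, 2):
        raise ValueError("position must be 0, 1, or 2")
    if new_base not in BASES or new_base == codon[position]:
        raise ValueError("new_base must be a different DNA base")

    mutant = codon[:position] + new_base + codon[position + 1:]
    ref_aa = CODON_TABLE[codon]
    alt_aa = CODON_TABLE[mutant]

    if ref_aa == alt_aa:
        return "synonymous"       # same amino acid (degeneracy of the code)
    if "*" in (ref_aa, alt_aa):
        return "stop_change"      # stop codon gained or lost
    return "nonsynonymous"        # a single amino acid is changed


if __name__ == "__main__":
    print(classify_substitution("GAA", 2, "G"))  # GAA (Glu) -> GAG (Glu)
    print(classify_substitution("GAG", 1, "T"))  # GAG (Glu) -> GTG (Val)
    print(classify_substitution("TGG", 2, "A"))  # TGG (Trp) -> TGA (stop)
```

A classification of this kind only identifies which variants are NSVs; it says nothing about whether a given NSV has any functional effect.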

Proteins, either alone or in complex with other cellular molecules, comprise molecular "machines" that function at the biochemical level. An NSV by definition changes the sequence of a protein. However, only a subset of NSVs have a damaging functional effect (i.e., affecting the biochemical activity or regulatory control of a protein), as proteins are large molecules and their structures can be quite robust to single-site mutations. Note that the term "damaging" does not necessarily imply an impairment of a protein's biochemical activity; in some cases an NSV that increases a protein's biochemical activity can have a negative effect on the protein's ability to properly serve one of its biological roles. In turn, some, but not all, damaging NSVs will be deleterious, meaning that they result in a phenotype at the organism level that is subject to natural selection (specifically, negative selection). Disease-causing, or pathogenic, NSVs obviously have a phenotypic effect, which may be subject to natural selection but is not necessarily so. Thus pathogenic NSVs are very often but not necessarily deleterious in the strict sense. Finally, most common (high frequency in a population) NSVs, and many if not most rare NSVs, have no appreciable deleterious or pathogenic effect and are called "neutral."

Thus, the challenge of NSV impact prediction can be stated simply as a needle-in-the-haystack problem: most NSVs carried by an individual are neutral, so we need ways to predict the relatively few NSVs that will, upon closer investigation, turn out to be deleterious or pathogenic. Of course, genetic variation outside of protein-coding regions can also have phenotypic consequences, and with projects such as ENCODE now generating hypotheses about potential regulatory regions of the human genome (Encode Project Consortium 2012), methods for identifying disease-relevant regulatory variants are currently a major focus. Nevertheless, because of the clear mechanism by which NSVs can impact biological function and therefore phenotype, NSV prioritization remains an active area of research in which improvements are still required to meet the demands of precision genomic medicine (Fernald et al. 2011; Shendure and Akey 2015).

Computational methodologies for predicting the impact of NSVs fall into four main categories: sequence conservation-based, structure analysis-based, combined (including both sequence and structure information), and meta-prediction (predictors that integrate results from multiple predictors) approaches (Figure 1). We first review the foundations of SNV prediction methods in protein sequence and structure analysis. We then discuss each of the categories of computational prediction method in more detail, describing the basic principles underlying each approach and the differences between specific computational tools that have been developed in each area. …
