The Implicit Genome

Synopsis

For over half a century, we have been in the thrall of the double-helical structure of DNA, which, in an instant, revealed that information can be transferred between generations by a simple rule: A pairs with T, G pairs with C. In its beautiful simplicity, this structure, along with the table of codons worked out in the following decade, has entranced us into believing that we can fully understand the information content of a DNA sequence simply by treating it as text that is read in a linear fashion. While we have learned much based on this assumption, there is much we have missed. Far from a passive tape running through a reader, genomes contain information that appears in new forms, which create regions with distinct behavior. Some are "gene rich", some mobile, some full of repeats and duplications, some sticking together across long evolutionary distances, some readily breaking apart in tumor cells. Even protein-coding regions can carry additional information, taking advantage of the flexible coding options provided by the degeneracy of the genetic code. The chapters in this volume touch on one or more of three interconnected themes: information can be implied, rather than explicit, in a genome; information can lead to focused and/or regulated changes in nucleotide sequences; information that affects the probability of distinct classes of mutation has implications for evolutionary theory.

Excerpt

The genome is a complex nuclear organelle whose function is to encode the information needed to maintain the living state as cells grow and divide, and as generations pass from parents to progeny. Much attention has focused on the DNA sequences that encode proteins, the workhorse molecules of biochemical metabolism and biological structures. However, these sequences account for only a fraction of the total human genome. Moreover, because of the degeneracy of the genetic code, there are many ways to encode a specific protein. Local DNA structure depends on sequence, as do the mechanical properties, such as bending and twisting flexibility. As a consequence, much more information is encoded in the genome than is accounted for by protein sequences. Deciphering the complex secondary code“s” has only begun.
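The claim that degeneracy allows many ways to encode a specific protein can be made concrete with a small count. The following is a minimal sketch, not taken from the book: it assumes the standard genetic code (written in the compact TCAG ordering), and the names `CODON_TABLE`, `DEGENERACY`, and `count_encodings` are illustrative only.

```python
from collections import Counter
from itertools import product
from math import prod

# Standard genetic code, codons enumerated with bases in T, C, A, G order
# (TTT, TTC, TTA, TTG, TCT, ...). '*' marks a stop codon.
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

# Number of synonymous codons per amino acid (e.g. 6 for leucine, 1 for methionine).
DEGENERACY = Counter(CODON_TABLE.values())


def count_encodings(peptide: str) -> int:
    """Count the distinct DNA sequences that encode `peptide` (one-letter codes)."""
    for aa in peptide:
        if aa not in DEGENERACY or aa == "*":
            raise ValueError(f"not a standard amino acid: {aa!r}")
    # Each position is chosen independently, so the counts multiply.
    return prod(DEGENERACY[aa] for aa in peptide)
```

For instance, `count_encodings("MKL")` multiplies 1 (Met) by 2 (Lys) by 6 (Leu). The count grows multiplicatively with peptide length, which is the room within which a coding region can carry additional, non-protein information.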

The B-DNA helix structure proposed by Watson and Crick (1–3) has been the dominant icon of molecular biology for half a century. Only purine–pyrimidine base pairs, A with T and G with C, are tolerated in the confines of the regular helical structure, shown in figure 1.1a, as refined from fiber diffraction studies. Inherent in the structure is the logic of its replication, since each strand can serve as the template for synthesis of its complement. Incorporation of deoxyribose in the alternating sugar-phosphate backbone confers polarity on the strands: each sugar has a phosphodiester linkage on its 5' and 3' oxygens. By convention, DNA sequences are written in the direction that corresponds to the order of biosynthesis, from the 5' end to the 3' end. A key feature of the structure is that the two strands in the duplex are antiparallel: the 5' end of one chain and the 3' end of the other chain are at the same terminus of the duplex. The simplicity of a uniform base-paired helix, with 10 bp per turn and a 3.4 Å rise per base pair, was a key element in the rapid acceptance of the structure and in the explosive growth of molecular biology that followed.
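The pairing rule, the 5'-to-3' writing convention, and the antiparallel arrangement together determine how one strand is read off from the other. The sketch below is illustrative rather than drawn from the book. The function names are hypothetical, and the geometry helpers simply apply the idealized values quoted above (10 bp per turn, 3.4 Å rise per base pair) to a uniform helix.

```python
# Watson-Crick pairing: A with T, G with C.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the partner strand of `seq`, written 5' to 3'.

    Because the strands are antiparallel, complementing each base gives
    the partner strand in 3'-to-5' order; reversing it restores the
    conventional 5'-to-3' orientation.
    """
    return seq.upper().translate(COMPLEMENT)[::-1]


def helix_length_angstrom(n_bp: int, rise: float = 3.4) -> float:
    """Axial length of an idealized uniform helix of `n_bp` base pairs."""
    return n_bp * rise


def helical_turns(n_bp: int, bp_per_turn: float = 10.0) -> float:
    """Number of full helical turns spanned by `n_bp` base pairs."""
    return n_bp / bp_per_turn
```

Applying `reverse_complement` twice returns the original sequence, mirroring the fact that each strand can serve as the template for its complement. The geometry helpers assume a perfectly regular helix and ignore sequence-dependent variation in local structure.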
