nucleic acid, any of a group of organic substances found in the chromosomes of living cells and viruses that play a central role in the storage and replication of hereditary information and in the expression of this information through protein synthesis. In most organisms, nucleic acids occur in combination with proteins; the combined substances are called nucleoproteins. Nucleic acid molecules are complex chains of varying length. The two chief types of nucleic acids are DNA (deoxyribonucleic acid), which carries the hereditary information from generation to generation, and RNA (ribonucleic acid), which delivers the instructions coded in this information to the cell's protein manufacturing sites.
A substance that he called nuclein (now known as DNA) was isolated in 1869 by Friedrich Miescher, but it was only in the last half of the 20th century that research revealed its significance as the material of which the gene is composed, and thus its function as the chemical bearer of hereditary characteristics. RNA was first made by laboratory synthesis in 1955. In 1965 the nucleotide sequence of tRNA was determined, and in 1967 the synthesis of biologically active DNA was achieved. The amount of RNA varies from cell to cell, but the amount of DNA is normally constant for all typical cells of a given species of plant or animal, no matter what the size or function of that cell. The amount doubles as the chromosomes replicate themselves before cell division takes place (see mitosis); in the ovum and sperm the amount is half that in the body cells (see meiosis).
The chemical and physical properties of DNA suit it for both replication and transfer of information. Each DNA molecule is a long two-stranded chain. The strands are made up of subunits called nucleotides, each containing a sugar (deoxyribose), a phosphate group, and one of four nitrogenous bases, adenine, guanine, thymine, and cytosine, denoted A, G, T, and C, respectively. A given strand contains nucleotides bearing each of these four bases. The information carried by a given gene is coded in the sequence in which the nucleotides bearing different bases occur along the strand. These nucleotide sequences determine the sequences of amino acids in the polypeptide chain of the protein specified by that gene.
Between the genes, or coding loci, on the DNA of higher organisms, there are long portions of DNA, often referred to as "junk" DNA, that code for no proteins. Sometimes junk DNA occurs within a gene; when this occurs, the coding portions are called exons and the noncoding (junk) portions are called introns. Junk DNA makes up 97% of the DNA in the human genome. Little is known of its purpose.
In 1953 the molecular biologists J. D. Watson, an American, and F. H. Crick, an Englishman, proposed that the two DNA strands were coiled in a double helix. In this model each nucleotide subunit along one strand is bound to a nucleotide subunit on the other strand by hydrogen bonds between the base portions of the nucleotides. The fact that adenine bonds only with thymine (A-T) and guanine bonds only with cytosine (G-C) determines that the strands will be complementary, i.e., that for every adenine on one strand there will be a thymine on the other strand, and for every guanine a cytosine. It is the property of complementarity between strands that ensures that DNA can be replicated, i.e., that identical copies can be made in order to be transmitted to the next generation.
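The pairing rule can be illustrated with a minimal Python sketch. The names PAIRS and complement are hypothetical, chosen only for this illustration, and the sketch is a simplification: it treats a strand as a plain string of base letters and ignores the opposite chemical orientation of the two strands, which the text above does not discuss.

```python
# Base-pairing rules described above: A pairs with T, G pairs with C.
PAIRS = {"A": "T", "T": "A", "G": "C", "C": "G"}


def complement(strand: str) -> str:
    """Return the base-by-base complement of a DNA strand.

    Simplification: strand direction is ignored; bases are paired
    position by position in the order given.
    """
    return "".join(PAIRS[base] for base in strand)


original = "AGTCCGTA"
partner = complement(original)
print(partner)                          # TCAGGCAT
# Complementing the partner strand recovers the original sequence,
# which is why each strand can serve as a template for the other.
print(complement(partner) == original)  # True
```

Because each strand fully determines its partner, separating the two strands and rebuilding a partner for each yields two copies of the original pair.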
RNA and Protein Synthesis
In order to be expressed as protein, the genetic information must be carried to the protein-synthesizing machinery of the cell, which is in the cell's cytoplasm (see cell). One form of RNA mediates this process. RNA is similar to DNA, but contains the sugar ribose instead of deoxyribose and the base uracil (U) instead of thymine. To initiate the process of information transfer, one strand of the double-stranded DNA chain serves as a template for the synthesis of a single strand of RNA that is complementary to the DNA strand (e.g., the DNA sequence AGTC will specify an RNA sequence UCAG). This process is called transcription and is mediated by enzymes.
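The template rule for transcription can be sketched in a few lines of Python. The names DNA_TO_RNA and transcribe are hypothetical, and the sketch models only the base-pairing step, not the enzymes that carry it out; it pairs bases position by position and ignores strand direction.

```python
# Pairing between a DNA template base and the RNA base it specifies.
# RNA uses uracil (U) where DNA would use thymine (T).
DNA_TO_RNA = {"A": "U", "T": "A", "G": "C", "C": "G"}


def transcribe(template: str) -> str:
    """Return the RNA sequence complementary to a DNA template strand."""
    return "".join(DNA_TO_RNA[base] for base in template)


print(transcribe("AGTC"))  # UCAG, matching the example in the text
```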
The newly synthesized RNA, called messenger RNA, or mRNA, moves quickly to bodies in the cytoplasm called ribosomes, which are composed of two particles made of protein bound to ribosomal RNA, or rRNA. Each ribosome is the site of synthesis of a polypeptide chain. Several ribosomes attach to a single mRNA so that many polypeptide chains are synthesized from the same mRNA; each cluster of an mRNA and ribosomes is called a polyribosome or polysome. The nucleotide sequence of the mRNA is translated into the amino acid sequence of a protein by adaptor molecules composed of a third type of RNA called transfer RNA, or tRNA. There are many different species of tRNA, with each species binding one of 20 amino acids.
In protein synthesis, a nucleotide sequence along the mRNA does not specify an amino acid directly; rather, it specifies a particular species of tRNA. For example, in coding for the amino acid tyrosine, a nucleotide sequence of mRNA is complementary to a portion of a tyrosine-tRNA molecule. As each specified tRNA associates with its complementary site on the mRNA, the amino acid it carries is added onto the lengthening protein chain and the tRNA is released. When the protein chain is complete, it is released from the ribosome.
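The adaptor role of tRNA can be modeled roughly as a two-step lookup: the mRNA sequence selects a complementary tRNA, and the tRNA supplies the amino acid. The Python sketch below is an illustration under stated assumptions, not a description of the actual chemistry. The names RNA_PAIRS, TRNA_POOL, complementary_site, and elongate are hypothetical; the pool holds only four tRNA species; and the sketch assumes one tRNA species per three-base mRNA sequence, paired base by base, which is simpler than what happens in cells.

```python
# RNA-RNA base pairing used when a tRNA recognizes a stretch of mRNA.
RNA_PAIRS = {"A": "U", "U": "A", "G": "C", "C": "G"}


def complementary_site(triplet: str) -> str:
    """Return the tRNA bases complementary to an mRNA triplet (base by base)."""
    return "".join(RNA_PAIRS[base] for base in triplet)


# Hypothetical, deliberately tiny pool of tRNA species:
# recognition bases on the tRNA -> amino acid that tRNA carries.
TRNA_POOL = {
    complementary_site("AUG"): "Met",
    complementary_site("UAU"): "Tyr",
    complementary_site("UAC"): "Tyr",
    complementary_site("UUU"): "Phe",
}


def elongate(mrna: str) -> list[str]:
    """Build a chain by matching each mRNA triplet to its tRNA in turn."""
    chain = []
    for start in range(0, len(mrna) - 2, 3):
        triplet = mrna[start:start + 3]
        trna = complementary_site(triplet)  # the mRNA specifies a tRNA...
        chain.append(TRNA_POOL[trna])       # ...and the tRNA brings the amino acid
    return chain


print(elongate("AUGUAUUUUUAC"))  # ['Met', 'Tyr', 'Phe', 'Tyr']
```

The indirection is the point of the sketch: the mRNA never appears as a key for an amino acid, only as the means of selecting a tRNA.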
The particular sequence of amino acids in each polypeptide chain is determined by the genetic code. Starting at one end of the mRNA strand, each 3-nucleotide sequence, or codon, specifies, via complementary tRNA sequences, one amino acid, and the series of such codons in the mRNA specifies a polypeptide chain. Although a "vocabulary" of 64 words, or specifications, is theoretically possible with 4 different nucleotides taken three at a time, there are only 20 amino acids to be specified. However, several triplets may code for the same amino acid; for example, UAU and UAC both code for the amino acid tyrosine. In addition, some codons do not code for amino acids but signal polypeptide chain termination, and the codon AUG, which codes for methionine, also signals polypeptide chain initiation. The code is also nonoverlapping; i.e., a nucleotide in one codon is never part of either adjacent codon. The code seems to be nearly universal in living organisms.
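The arithmetic and the reading rules above can be checked with a short Python sketch. It is illustrative only: PARTIAL_CODE is a hypothetical name holding just eight of the 64 codons (their assignments are those of the standard genetic code), and translate is a simplified reader that starts at the first nucleotide rather than searching for an initiation signal.

```python
from itertools import product

# Four nucleotides taken three at a time give 4 * 4 * 4 = 64 possible codons.
BASES = "UCAG"
all_codons = ["".join(triplet) for triplet in product(BASES, repeat=3)]
print(len(all_codons))  # 64

# A partial codon table (8 of the 64 entries), enough to show that
# several codons can specify one amino acid and that some signal termination.
PARTIAL_CODE = {
    "AUG": "Met",
    "UAU": "Tyr", "UAC": "Tyr",
    "UUU": "Phe", "UUC": "Phe",
    "UAA": "STOP", "UAG": "STOP", "UGA": "STOP",
}


def translate(mrna: str) -> list[str]:
    """Read nonoverlapping codons from the first base until a stop codon."""
    chain = []
    for start in range(0, len(mrna) - 2, 3):  # step of 3: codons never overlap
        residue = PARTIAL_CODE[mrna[start:start + 3]]
        if residue == "STOP":
            break
        chain.append(residue)
    return chain


print(translate("AUGUAUUACUUUUAAUUU"))  # ['Met', 'Tyr', 'Tyr', 'Phe']
```

In the example, UAU and UAC both yield tyrosine, and the codon UAA ends the chain, so the final UUU is never read.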
The determination of the mechanism of protein synthesis has increased understanding of many genetic processes and permitted such developments as bioengineering. Some mutagens, or mutation-inducing agents, cause the substitution of one nucleotide for another in a DNA strand, and hence in the mRNA transcribed from it; other mutagens cause deletion or addition of nucleotides. Decoding, or reading, of such strands will be altered.
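How the two kinds of change alter the reading of a strand can be seen by splitting a sequence into codons before and after the change. The Python sketch below is a minimal illustration; the helper names codons, substitute, and delete are hypothetical, and the example sequence is invented for the purpose.

```python
def codons(mrna: str) -> list[str]:
    """Split a sequence into nonoverlapping triplets, dropping any remainder."""
    return [mrna[i:i + 3] for i in range(0, len(mrna) - 2, 3)]


def substitute(seq: str, pos: int, base: str) -> str:
    """Replace the nucleotide at index pos with another base."""
    return seq[:pos] + base + seq[pos + 1:]


def delete(seq: str, pos: int) -> str:
    """Remove the nucleotide at index pos."""
    return seq[:pos] + seq[pos + 1:]


original = "AUGUAUUUUUAC"
print(codons(original))                      # ['AUG', 'UAU', 'UUU', 'UAC']
# A substitution changes only the codon that contains it.
print(codons(substitute(original, 5, "C")))  # ['AUG', 'UAC', 'UUU', 'UAC']
# A deletion shifts every codon downstream of it.
print(codons(delete(original, 3)))           # ['AUG', 'AUU', 'UUU']
```

The substitution shown here turns UAU into UAC, which still specifies tyrosine, so not every substitution changes the protein; the deletion, by contrast, changes how every later triplet is read.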
Metabolic regulation has been studied to determine how the genes that control enzyme synthesis can be switched on and off when certain substances are present. For example, in the process known as induction, bacteria synthesize the enzyme β-galactosidase only when lactose is present. Induction has been linked to the activity at a so-called operator site on a chromosome. When the operator site is open, the genes it controls function freely; when it is blocked, as by a repressor molecule, the genes it controls do not function.
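The on-off behavior of induction can be caricatured as simple Boolean logic. The Python sketch below rests on an assumption not spelled out above, namely that the repressor blocks the operator whenever lactose is absent and releases it when lactose is present; the function names are hypothetical, and real regulation is graded rather than strictly on or off.

```python
def operator_blocked(lactose_present: bool) -> bool:
    """Assumption: the repressor sits on the operator unless lactose is present."""
    return not lactose_present


def enzyme_synthesized(lactose_present: bool) -> bool:
    """The controlled genes function only while the operator site is open."""
    return not operator_blocked(lactose_present)


for lactose in (False, True):
    print(lactose, enzyme_synthesized(lactose))
# False False
# True True
```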
See J. D. Watson, The Double Helix: A Personal Account of the Discovery of the Structure of DNA (1968) and DNA: The Secret of Life (2003); R. L. Adams et al., ed., The Biochemistry of the Nucleic Acids (1986); V. K. McElheny, Watson and DNA: Making a Scientific Revolution (2003); I. Rosenfield et al., DNA: A Graphic Guide to the Molecule that Shook the World (2010).