The Limits of Theoretical Population Genetics

The purpose here is to discuss the limits of theoretical population genetics. This 100-year-old field now sits close to the heart of modern biology. Theoretical population genetics is the framework for studies of human history (REICH et al. 2002) and the foundation for association studies, which aim to map the genes that cause human disease (JORDE 1995). Arguably more important, theoretical population genetics underlies our knowledge of within-species variation across the globe and for all kinds of life. In light of its many incarnations and befitting its ties to evolutionary biology, the limits of theoretical population genetics are recognized to be changing over time, with a number of new paths to follow. Stepping into this future, it will be important to develop new approximations that reflect new data and not to let well-accepted models diminish the possibilities.

It is valuable to define this field narrowly. Theoretical population genetics is the mathematical study of the dynamics of genetic variation within species. Its main purpose is to understand the ways in which the forces of mutation, natural selection, random genetic drift, and population structure interact to produce and maintain the complex patterns of genetic variation that are readily observed among individuals within a species. A tremendous amount is known about the workings of organisms in their environments and about interactions among species. Ideally, with constant reference to these facts (the bulk of which are undoubtedly yet to be discovered), theoretical population genetics begins by distilling everything into a workable mathematical model of genetic transmission within a species.

Taking this narrow view precludes the application of theoretical population genetics to studies of long-term evolutionary phenomena. This, instead, is the purview of evolutionary theory. For theoretical population genetics, processes over longer time scales are of interest only insofar as they directly affect observable patterns of variation within species. The focus on current genetic variation came to the fore during the 1970s and 1980s with the development of coalescent theory (KINGMAN 1982, 2000), or the mathematics of gene genealogies. EWENS (1990) reviews this transition from the forward-time approach of classical population genetics to the new, backward-time approach. It can be seen both in classical work (FISHER 1922; WRIGHT 1931) and in coalescent theory (KINGMAN 1982; HUDSON 1983; TAJIMA 1983), both of which are considered below, that the time frame over which the models of theoretical population genetics apply within a given species is a small multiple of N_total generations, where N_total is the total population size, or the count of all the individuals of the species. Looking at gene genealogies in humans, for example, it seems that this means roughly from 10^4 to 10^6 years (HARRIS and HEY 1999).
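To make the phrase "a small multiple of the total population size" concrete, the following is a minimal sketch, not the article's own calculation. It assumes the standard Kingman coalescent for a single well-mixed population of n_total gene copies, ignoring selection and population structure, in which k lineages wait on average 2 * n_total / (k(k - 1)) generations before the next coalescence. The function name and the numerical values are hypothetical, chosen only for illustration.

```python
def expected_tmrca(n_samples: int, n_total: float) -> float:
    """Expected generations back to the most recent common ancestor.

    Assumes the standard Kingman coalescent for one well-mixed population
    of n_total gene copies (an assumption; no selection, no structure).
    """
    if n_samples < 2:
        raise ValueError("need at least two sampled lineages")
    if n_total <= 0:
        raise ValueError("n_total must be positive")
    # While k lineages remain, the mean wait to the next coalescence
    # is n_total / C(k, 2) = 2 * n_total / (k * (k - 1)) generations.
    return sum(2.0 * n_total / (k * (k - 1)) for k in range(2, n_samples + 1))


if __name__ == "__main__":
    # Illustrative value only: 10,000 gene copies.
    for n in (2, 10, 100):
        print(n, expected_tmrca(n, 10_000))
```

Under these assumptions the sum telescopes to 2 * n_total * (1 - 1/n_samples), so the expected depth of the genealogy never exceeds twice the population size however large the sample, which is one sense in which the relevant time frame is a small multiple of the total population size.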

This allows us to suppose that the parameters affecting the species that we wish to model have remained relatively constant over time, compared to the situation in evolutionary theory. For purposes of discussion, consider the following simple model which, with embellishments, might serve to describe any species from Homo sapiens to Bacillus subtilis. The species is divided into D subunits, each of size N, so that the total population size is N_total = ND. Corresponding to the phenomena listed above, the other parameters of the model are the per-locus, per-generation probability of mutation, u, the selective advantage or disadvantage, s, of some type relative to some other type in the population, and a parameter, m, which determines the extent of population structure.
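The parameters of this model can be collected in a small data structure. The sketch below is only a bookkeeping aid under stated assumptions: the class name, the validation rules, and the example values are hypothetical, and the precise meaning of m is left open because the text says only that it determines the extent of population structure.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SubdividedPopulation:
    """Parameters of the simple model described in the text."""

    D: int    # number of subunits
    N: int    # size of each subunit
    u: float  # per-locus, per-generation probability of mutation
    s: float  # selective advantage (positive) or disadvantage (negative)
    m: float  # parameter governing the extent of population structure

    def __post_init__(self) -> None:
        # Basic sanity checks (assumed here, not taken from the article).
        if self.D < 1 or self.N < 1:
            raise ValueError("D and N must be positive integers")
        if not 0.0 <= self.u <= 1.0:
            raise ValueError("u is a probability and must lie in [0, 1]")

    @property
    def n_total(self) -> int:
        """Total population size, N_total = N * D."""
        return self.N * self.D


# Illustrative values only; none of these numbers come from the article.
example = SubdividedPopulation(D=100, N=50, u=1e-8, s=0.0, m=0.01)
print(example.n_total)
```

Keeping N_total as a derived property rather than a stored field means it cannot drift out of step with D and N when different parameter sets are compared.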

The subunits in the model are used below to represent D diploid individuals, so that N = 2 is the number of copies of each chromosome within each individual. Note that this departs from the usual notation, in which N is the number of diploid individuals. The reason for this departure is to emphasize the similarities between the diploid model and other models of population structure. …