Back in the late 1970s, I had the pleasure of visiting the laboratory--or perhaps more properly the lair--of Arthur E. Mourant. It was hidden away in the far recesses of the British Museum of Natural History in London. Mourant, a genial man who looked rather like Mr. Punch, presided over a large room lined with cabinets filled to overflowing with papers. For decades, he and a few devoted co-workers had kept track of our growing knowledge of the human gene pool, summarizing the work of thousands of scientists in huge compendiums. He had provided scientists working on human evolution and variation with a distillation of studies that had been written in a dozen languages, in a hundred parts of the world. We spent a couple of fascinating days going over some of the reams of data that he had collected and speculating about their meaning. Among other things, he showed me the proofs of a new book he had just finished on human genetic variation and disease.
The gray columns of figures in this book were a treasure trove. The first connection between stomach cancer and the ABO blood groups had been published in 1953. By the time Mourant summarized the literature in 1978, an astonishing 5,000 studies had looked for connections between ABO blood groups and virtually every major disease. About 15 percent of them showed an association.
Other gray columns of his figures told about another, less-known human blood group called MN, which is confined to the surface of our red cells. So minor is it that it is usually ignored by our own immune system and, unlike ABO blood groups, it is not important in transfusion or tissue rejection. Strenuous efforts by many researchers have not been able to detect any association between the MN blood groups and disease.
Yet virtually every human population has the M and N forms of this trait in varying proportions. Why are both so pervasive, and why is our entire species not simply type M or type N? Is it simply accidental or are selective forces at work? And what does the distribution of these and other variant forms of genes tell us about the history and current state of our species? What indeed can it tell us if all the genes that have been discovered turn out to be as different as ABO and MN?
A new book by Cavalli-Sforza and his collaborators, as massive as anything put together by Mourant, attempts to answer some of these questions. It is an immense and laudable undertaking that pulls together the information on many genes that, like the ABO blood group gene, are polymorphic--that is, they exist in the population in a variety of types called alleles. Much of the data had been gathered in raw form by Mourant, with later additions by Mourant's co-workers and by Cavalli-Sforza's group. More than 75,000 allele frequencies, measuring the prevalence of various alleles in nearly 7,000 human populations, are summarized--not in the gray columns of Mourant's compilation but in the form of maps and statistical analyses that make trends in the data far more obvious and accessible.
The book begins with a survey of the methods used in analyzing the data and then moves on to an overview of the genetic and cultural histories of our species on a worldwide scale. Succeeding chapters deal with each continent in turn. The book is nothing less than an attempt to relate the physical appearance, language, and culture of the far-flung members of our extremely variable species to the evidence of the genes. In the course of this titanic enterprise, the book summarizes how much we have learned and shows how far we still have to go.
What are the many controversies that the book hopes to cast light on? One is the origin of our species itself. Did we arise within the last one or two hundred thousand years in Africa and spread throughout the rest of the Old World, sweeping all the poor hominids already resident there into the ash heap of history? Or did we arise from our immediate ancestor, Homo erectus, in a series of parallel events in various parts of the Old World, aided perhaps by puzzling and highly specific flows of genes conferring human rather than prehuman characteristics on our diverse ancestors? While admitting that not all the evidence is in, the authors come down on the side of a single origin.
A second question that the authors spend a good deal of time on is the matter of races. While our species is highly diverse both physically and genetically, the patterns are so complex that it is impossible to divide us into races in any consistent way. For example, an earlier generation of anthropologists classified the Ainu of northern Japan as Caucasian because of the abundant hair on their bodies, the lack of an epicanthic fold on the upper eyelid, their wavy brown hair, and pale skin. But their genes place them squarely among the peoples of eastern Asia. The San (Bushmen), at the other end of the Old World, in southern Africa, have flattened faces of rather Asian appearance--though again without an epicanthic fold--and yellow rather than dark skin. Yet the frequencies of their various alleles, although unusual in some respects, resemble those of their African neighbors.
The authors do not attempt an explanation. But I suspect that since our species is blessed with an abundant variety of alleles of genes that contribute to outward appearance, a little mixing, matching, and sorting out would have been quite enough to have produced--anywhere on the planet--the relatively trivial differences in appearance on which we put so much weight when we classify people into races.
A third question concerns the various patterns of migration our recent ancestors took as they roamed over Africa, Europe, the Middle East, and far Asia. Can the traces of these migrations be detected by looking at allele frequencies, or does the spread of culture overwhelm the spread of genes? A striking cline--or regular gradation--of frequencies of alleles of some genes, such as ABO, extends across Europe and correlates in space and time with the spread of farming from its source in the Middle East. A likely explanation is that farmers, able to multiply in numbers faster than their neighboring hunter-gatherers, overwhelmed and mixed with them. This new, slightly mongrelized group of farmers repeated the process as farming spread to the north and west. On the other hand, a much more recent spread of Bantu-speaking peoples accompanied by agriculture in southern Africa has not left such obvious traces on the genes.
And finally, is there any correlation between the traces of migration seen in some of our genes and the spread and history of human languages? Sometimes. Again, a fairly striking correlation is found in Europe. In Australia, however, no correlation can be seen among the genes of the aboriginal populations, the distribution of their languages, and the fairly simple patterns of colonization from Australia's north that we know must have taken place starting some 60,000 years ago.
The book is unlikely to settle any of these controversies and, indeed, is certain to stir some because of its unabashedly idiosyncratic methods. The authors state at the outset that they are going to concentrate on their own methods of interpreting the data, because doing full justice to the approaches of others would make the book far longer than its current thousand or so very large pages. The authors are to be commended, however, for laying out all the data, warts and all, showing how they analyzed it, and hedging virtually all their conclusions with the caveats that imperfect data demand.
The first problem with this compilation, impressive as it is, has to do with the immense span of time for which we have no genetic data. Because our genetic portrait of humankind is necessarily based on recent samplings, it is unavoidably static. Historical records of human migrations cover only a tiny fraction of the history of our species, and we know surprisingly little about how long most aboriginal peoples have occupied their present homes. Language, too, is so labile and so easily overwhelmed by history that it can only take us, at the most, ten or twenty thousand years into the past. We are pretty close to the position of a viewer who tries to infer the entire plot of the film Queen Christina from the few final frames showing Garbo's rapt face.
Once we have looked as far back as we can--to the invention of agriculture and a little way into the Neolithic--how much more of our distant history can we infer from the present-day distribution of alleles? Very little, I think. As the authors acknowledge throughout the book, the distribution of genes has as many explanations as there are genes themselves.
Take the Duffy blood group. One allele of this gene confers absolute immunity against a particular type of malaria. This allele is present in sub-Saharan Africa because of malaria--not migration--and may have made its appearance only tens of thousands of years ago. The ABO polymorphism, on the other hand, is millions of years old, and its history is therefore probably far more complex than Duffy's. Even though we have known about it for the better part of a century, we have still not managed to discover the major reason that we (and our close relatives the great apes) have this polymorphism.
Many of the maps in The History and Geography of Human Genes were constructed with a technique called principal component, or PC, analysis, which sounds--and is--dauntingly statistical. To construct one of the maps, eighty-two genes were examined in many populations throughout the world. Each population was represented on a computer grid as a point in eighty-two-dimensional space, with its position along each axis representing the frequency of one of the alleles in question. The line, rotated through all the dimensions, that best fits all the points is called the first PC. It can also be understood as the measure that best summarizes all the variables. Other PCs can be obtained that summarize the leftover data.
Suppose that all our genes behaved the same way--that is, they all had alleles with a high frequency in Africa, intermediate frequencies in Europe and Asia, and even lower frequencies in Australia. Then the first PC would account for all or most of the information in the data set. It is just such a pattern that the eighty-two-gene map appears to show. This is misleading, however, because the first PC accounts for only about a third of the variation in the data, and the other two thirds are made up of conflicting trends. Which of them, if any, do we believe?
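For readers who would like to see the arithmetic, here is a minimal sketch of the idea in Python. It is not the authors' procedure, and the allele frequencies are invented purely for illustration. Each row is a hypothetical population, each column a hypothetical gene, and the first PC is found by a simple power iteration on the covariance matrix. The function names (`covariance`, `first_pc`, `variance_explained`) are my own.

```python
import math

def covariance(rows):
    """Sample covariance matrix of the columns (genes) across rows (populations)."""
    n, dims = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(dims)]
    centered = [[r[j] - means[j] for j in range(dims)] for r in rows]
    return [[sum(r[i] * r[j] for r in centered) / (n - 1) for j in range(dims)]
            for i in range(dims)]

def first_pc(cov, iterations=500):
    """Leading eigenvector and eigenvalue of cov, by power iteration."""
    dims = len(cov)
    v = [1.0 + 0.1 * i for i in range(dims)]  # arbitrary starting direction
    for _ in range(iterations):
        w = [sum(cov[i][j] * v[j] for j in range(dims)) for i in range(dims)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:
            break
        v = [x / norm for x in w]
    # Rayleigh quotient: the variance of the data along direction v.
    length_sq = sum(x * x for x in v)
    eigenvalue = sum(v[i] * sum(cov[i][j] * v[j] for j in range(dims))
                     for i in range(dims)) / length_sq
    return v, eigenvalue

def variance_explained(rows):
    """Fraction of the total variance captured by the first PC."""
    cov = covariance(rows)
    _, eigenvalue = first_pc(cov)
    total = sum(cov[i][i] for i in range(len(cov)))
    return eigenvalue / total

# Invented frequencies: every gene follows the same gradient across populations.
concordant = [[0.8, 0.7, 0.9], [0.6, 0.5, 0.7], [0.4, 0.3, 0.5], [0.2, 0.1, 0.3]]
# Invented frequencies: each gene follows its own, conflicting trend.
conflicting = [[0.8, 0.2, 0.5], [0.6, 0.6, 0.1], [0.4, 0.1, 0.6], [0.2, 0.7, 0.4]]

print("concordant genes: ", variance_explained(concordant))
print("conflicting genes:", variance_explained(conflicting))
```

When every gene tells the same story, the first PC captures essentially all the variance. When the genes disagree, a good part of the variance is left over for the other PCs, which is the situation the eighty-two-gene map is in.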
Unfortunately, the authors tend to search through the various PC maps until they find one that supports the argument they are trying to make at the moment. I rather wish that they had played around with the data a little more in order to see how robust the maps are. For example, how much does a PC map change if one important allele like the Duffy variant is dropped from it? The authors emphasize migration, and while they sometimes suggest that selection for or against particular alleles and combinations of alleles in different regions may have played a role in shaping these maps, my guess is that such selection will turn out to be at least as important as migration.
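The kind of robustness check I have in mind could be sketched roughly as follows. This is again a toy illustration on invented frequencies, not an analysis of the book's data. Each gene is dropped in turn, the first PC is recomputed, and each population's position along it is compared with its position in the full analysis. The function names and the "Duffy-like" fourth column are assumptions made only for this example.

```python
import math

def pc1_scores(rows, iterations=500):
    """Position of each population along the first PC (power iteration)."""
    n, dims = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(dims)]
    centered = [[r[j] - means[j] for j in range(dims)] for r in rows]
    cov = [[sum(r[i] * r[j] for r in centered) / (n - 1) for j in range(dims)]
           for i in range(dims)]
    v = [1.0 + 0.1 * i for i in range(dims)]  # arbitrary starting direction
    for _ in range(iterations):
        w = [sum(cov[i][j] * v[j] for j in range(dims)) for i in range(dims)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:
            break
        v = [x / norm for x in w]
    return [sum(r[j] * v[j] for j in range(dims)) for r in centered]

def abs_correlation(a, b):
    """Absolute Pearson correlation; the sign of a PC is arbitrary, so ignore it."""
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    va = sum((x - ma) ** 2 for x in a)
    vb = sum((y - mb) ** 2 for y in b)
    if va == 0.0 or vb == 0.0:
        return 0.0
    return abs(cov / math.sqrt(va * vb))

def leave_one_gene_out(rows):
    """For each gene, how closely the first PC without it tracks the full first PC."""
    full = pc1_scores(rows)
    dims = len(rows[0])
    stability = []
    for drop in range(dims):
        reduced = [[r[j] for j in range(dims) if j != drop] for r in rows]
        stability.append(abs_correlation(full, pc1_scores(reduced)))
    return stability

# Invented data: three genes share a gentle gradient across five populations;
# the fourth ("Duffy-like") is nearly fixed in one population and rare elsewhere.
toy = [
    [0.40, 0.40, 0.40, 0.9],
    [0.45, 0.45, 0.45, 0.1],
    [0.50, 0.50, 0.50, 0.1],
    [0.55, 0.55, 0.55, 0.1],
    [0.60, 0.60, 0.60, 0.1],
]

for gene, value in enumerate(leave_one_gene_out(toy)):
    print("dropping gene", gene, "-> correlation with full first PC:", value)
```

In this toy case, dropping any one of the three concordant genes leaves the first PC almost untouched, while dropping the single sharply differentiated gene changes it substantially. A map that depends so heavily on one locus deserves a correspondingly cautious reading.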
The book closes with a plea to gather irreplaceable genetic information from indigenous peoples before they are killed, die of starvation and disease, or melt anonymously into the favelas of Third World cities. At times, the argument sounds uncomfortably like science-at-all-costs, a plea for "immortalizing" the white blood cells of peoples on the brink of extinction as the peoples themselves fade away. But such efforts should not, I think, be supported unless they form a part (a small part) of efforts for cultural preservation and political empowerment of the kind espoused by the Cambridge-based group Cultural Survival, and of efforts to shift priorities at the World Bank and among the Third World governments directly concerned.
The ABO blood groups were discovered in the year 1900. The History and Geography of Human Genes, arriving nearly a hundred years later, gathers together much of the information that has since been gleaned about human diversity and allows us to see, however dimly, a small part of our evolutionary heritage. The book summarizes this exciting story well, but the really exciting discoveries are still in the future. In the next hundred years we will find the genes that distinguish us from the great apes and perhaps discover how some of them work. And we will, I feel confident, finally be able to determine which one of the many conflicting theories about the evolutionary history of our unique species is correct.
Christopher Wills is a professor of biology at the University of California, San Diego. His books include Exons, Introns and Talking Genes: The Science Behind the Human Genome Project and, most recently, The Runaway Brain: The Evolution of Human Uniqueness.