Natural Selection and the Distribution of Identity-by-Descent in the Human Genome

ABSTRACT

There has recently been considerable interest in detecting natural selection in the human genome. Selection will usually tend to increase identity-by-descent (IBD) among individuals in a population, and many methods for detecting recent and ongoing positive selection indirectly take advantage of this. In this article we show that excess IBD sharing is a general property of natural selection and that this makes it possible to detect several types of selection, including a type that is otherwise difficult to detect: selection acting on standing genetic variation. Motivated by this, we use a recently developed method for identifying IBD sharing among individuals from genome-wide data to scan populations from the new HapMap phase 3 project for regions with excess IBD sharing, in order to identify regions in the human genome that have been under strong, very recent selection. The HLA region shows by far the most extreme signal, suggesting that much of the strong recent selection acting on the human genome has been immune related and acting on HLA loci. As equilibrium overdominance does not tend to increase IBD, we argue that this type of selection cannot explain our observations.

In recent years there has been considerable interest in detecting natural selection in humans (Bustamante et al. 2005; Nielsen et al. 2005; Voight et al. 2006; Sabeti et al. 2007; Nielsen et al. 2009; Pickrell et al. 2009). Many of the existing methods for detecting ongoing selection on a new allele have focused on haplotype homozygosity (Sabeti et al. 2002; Voight et al. 2006; Zhang et al. 2006; Sabeti et al. 2007), for instance, the integrated haplotype score (iHS). The reasoning behind this is that as a favored allele increases in frequency, the region in which the mutation occurs will increase in homozygosity and experience less intra-allelic recombination at the population level. Positively selected alleles increasing in frequency will, therefore, tend to be located on haplotypes that are unexpectedly long, given their frequency in the population. Alleles on different homologous chromosomes are identical-by-descent (IBD) if they are direct copies of the same ancestral allele. Methods for detecting selection on the basis of haplotype homozygosity are, therefore, fundamentally identifying regions where a subset of the individuals share IBD haplotypes that are longer than would be expected under neutrality. Hence, detecting selection using haplotype homozygosity can be viewed as a special case of using excess IBD sharing to detect selection.
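
To make the notion of haplotype homozygosity concrete, the following is a minimal sketch, not the procedure of any of the cited methods. It computes the probability that two distinct haplotypes carrying a given allele at a core site are identical over a surrounding interval. The function name, the string encoding of haplotypes, and the toy data are all assumptions made for illustration.

```python
from collections import Counter
from math import comb


def haplotype_homozygosity(haplotypes, core, allele, left, right):
    """Probability that two distinct haplotypes carrying `allele` at site
    `core` are identical over sites left..right (inclusive).

    `haplotypes` is a list of equal-length strings, one character per site
    (an illustrative encoding, assumed here).
    """
    # Keep only the haplotypes that carry the allele of interest.
    carriers = [h[left:right + 1] for h in haplotypes if h[core] == allele]
    n = len(carriers)
    if n < 2:
        return 0.0  # no pair of carriers to compare
    # Count identical pairs among all pairs of carriers.
    counts = Counter(carriers)
    identical_pairs = sum(comb(c, 2) for c in counts.values())
    return identical_pairs / comb(n, 2)


# Toy data (hypothetical): allele "1" at site 2 sits mostly on one haplotype.
haps = ["00100", "00100", "00100", "10110", "01001", "11000"]
print(haplotype_homozygosity(haps, core=2, allele="1", left=0, right=4))
print(haplotype_homozygosity(haps, core=2, allele="0", left=0, right=4))
```

In this toy example the carriers of allele "1" are far more homozygous over the interval than the carriers of allele "0", which is the kind of contrast that haplotype-based statistics are designed to pick up.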

Recently several authors (Purcell et al. 2007; Thompson 2008; Albrechtsen et al. 2009; Gusev et al. 2009) have developed methods that are able to infer IBD tracts shared between pairs of individuals from outbred populations without any pedigree information, using dense genotype data such as SNP chip data. The original purpose of these methods was to identify regions of the genome harboring disease loci with more IBD sharing among affected individuals than unaffected individuals. However, these methods can also be used to study the genetic history of the populations. As we show, they provide a new approach for detecting very recent, strong selection in the genome: not only selection acting on a new allele, but also selection acting on standing variation, i.e., selection on alleles that were already segregating in the population when the selective advantage was introduced. This is true because natural selection in general will increase the amount of IBD sharing in a population in the area surrounding the allele under selection.
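
The idea of scanning for regions with excess IBD sharing can be sketched as follows. This is an illustrative outline under stated assumptions, not the inference method of the cited authors. It assumes that pairwise IBD tracts have already been inferred by some external tool and are supplied as hypothetical `(individual_a, individual_b, start, end)` tuples, and it uses an arbitrary cutoff (a multiple of the genome-wide median) to flag positions.

```python
from statistics import median


def ibd_sharing_fraction(tracts, n_individuals, positions):
    """Fraction of all pairs of individuals that share an IBD tract
    covering each position.

    `tracts` holds (ind_a, ind_b, start, end) tuples with inclusive
    coordinates (an assumed input format).
    """
    n_pairs = n_individuals * (n_individuals - 1) // 2
    fractions = []
    for pos in positions:
        # A set of unordered pairs, so overlapping tracts between the
        # same two individuals are counted once.
        sharing = {frozenset((a, b)) for a, b, s, e in tracts if s <= pos <= e}
        fractions.append(len(sharing) / n_pairs)
    return fractions


def flag_excess_sharing(fractions, positions, factor=3.0):
    """Positions whose sharing exceeds `factor` times the median.

    The cutoff is an arbitrary illustration, not a calibrated test.
    """
    threshold = factor * median(fractions)
    return [p for p, f in zip(positions, fractions) if f > threshold]


# Hypothetical tracts among four individuals (coordinates in arbitrary units).
tracts = [(0, 1, 100, 500), (0, 2, 200, 450), (1, 2, 250, 400),
          (2, 3, 300, 350), (0, 3, 900, 950)]
positions = [50, 300, 600, 925]
fractions = ibd_sharing_fraction(tracts, n_individuals=4, positions=positions)
print(flag_excess_sharing(fractions, positions))
```

A real scan would need a principled null distribution for the amount of sharing, since background IBD varies with recombination rate and demographic history; the median-based cutoff here only shows the shape of the computation.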

Most population genetic studies have focused on selection acting on a new allele, but it has recently been suggested that selection on standing variation is a biologically relevant model for selection as well (Orr and Betancourt 2001; Innan and Kim 2004; Hermisson and Pennings 2005; Przeworski et al. …