Academic journal article Genetics

A Weighted-Holm Procedure Accounting for Allele Frequencies in Genomewide Association Studies

Article excerpt

ABSTRACT

In the context of genomewide association studies, where hundreds of thousands of polymorphisms are tested, stringent thresholds on the raw association test P-values are generally used to limit false-positive results. Instead of using thresholds based on raw P-values, as in the Bonferroni and sequential Sidak (SidakSD) corrections, we propose here to use a weighted-Holm procedure with weights depending on the allele frequency of the polymorphisms. This method is shown to substantially improve the power to detect associations, in particular by favoring the detection of rare variants with high genetic effects over more frequent ones with lower effects.

THE development of high-throughput genotyping technologies, allied to a deeper understanding of the pattern of human sequence variation, has recently offered the opportunity to perform association studies with hundreds of thousands of single-nucleotide polymorphisms (SNPs) expected to cover the whole genome. Recent successes in human complex diseases such as type II diabetes mellitus (Scott et al. 2007; Sladek et al. 2007; Zeggini et al. 2007), Crohn's disease (Duerr et al. 2006; Rioux et al. 2007), prostate cancer (Yeager et al. 2007), and coronary artery disease (Helgadottir et al. 2007; Samani et al. 2007) have provided a compelling proof of principle. Most recent genomewide association studies (GWAS) identified "at-risk" SNPs with a minor allele frequency (MAF) ~0.30 and associated with an allelic odds ratio ~1.30. This could be explained by the fact that the SNP arrays used to perform GWAS were designed under the hypothesis that genetic susceptibility to complex diseases involves common variants that confer moderate to low relative risk (i.e., the common disease-common variants hypothesis) and are thus enriched in common variants. However, examples from the literature (see Cambien and Tiret 2007) show that rare variants with stronger effects may also contribute to the genetic architecture of complex diseases, and it might then be of interest to develop statistical methods that favor their detection over that of more frequent variants.

The statistical analysis of such a large number of SNPs requires imposing stringent thresholds on the association tests' P-values to control the risk of false positives. In the context of multiple-testing procedures (MTPs) controlling the familywise error rate (FWER), several methods based on raw P-values are available, including Bonferroni, sequential Sidak (SidakSD), and Holm (Holm 1979). These standard MTPs can be improved by weighting P-values according to some specific criteria to increase power (Benjamini and Hochberg 1997; Kropf and Lauter 2002; Kropf et al. 2004; Westfall et al. 2004). In the field of GWAS, Roeder et al. (2007) have recently shown that grouping SNPs within a priori sets subsequently used for defining weights could considerably improve the power to detect association. Instead of grouping SNPs, we propose here to directly weight the association tests' P-values according to each SNP's MAF estimated in the control sample and to apply the weighted-Holm (WH) procedure (Benjamini and Hochberg 1997). This strategy is expected to favor the detection of rare SNPs with a high odds ratio (OR) over that of frequent SNPs with a moderate OR. A simulation study was carried out to study the statistical properties of the proposed WH procedure and demonstrated that this very simple WH procedure can greatly improve the power of GWAS over standard MTPs controlling the FWER.

Following the general notations of Benjamini and Hochberg (1997), we define, for each SNP $i$ ($i = 1, \ldots, N$) with MAF $f_i$ in the control sample, a standardized weight $w_i$ as $w_i = (N/f_i) \big/ \sum_{j=1}^{N} (1/f_j)$.
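The weight definition above can be illustrated with a minimal Python sketch. The function name `maf_weights` and the example MAF values are hypothetical and chosen only for illustration; the formula itself is the one stated in the text. By construction the weights sum to N, and a rarer SNP receives a larger weight.

```python
def maf_weights(mafs):
    """Standardized weights w_i = (N / f_i) / sum_j (1 / f_j).

    `mafs` holds the minor allele frequencies f_i estimated in controls.
    The function name and input format are assumptions for this sketch.
    """
    if not mafs:
        raise ValueError("at least one MAF is required")
    if any(f <= 0 for f in mafs):
        raise ValueError("every MAF must be strictly positive")
    n = len(mafs)
    inverse = [1.0 / f for f in mafs]   # 1 / f_i for each SNP
    total = sum(inverse)                # sum over j of 1 / f_j
    return [n * x / total for x in inverse]


# Hypothetical MAFs for four SNPs, from rare to common.
example_mafs = [0.05, 0.10, 0.25, 0.50]
weights = maf_weights(example_mafs)
```

Because the weights are proportional to 1/f_i, halving a SNP's MAF doubles its weight relative to the others.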

Let $p_1, p_2, \ldots, p_N$ be the P-values of the association test for each SNP $i$ ($i = 1, 2, \ldots, N$) and $P_i^* = p_i / w_i$ be their weighted counterparts. Rank these $P_i^*$ in increasing order ($P_{(1)}^* \le \ldots \le P_{(N)}^*$) and finally denote by $w_{(i)}^*$ the standardized weight corresponding to the SNP associated with $P_{(i)}^*$. …
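The weighting and ranking steps described so far can be sketched in Python as follows. This covers only the steps stated in the excerpt (dividing each P-value by its weight and sorting); the function name `rank_weighted_pvalues` and the example numbers are hypothetical, and the weights are taken as given inputs that sum to N.

```python
def rank_weighted_pvalues(pvalues, weights):
    """Return (snp_index, P_i*, w_i) tuples sorted by P_i* = p_i / w_i.

    Only the weighting and ranking steps are sketched here; the function
    name and the tuple layout are assumptions made for illustration.
    """
    if len(pvalues) != len(weights):
        raise ValueError("pvalues and weights must have the same length")
    if any(w <= 0 for w in weights):
        raise ValueError("every weight must be strictly positive")
    weighted = [p / w for p, w in zip(pvalues, weights)]
    # Indices of the SNPs ordered by increasing weighted P-value.
    order = sorted(range(len(weighted)), key=lambda i: weighted[i])
    return [(i, weighted[i], weights[i]) for i in order]


# Hypothetical raw P-values and standardized weights (weights sum to N = 3).
example_pvalues = [0.004, 0.001, 0.03]
example_weights = [2.0, 0.4, 0.6]
ranked = rank_weighted_pvalues(example_pvalues, example_weights)
```

In this toy input, the first SNP has a larger raw P-value than the second but a larger weight, so it ranks first once the P-values are weighted.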
