Exploring the Confidence Interval for a Binomial Parameter in a First Course in Statistical Computing


The simple problem of providing a confidence interval for the estimate of a binomial parameter can prove to be quite interesting. There is a surprising variety of competing intervals to choose from, using both frequentist and Bayes methods. A reasonable criterion for comparing these confidence intervals is: How close does the actual coverage probability come to the target coverage probability, over a range of sample sizes and true population proportions? This material lends itself nicely to a first class in statistical computing. The computation of actual coverage probabilities for the standard approximate confidence interval is a good first simulation assignment involving a random number generator and calculations within a loop, and the variety of competing intervals covers a broad spectrum of computational difficulty. The standard approximate confidence interval that is most often found in introductory textbooks has rather poor performance compared to the alternatives. These comparisons give students an opportunity to write a "paper," with tables of simulation results, figures, and practice in writing an introduction, methods, results, and conclusions. This article presents an overview of confidence interval methods, including some simple "fixes" to the standard interval, the Wilson or "score" interval, the "exact" interval, and the Bayes credible intervals. After the discussion of each interval, there are suggestions for computing assignments. Our experience shows that these statistical computing assignments, based on a simple and familiar problem, are empowering to the students, especially when they conclude that many textbooks are "wrong" in their recommendations! They go on to tackle more complicated simulation problems, in other classes and in their own research, with more confidence.

KEY WORDS: Bayes credible interval; Exact confidence interval; Population proportion; Prior distribution; Score interval; Statistical simulations; Wilson confidence interval.


One of the simplest problems in statistics is that of estimating a binomial proportion p. If k is a binomial (n, p) random variable--that is, the number of "successes" in n independent trials with probability p of success for each trial--then p̂ = k/n is a standard estimate for p. A 95% confidence interval is commonly provided to assess the precision of the estimate. In introductory statistics classes, and often in practice, an approximate confidence interval based on two approximations is provided. These approximations use the central limit theorem and the standard error for p̂, with p̂ substituted for the unknown p. This interval is known to be "bad"; that is, the actual coverage probability is substantially less than 95% for "small" sample sizes and for p near zero or one (details in Section 2). We find a variety of cautions as to when this interval is acceptable, and there are "conservative" and "adjusted" variations on the standard approximate interval designed to improve the coverage probability.
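As a rough illustration of the coverage problem just described, the sketch below computes the standard approximate interval, k/n plus or minus z times sqrt((k/n)(1 - k/n)/n), and its actual coverage probability in two ways: exactly, by summing binomial probabilities, and by simulation with a random number generator and a loop. The function names (`wald_interval`, `exact_coverage`, `simulated_coverage`), the seed, and the number of replications are assumptions made for this example and do not come from the article.

```python
import random
from math import comb, sqrt
from statistics import NormalDist


def wald_interval(k, n, level=0.95):
    """Standard approximate interval: p_hat +/- z * sqrt(p_hat * (1 - p_hat) / n)."""
    z = NormalDist().inv_cdf(0.5 + level / 2)  # about 1.96 when level = 0.95
    p_hat = k / n
    half_width = z * sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - half_width, p_hat + half_width


def exact_coverage(n, p, level=0.95):
    """Sum the binomial(n, p) probabilities of every k whose interval contains p."""
    total = 0.0
    for k in range(n + 1):
        lower, upper = wald_interval(k, n, level)
        if lower <= p <= upper:
            total += comb(n, k) * p**k * (1 - p) ** (n - k)
    return total


def simulated_coverage(n, p, reps=20000, level=0.95, seed=1):
    """Estimate the same coverage by repeated sampling (seed and reps are arbitrary)."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(reps):
        # One binomial(n, p) draw, built from n Bernoulli(p) trials.
        k = sum(rng.random() < p for _ in range(n))
        lower, upper = wald_interval(k, n, level)
        hits += lower <= p <= upper
    return hits / reps


if __name__ == "__main__":
    n = 20
    for p in (0.05, 0.1, 0.3, 0.5):
        print(p, round(exact_coverage(n, p), 4), round(simulated_coverage(n, p), 4))
```

With n = 20 and p = 0.1, the exact calculation gives a coverage of roughly 0.88, well short of the 95% target. The interval collapses to the single point 0 when k = 0, so it can never cover p in that case.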

In mathematical statistics texts, the "exact" confidence interval, based on inverting the binomial hypothesis test, is usually presented as the small-sample confidence interval. This interval dates back to Clopper and Pearson (1934), and is usually not presented in introductory classes because of computational difficulty. As seen in Section 4, the actual coverage probabilities are typically larger than the target coverage probability. Bayesian inference (discussed in Section 5) has provided the "credible interval," with several choices for priors. These intervals require a computer algorithm to calculate, and they have a good claim to the title "exact" under the condition that the population parameter really is sampled from the prior density. The "score" interval discussed in Section 3 dates from Wilson (1927). It has good performance in terms of coverage probability over a broad spectrum of n and p. The derivation is similar to that of the standard approximate interval, except that it uses only one approximation: that of the central limit theorem. …
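To make two of these alternatives concrete, here is a minimal sketch of the Wilson score interval in its usual closed form and of the Clopper-Pearson "exact" interval obtained by inverting the binomial tail probabilities with bisection. It uses only the Python standard library. All function names and the bisection settings are assumptions for this example, not the article's code. The Bayes credible interval is left out because it needs a Beta quantile function, which the standard library does not provide.

```python
from math import comb, sqrt
from statistics import NormalDist


def binom_cdf(k, n, p):
    """P(X <= k) for X ~ binomial(n, p); returns 0 when k < 0."""
    return sum(comb(n, j) * p**j * (1 - p) ** (n - j) for j in range(k + 1))


def wilson_interval(k, n, level=0.95):
    """Score interval: the set of p with |p_hat - p| <= z * sqrt(p * (1 - p) / n)."""
    z = NormalDist().inv_cdf(0.5 + level / 2)
    p_hat = k / n
    denom = 1 + z * z / n
    center = (p_hat + z * z / (2 * n)) / denom
    half_width = z * sqrt(p_hat * (1 - p_hat) / n + z * z / (4 * n * n)) / denom
    return center - half_width, center + half_width


def _bisect(f, lo=0.0, hi=1.0, iters=100):
    """Root of f on [lo, hi] by bisection, assuming f changes sign on the interval."""
    f_lo = f(lo)
    for _ in range(iters):
        mid = (lo + hi) / 2
        f_mid = f(mid)
        if (f_mid > 0) == (f_lo > 0):
            lo, f_lo = mid, f_mid
        else:
            hi = mid
    return (lo + hi) / 2


def clopper_pearson_interval(k, n, level=0.95):
    """Invert the two one-sided binomial tests, each at level alpha / 2."""
    alpha = 1 - level
    # Lower limit solves P(X >= k; p) = alpha / 2; it is 0 when k = 0.
    lower = 0.0 if k == 0 else _bisect(lambda p: 1 - binom_cdf(k - 1, n, p) - alpha / 2)
    # Upper limit solves P(X <= k; p) = alpha / 2; it is 1 when k = n.
    upper = 1.0 if k == n else _bisect(lambda p: binom_cdf(k, n, p) - alpha / 2)
    return lower, upper


def coverage(interval, n, p, level=0.95):
    """Exact coverage: total probability of the k values whose interval contains p."""
    total = 0.0
    for k in range(n + 1):
        lower, upper = interval(k, n, level)
        if lower <= p <= upper:
            total += comb(n, k) * p**k * (1 - p) ** (n - k)
    return total


if __name__ == "__main__":
    for p in (0.05, 0.1, 0.3, 0.5):
        print(p,
              round(coverage(wilson_interval, 20, p), 4),
              round(coverage(clopper_pearson_interval, 20, p), 4))
```

Passing the interval function as an argument to `coverage` lets the same loop compare any of the competing intervals over a grid of n and p, which is the kind of comparison the text describes.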