The Likelihood Ratio Test for Poisson versus Binomial Distributions

1. INTRODUCTION

Carroll and Lombard (1985) considered the problem of estimating the parameter N based on n independent success counts [x.sub.1], ..., [x.sub.n] from a binomial distribution with unknown parameters N and p. They extended earlier work by Olkin, Petkau, and Zidek (OPZ; 1981) on the maximum likelihood estimators (MLEs) of those parameters by introducing new estimators of N based on integrating out p from the likelihood with respect to a beta distribution, yielding a beta-binomial distribution for the number of successes. The new estimators, which maximize this integrated likelihood, compared favorably in terms of mean squared error with the OPZ estimators. Hall (1994) gave an asymptotic analysis in which the true values of N and p tend to [infinity] and zero, respectively, but in such a way that their product tends to a nonzero limit. He showed that very heavy-tailed, nonnormal limit distributions, such as the Cauchy, can be obtained for the MLE and the method-of-moments estimators of N and p, but that the Carroll-Lombard estimator has a light-tailed asymptotic distribution with all moments finite.
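
As a worked illustration of what "integrating out p" means, the integral has a closed form when p is given a Beta(a, b) density. The symbols a, b, B(., .), and T are introduced here for illustration only; the excerpt does not state which beta parameters Carroll and Lombard used, so this is a sketch of the general form rather than their exact estimator.

```latex
% Integrated likelihood for N: p is integrated out against a Beta(a, b) density.
% Assumed notation: T = \sum_{i=1}^{n} x_i, and B(\cdot,\cdot) is the beta function.
\begin{align*}
L(N)
  &= \int_0^1 \Biggl[\prod_{i=1}^{n} \binom{N}{x_i}\, p^{x_i} (1-p)^{N-x_i}\Biggr]
     \frac{p^{a-1}(1-p)^{b-1}}{B(a,b)} \, dp \\
  &= \Biggl[\prod_{i=1}^{n} \binom{N}{x_i}\Biggr]
     \frac{B(a + T,\; b + nN - T)}{B(a,b)},
  \qquad N \ge \max_i x_i .
\end{align*}
```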

Aitkin and Stasinopoulos (1989) considered the same problem with different methods of eliminating p, and obtained inferences about N that are dramatically different from those of OPZ (1981). The nub of the problem seems to be that the likelihood may be maximized on the boundary N = [infinity], suggesting a Poisson rather than a binomial distribution for the data. Data sets yielding a large estimate for N and a small estimate for p display "unstable" or "erratic" behavior of the MLE or the moments estimator, which the Carroll-Lombard method tries to overcome. Here we investigate a likelihood-based analysis of the data akin to that of Aitkin and Stasinopoulos (1989), which we think reveals the nature of the problem more clearly, free from assumptions about prior distributions and from assumptions on N and/or p.
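
The boundary behavior described above can be seen numerically. The following is a minimal sketch, not the authors' code: the function names and the two toy samples are hypothetical, and restricting N to integers no smaller than the largest observation is an assumption. For each N, the binomial log-likelihood is evaluated with p replaced by its MLE for that N, and compared with the Poisson log-likelihood, which is its limit as N grows.

```python
import math


def poisson_loglik(xs):
    """Poisson log-likelihood evaluated at the MLE lambda = mean(xs)."""
    n, total = len(xs), sum(xs)
    lam = total / n
    ll = -n * lam - sum(math.lgamma(x + 1) for x in xs)
    if total > 0:
        ll += total * math.log(lam)
    return ll


def binom_profile_loglik(xs, N):
    """Binomial log-likelihood at integer N >= max(xs), with p set to mean(xs)/N."""
    n, total = len(xs), sum(xs)
    p_hat = total / (n * N)
    # Sum of log binomial coefficients log C(N, x), computed via log-gamma.
    ll = sum(
        math.lgamma(N + 1) - math.lgamma(x + 1) - math.lgamma(N - x + 1)
        for x in xs
    )
    if total > 0:
        ll += total * math.log(p_hat)
    if n * N - total > 0:
        ll += (n * N - total) * math.log1p(-p_hat)
    return ll


if __name__ == "__main__":
    # Hypothetical toy samples: one with variance above the mean, one below.
    samples = {
        "overdispersed": [0, 1, 2, 3, 9],
        "underdispersed": [4, 5, 5, 6, 5],
    }
    for name, xs in samples.items():
        print(name, "Poisson log-likelihood:", round(poisson_loglik(xs), 4))
        for N in (max(xs), 20, 100, 1000, 10**6):
            print("  N =", N, "profile:", round(binom_profile_loglik(xs, N), 4))
```

In the overdispersed toy sample the profile values approach the Poisson value from below as N grows, so no finite N beats the boundary; in the underdispersed one some finite N does better than the Poisson fit.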

Suppose that [x.sub.1], ..., [x.sub.n] are observations on n independent and identically distributed (iid) random variables [X.sub.1], ..., [X.sub.n] whose common distribution is either binomial or Poisson. We will derive the asymptotic distribution of the likelihood ratio (LR) test for the null hypothesis that the [X.sub.i] are Poisson against the alternative hypothesis that they are binomial. For this purpose, the true distribution of the [X.sub.i] is assumed to be Poisson, so we are required to test at the boundary N = [infinity] of the parameter space. Of course, we expect a nonstandard asymptotic distribution for the LR test in this case.
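
For concreteness, one way to write the statistic is sketched below. The symbols \ell_B, \ell_P, \bar{x}, and \Lambda_n are assumed notation, not taken from the excerpt, and the excerpt does not say whether N is restricted to integers.

```latex
% Log-likelihoods under the binomial alternative and the Poisson null.
\begin{align*}
\ell_B(N, p) &= \sum_{i=1}^{n} \Bigl[ \log\binom{N}{x_i} + x_i \log p + (N - x_i)\log(1 - p) \Bigr], \\
\ell_P(\lambda) &= \sum_{i=1}^{n} \bigl[ x_i \log\lambda - \lambda - \log x_i! \bigr].
\end{align*}
% For fixed N the binomial MLE of p is \bar{x}/N, and the Poisson MLE of \lambda is \bar{x}.
\begin{equation*}
\Lambda_n = 2\Bigl[ \sup_{N \ge \max_i x_i} \ell_B\bigl(N, \bar{x}/N\bigr) - \ell_P(\bar{x}) \Bigr].
\end{equation*}
% Since \ell_B(N, \bar{x}/N) \to \ell_P(\bar{x}) as N \to \infty, we have \Lambda_n \ge 0,
% with \Lambda_n = 0 when the supremum is attained only at the boundary N = \infty.
```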

In the next section we state the main results, showing that an MLE [Mathematical Expression Omitted] of N exists and is unique with probability approaching 1 as n [approaches] [infinity], and deriving its asymptotic distribution under the null hypothesis. (The proofs of these results are given in the Appendix.) In Section 3 we report on some simulations investigating how accurate the asymptotic distributions are in finite samples. It turns out that the rate of convergence of the log-likelihood ratio statistic to its asymptotic distribution is very slow, so we discuss the application of our method and compare it to alternative approaches.

Our likelihood analysis was called the profile likelihood approach by Aitkin and Stasinopoulos (1989). (They also suggested the use of a conditional likelihood, which we discuss in Section 4.) Our results suggest the following approach to the analysis of this kind of data. First, test the null hypothesis [H.sub.0] that the data are from a Poisson rather than a binomial distribution. If we accept [H.sub.0], then of course we do not attempt to estimate N but rather apply methods appropriate to Poisson data. On the other hand, if we reject [H.sub.0], then we can proceed to estimate N and draw inferences from the sample on this basis. We provide further details of this method in Section 4.
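
The two-step approach just outlined can be sketched in code. This is a hedged illustration, not the authors' procedure. Because the asymptotic null distribution is not given in this excerpt, the sketch calibrates the test by Monte Carlo simulation from the fitted Poisson model, which is a substituted technique. All function names, the integer grid for N, the cap n_max, and the toy data are assumptions.

```python
import math
import random


def poisson_loglik(xs):
    """Poisson log-likelihood at the MLE lambda = mean(xs)."""
    n, total = len(xs), sum(xs)
    ll = -total - sum(math.lgamma(x + 1) for x in xs)  # n * lambda equals total
    return ll + (total * math.log(total / n) if total > 0 else 0.0)


def binom_profile_loglik(xs, N):
    """Binomial log-likelihood at integer N, with p set to its MLE mean(xs)/N."""
    n, total = len(xs), sum(xs)
    p = total / (n * N)
    ll = sum(math.lgamma(N + 1) - math.lgamma(x + 1) - math.lgamma(N - x + 1)
             for x in xs)
    if total > 0:
        ll += total * math.log(p)
    if n * N - total > 0:
        ll += (n * N - total) * math.log1p(-p)
    return ll


def lr_statistic(xs, n_max=200):
    """Return (LR statistic, N_hat); N_hat is math.inf if no grid point beats Poisson."""
    lo = max(1, max(xs))
    ll_pois = poisson_loglik(xs)
    best_ll, best_n = ll_pois, math.inf
    for N in range(lo, max(n_max, lo) + 1):  # assumed integer grid, capped at n_max
        ll = binom_profile_loglik(xs, N)
        if ll > best_ll:
            best_ll, best_n = ll, N
    return max(0.0, 2.0 * (best_ll - ll_pois)), best_n


def sample_poisson(lam, rng):
    """Knuth's multiplication method; adequate for small lam."""
    limit, k, prod = math.exp(-lam), 0, rng.random()
    while prod > limit:
        k += 1
        prod *= rng.random()
    return k


def mc_p_value(xs, n_sim=200, n_max=200, seed=0):
    """Monte Carlo p-value for H0: Poisson, simulating from the fitted Poisson."""
    rng = random.Random(seed)
    observed, _ = lr_statistic(xs, n_max)
    lam = sum(xs) / len(xs)
    hits = 0
    for _ in range(n_sim):
        sim = [sample_poisson(lam, rng) for _ in xs]
        if lr_statistic(sim, n_max)[0] >= observed:
            hits += 1
    return (hits + 1) / (n_sim + 1)


def analyse(xs, alpha=0.05, **kwargs):
    """Step 1: test H0. Step 2: estimate N only if H0 is rejected."""
    if mc_p_value(xs, **kwargs) < alpha:
        return "binomial", lr_statistic(xs)[1]
    return "poisson", sum(xs) / len(xs)


if __name__ == "__main__":
    # Hypothetical toy sample with variance well below its mean.
    print(analyse([5, 5, 4, 6, 5, 5, 4, 6, 5, 5]))
```

The grid cap n_max matters when the sample is only slightly underdispersed: the maximizing N may then lie beyond the cap, in which case this sketch reports the boundary instead.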

2. THE MAIN RESULTS

Consider n iid random variables [X.sub.1],..., [X.sub.n] whose common nondegenerate distribution F is either binomial or Poisson. …