Simulation Testing of Unbiasedness of Variance Estimators

Suppose that an estimator X and an estimator V of its variance are available. Is the estimator V unbiased?

This question is frequently one aspect of computer simulation studies. A large number, $N$, of independent replicates $(X_i, V_i)$ are generated, and the average value of the $V_i$'s, $\bar{V}$, is to be used to test the hypothesis $H_0\colon \theta_V = \sigma_X^2$.

If $\sigma_X^2$ and $\sigma_V^2$ were known, the test could be based on the statistic

(1) $Z_1 = \sqrt{N}\,(\bar{V} - \sigma_X^2)/\sigma_V$,

which under the null hypothesis has an asymptotic standard normal distribution. Typically, however, neither $\sigma_X^2$ nor $\sigma_V^2$ is known. The sample variance $S_V^2$ can be substituted for $\sigma_V^2$ in the denominator of (1) without changing the limiting distribution of $Z_1$; it would seem natural to substitute $S_X^2$ for $\sigma_X^2$ in the numerator of (1) as well, and to consider the statistic

(2) $Z = \sqrt{N}\,(\bar{V} - S_X^2)/S_V$

on the presumption that since $S_X^2$ is a consistent estimator of $\sigma_X^2$, this statistic is a good approximation to $Z_1$ and should have the same limiting distribution. This, in fact, is not the case.

In Section 1, I show that Z is asymptotically distributed as a mean-zero normal random variable, but with nonunit variance. In Section 2, two examples illustrate the effect of incorrectly assuming that Z has a standard normal distribution.
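A small Monte Carlo sketch can make the nonunit variance concrete. The setup is an assumption chosen for illustration and is not claimed to be one of the article's examples: each replicate draws n standard normal observations, takes X to be their sample mean, and takes V to be the usual estimate s^2/n, so that V is unbiased for the variance of X. Z is taken to be $\sqrt{N}\,(\bar{V} - S_X^2)/S_V$, the form suggested by the description of (2). The function names are hypothetical.

```python
import math
import random


def replicate(n, rng):
    """One replicate (X, V): X is the mean of n N(0,1) draws, V = s^2 / n."""
    ys = [rng.gauss(0.0, 1.0) for _ in range(n)]
    x = sum(ys) / n
    s2 = sum((y - x) ** 2 for y in ys) / (n - 1)
    return x, s2 / n


def sample_variance(values):
    """Unbiased sample variance (divisor len(values) - 1)."""
    m = sum(values) / len(values)
    return sum((v - m) ** 2 for v in values) / (len(values) - 1)


def naive_z(n_rep, n, rng):
    """Naive statistic sqrt(N) * (Vbar - S_X^2) / S_V from N = n_rep replicates."""
    pairs = [replicate(n, rng) for _ in range(n_rep)]
    xs = [p[0] for p in pairs]
    vs = [p[1] for p in pairs]
    v_bar = sum(vs) / n_rep
    return math.sqrt(n_rep) * (v_bar - sample_variance(xs)) / math.sqrt(sample_variance(vs))


def empirical_variance_of_z(n_sim=400, n_rep=200, n=5, seed=12345):
    """Repeat the whole simulation study n_sim times; return the variance of Z."""
    rng = random.Random(seed)
    zs = [naive_z(n_rep, n, rng) for _ in range(n_sim)]
    return sample_variance(zs)


if __name__ == "__main__":
    print(empirical_variance_of_z())
```

For this particular setup a direct calculation suggests a limiting variance of about n (here 5) rather than 1, so a test that refers Z to standard normal critical values would reject a true null hypothesis far more often than its nominal level.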

Throughout this article, $S^2$, $\sigma^2$, and $\theta$ denote the sample variance, true variance, and mean of the random variable that appears as their subscript (for example, $\sigma_X^2$ is the variance of $X$). In addition, $\tau^2$ denotes the difference between the fourth central moment and the square of the variance, so that $\tau_X^2 = E(X - \theta_X)^4 - \sigma_X^4$. All of these parameters are assumed to be finite.
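As a quick worked illustration of this notation (the normal distribution is an assumption chosen for convenience, not an example taken from the article): if a random variable $Y$ is normal with variance $\sigma_Y^2$, its fourth central moment is $3\sigma_Y^4$, so

```latex
\tau_Y^2 = E(Y - \theta_Y)^4 - \sigma_Y^4 = 3\sigma_Y^4 - \sigma_Y^4 = 2\sigma_Y^4 .
```

This is the familiar large-sample variance of $\sqrt{N}\,(S_Y^2 - \sigma_Y^2)$ under normality.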

1. ASYMPTOTIC DISTRIBUTION OF Z

Straightforward algebraic manipulations yield

[MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII],

where [MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII]

and

[MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII]

Under $H_0\colon \theta_V = \sigma_X^2$, the central limit theorem implies that $Z_1$ is asymptotically distributed as a standard normal variate. $Z_2$ and $Z_3$ also have asymptotic standard normal distributions (this consequence of the U-statistic theorem is problem 3.2.6, p. 76 of Randles and Wolfe 1979).

Since $\alpha_N$ and $\beta_N$ converge in probability to 1 and 0, respectively, several applications of Slutzky's lemma (Bickel and Doksum 1977, p. 461) allow the conclusion that $Z$ has the same limiting distribution as

$Z_1 - k_1 Z_2$,

where $k_1 = \tau_X/\sigma_V$.

Note that $Z_1$ and $Z_2$ are not independent; their covariance is

$k_2 = \operatorname{Cov}\bigl(V, (X - \theta_X)^2\bigr)/(\sigma_V \tau_X)$.

The asymptotic distribution of $(Z_1, Z_2)$ is bivariate normal with zero means, unit variances, and covariance equal to $k_2$ (see problem 73, p. 405 of Lehmann 1975). Thus $Z_1 - k_1 Z_2$ (and consequently $Z$) is asymptotically normally distributed with mean zero and variance given by

(3) $1 + k_1^2 - 2 k_1 k_2$.

Expressed another way, the asymptotic variance of $\sqrt{N}\,(\bar{V} - S_X^2)$ is

(4) $\sigma_V^2 + \tau_X^2 - 2 k_2 \sigma_V \tau_X$,

which can be consistently estimated using moment estimators. …
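One way such a moment-based correction might look in practice is sketched below. It assumes that the asymptotic variance of $\sqrt{N}\,(\bar{V} - S_X^2)$ can be written as $\sigma_V^2 + \tau_X^2 - 2\,\mathrm{Cov}(V, (X - \theta_X)^2)$ and replaces each term by a sample moment. This form, the function names `z_statistics` and `demo`, and the normal-mean demo data are all assumptions made for illustration; this is not the article's code.

```python
import math
import random


def z_statistics(xs, vs):
    """Return (naive Z, moment-corrected Z) from replicates (X_i, V_i)."""
    n_rep = len(xs)
    x_bar = sum(xs) / n_rep
    v_bar = sum(vs) / n_rep
    dx2 = [(x - x_bar) ** 2 for x in xs]      # squared deviations of X
    m2_x = sum(dx2) / n_rep                   # second central moment of X
    s2_x = sum(dx2) / (n_rep - 1)             # sample variance of X
    s2_v = sum((v - v_bar) ** 2 for v in vs) / (n_rep - 1)
    # Moment estimate of tau_X^2: fourth central moment minus squared variance.
    tau2_x = sum(d * d for d in dx2) / n_rep - m2_x ** 2
    # Moment estimate of Cov(V, (X - theta_X)^2).
    cov = sum((v - v_bar) * (d - m2_x) for v, d in zip(vs, dx2)) / n_rep
    # Assumed form of the asymptotic variance of sqrt(N) * (Vbar - S_X^2).
    avar = s2_v + tau2_x - 2.0 * cov
    numerator = math.sqrt(n_rep) * (v_bar - s2_x)
    return numerator / math.sqrt(s2_v), numerator / math.sqrt(avar)


def demo(n_rep=2000, n=5, seed=2024):
    """Illustrative data: X = mean of n N(0,1) draws, V = s^2 / n."""
    rng = random.Random(seed)
    xs, vs = [], []
    for _ in range(n_rep):
        ys = [rng.gauss(0.0, 1.0) for _ in range(n)]
        x = sum(ys) / n
        xs.append(x)
        vs.append(sum((y - x) ** 2 for y in ys) / (n - 1) / n)
    return z_statistics(xs, vs)


if __name__ == "__main__":
    naive, corrected = demo()
    print(f"naive Z = {naive:.3f}, corrected Z = {corrected:.3f}")
```

In this assumed setup the corrected statistic is smaller in magnitude than the naive one by a factor of roughly $1/\sqrt{n}$, which is what one would expect if the naive statistic has variance near n instead of 1.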