Distribution of Rankings for Groups Exhibiting Heteroscedasticity and Correlation

Article excerpt

1. INTRODUCTION

Consider a population consisting of p groups, for which [theta] is a p vector of constants with [[theta].sub.i] describing the ith group, and suppose that there is interest in the relative magnitudes of [[theta].sub.1],...,[[theta].sub.p]. This is a situation arising frequently in the analysis of two-way data. The constants are presumed to be unobservable and are estimated by some [[theta].sub.1],...,[[theta].sub.p], which have statistical rankings [r.sub.1],...,[r.sub.p], defined (in terms of ascending order) as the smallest numbers in the set {1,..., p} such that [r.sub.i]<[r.sub.j] if [[theta].sub.i]<[[theta].sub.j], [r.sub.i] = [r.sub.j] if [[theta].sub.i] = [[theta].sub.j], for i, j = 1,...,p.
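The ranking definition above assigns tied estimates a common rank and uses the smallest integers that preserve the ordering, which corresponds to what is often called dense ranking. The following is a minimal sketch in Python with NumPy; the function name dense_ranks and the example values are illustrative assumptions, not part of the article.

```python
import numpy as np

def dense_ranks(estimates):
    """Return ascending ranks in {1,...,p}; tied estimates share a rank."""
    values = np.asarray(estimates, dtype=float)
    # np.unique sorts the distinct values; the inverse index gives, for each
    # estimate, the position of its value among those distinct values.
    _, inverse = np.unique(values, return_inverse=True)
    return inverse.reshape(values.shape) + 1

# Hypothetical estimates for p = 4 groups; groups 1 and 3 are tied.
example_ranks = dense_ranks([0.3, 1.2, 0.3, 2.5])
print(example_ranks)  # [1 2 1 3]
```

The tied groups both receive rank 1, and the next distinct value receives rank 2 rather than 3, so no rank is larger than it needs to be.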

The probability distribution of rankings, given by P([r.sub.i] = j), i, j = 1,..., p, is useful in describing the success likelihood of competitors in racing events, and variations on this basic theme are common in many fields, including economics [e.g., bidding on contracts (see, e.g., Engel, Fischer, and Galetovic 2001) and global economic competition (Amato and Amato 2001)] and biology [rival contests in mating (Radcliffe and Rass 1998)]. Of particular interest are the ranking probabilities that describe the likelihood of outcomes in some future experiment for groups that may exhibit heterogeneity both in the parameters [[theta].sub.i] of interest and in other parameters (variances) and may exhibit mutual dependence.

Ranking probabilities are described for the case in which the estimator of [theta] is well approximated by a normal distribution, in a finite sample or asymptotically, with attention given to intergroup heteroscedasticity and correlation. Hence interest is not only in the group constants [[theta].sub.1],...,[[theta].sub.p], but also in the joint contribution of these and other parameters (variances and covariances) to the ranking distribution. In comparison, a great deal of previous work has focused on the problem of multiple inferences and simultaneous confidence intervals for [[theta].sub.1],...,[[theta].sub.p] (Hochberg and Tamhane 1987; Toothaker 1993; Hsu 1996). Other work on the selection problem has developed and evaluated methods for picking the best group (in population terms) using sample statistics (Bechhofer, Elmaghraby, and Morse 1959; Gibbons, Olkin, and Sobel 1977; Gupta and Panchapakesan 1979; Lam 1989; Hoppe 1993; Mukhopadhyay and Solanky 1994; Bechhofer, Santner, and Goldsman 1995). The selection literature largely deals with the case in which group statistics [[theta].sub.1],...,[[theta].sub.p] are independent and differ in distribution only in [[theta].sub.1],...,[[theta].sub.p], with the focus on [theta] and decision procedures related to [theta]. Recently, Gupta and Miescke (1988), Nelson and Matejcik (1995), Nelson, Swann, Goldsman, and Song (2001), and Kim and Nelson (2001) have developed selection methods that allow for heteroscedastic and correlated groups.

The range of possible ranking distributions is quite diverse under normality and is analytically complex. By exploiting symmetries and the elliptical geometry of the normal distribution, the range of feasible ranking probabilities can be partially characterized under the hypothesis of equal population rankings ([[theta].sub.1] = [[theta].sub.2] = ... = [[theta].sub.p]) and under the alternative. The presence of intergroup heteroscedasticity and/or correlation greatly increases the range of feasible ranking probabilities under each hypothesis. A full characterization of the possible means and covariances consistent with a given ranking distribution is beyond the scope of this article.
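The effect of heteroscedasticity under equal group constants can be illustrated by simulation. The sketch below is not the article's analytical approach; it is a Monte Carlo approximation of P([r.sub.i] = j) for a normal estimator, written in Python with NumPy. The function name ranking_probabilities, the number of draws, and the covariance matrices are all assumptions chosen for illustration.

```python
import numpy as np

def ranking_probabilities(mean, cov, n_draws=200_000, seed=0):
    """Monte Carlo estimate of P(r_i = j) when the estimator is N(mean, cov).

    Entry (i, j) of the result estimates the probability that group i + 1
    receives ascending rank j + 1. Ties have probability zero under a
    nonsingular normal law, so ordinary ranks are used.
    """
    rng = np.random.default_rng(seed)
    mean = np.asarray(mean, dtype=float)
    p = mean.size
    draws = rng.multivariate_normal(mean, cov, size=n_draws)
    # argsort applied twice turns each simulated row into 0-based ranks.
    ranks = draws.argsort(axis=1).argsort(axis=1)
    probs = np.empty((p, p))
    for i in range(p):
        probs[i] = np.bincount(ranks[:, i], minlength=p) / n_draws
    return probs

equal_means = np.zeros(3)  # hypothesis of equal group constants, p = 3

# Assumed covariances: identical variances versus one high-variance group.
probs_homo = ranking_probabilities(equal_means, np.eye(3))
probs_hetero = ranking_probabilities(equal_means, np.diag([1.0, 1.0, 9.0]))

print(np.round(probs_homo, 3))
print(np.round(probs_hetero, 3))
```

With equal variances every entry should be close to 1/3. With the assumed variances (1, 1, 9), the third group is pushed toward the extreme ranks even though all group constants are equal: its probability of ranking first (or last) is 1/4 + arcsin(0.9)/(2 pi), about 0.428, and the simulated values should be close to that.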

Estimating the ranking distribution requires an historical sample that is suitably large relative to the planned sample. In addition to nonparametric methods based on historical event frequencies, a class of parametric estimators is proposed that includes a "plug-in" method, in which historical moment estimates are substituted for population moments in generating normal ranking probabilities, as well as methods that incorporate Bayesian uncertainty about the moment values. …
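As a rough illustration of the two kinds of estimator mentioned above, the following Python sketch compares historical rank frequencies with a plug-in estimate obtained by simulating from a normal distribution whose mean and covariance are the historical sample moments. It assumes, for simplicity, that the planned experiment yields one observation per group with the same distribution as each historical row. The function names and the synthetic historical data are hypothetical and are not taken from the article.

```python
import numpy as np

def rank_frequencies(samples):
    """Tabulate how often each column (group) attains each ascending rank."""
    samples = np.asarray(samples, dtype=float)
    n, p = samples.shape
    ranks = samples.argsort(axis=1).argsort(axis=1)  # 0-based ranks per row
    return np.stack([np.bincount(ranks[:, i], minlength=p) / n
                     for i in range(p)])

def plug_in_probabilities(history, n_draws=100_000, seed=0):
    """Substitute historical moments into a normal model and simulate ranks."""
    history = np.asarray(history, dtype=float)   # shape (n_hist, p)
    mean_hat = history.mean(axis=0)              # estimated group means
    cov_hat = np.cov(history, rowvar=False)      # p x p sample covariance
    rng = np.random.default_rng(seed)
    draws = rng.multivariate_normal(mean_hat, cov_hat, size=n_draws)
    return rank_frequencies(draws)

# Synthetic "historical" sample; these parameter values are assumptions.
data_rng = np.random.default_rng(1)
history = data_rng.multivariate_normal(
    [0.0, 0.5, 1.0], np.diag([1.0, 1.0, 4.0]), size=500)

nonparametric = rank_frequencies(history)   # historical event frequencies
plug_in = plug_in_probabilities(history)    # normal plug-in estimate

print(np.round(nonparametric, 3))
print(np.round(plug_in, 3))
```

The sketch ignores uncertainty in the estimated moments, which is the gap that the Bayesian variants mentioned above are meant to address.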