# Quantile Comparison Functions in Two-Sample Problems, with Application to Comparisons of Diagnostic Markers

Academic journal article
**By Li, Gang; Tiwari, Ram C.; Wells, Martin T.**

*Journal of the American Statistical Association*, Vol. 91, No. 434

**Publication:** Journal of the American Statistical Association

**Date:** June 1996

**Volume/issue:** Vol. 91, No. 434

## Article excerpt

1. INTRODUCTION

Nonparametric two-sample inference procedures are useful in comparing the responses of treatment and control groups. Additionally, in medical follow-up studies, the data are usually subject to censoring or truncation. Many of the existing two-sample procedures are constructed to detect differences in location. But the median does not determine the entire distribution. One can easily envision an example where the medians of two distributions agree but the tails of the two distributions are quite different; such is the case with the ovarian carcinoma data of Fleming, O'Fallon, O'Brien, and Harrington (1980). In this article we construct two-sample procedures by comparing a single quantile, a finite set of quantiles, and the entire quantile functions of two distributions using a vertical quantile comparison function. The methods are applicable to a variety of sampling schemes, including those with right censoring and left truncation.

Suppose that F and G are absolutely continuous distribution functions with common support. The two-sample vertical quantile comparison function and the vertical shift function are defined by

$$Q(p) = G \circ F^{-1}(p), \quad 0 \le p \le 1 \tag{1}$$

and

$$\theta(p) = G \circ F^{-1}(p) - p, \quad 0 \le p \le 1, \tag{2}$$

where here and throughout, for any nondecreasing function $\psi$, the inverse function $\psi^{-1}$ is defined to be the right-continuous version given by

$$\psi^{-1}(t) = \sup\{u : \psi(u) \le t\}. \tag{3}$$
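To make definitions (1) through (3) concrete, the following is a minimal sketch of plug-in estimates for uncensored samples, obtained by replacing F and G with empirical distribution functions. The function names (`ecdf`, `right_continuous_inverse`, `q_hat`, `theta_hat`) and the toy data are illustrative assumptions, not part of the article, and the sketch does not handle the censored or truncated sampling schemes the article goes on to treat.

```python
from bisect import bisect_right
from math import inf


def ecdf(sample):
    """Return the empirical distribution function of `sample` as a callable."""
    xs = sorted(sample)
    n = len(xs)

    def cdf(x):
        # Fraction of observations less than or equal to x.
        return bisect_right(xs, x) / n

    return cdf


def right_continuous_inverse(sample):
    """Inverse of the empirical CDF in the sense of (3): sup{u : F_n(u) <= t}."""
    xs = sorted(sample)
    n = len(xs)

    def inv(t):
        if t < 0:
            return -inf  # the set {u : F_n(u) <= t} is empty
        # Largest count k such that the CDF level k/n does not exceed t.
        k = sum(1 for c in range(1, n + 1) if c / n <= t)
        # F_n stays at level k/n up to (not including) the next order statistic.
        return inf if k == n else xs[k]

    return inv


def q_hat(control, treatment, p):
    """Plug-in estimate of Q(p): treatment ECDF evaluated at the control p-th quantile."""
    return ecdf(treatment)(right_continuous_inverse(control)(p))


def theta_hat(control, treatment, p):
    """Plug-in estimate of the vertical shift function Q(p) - p."""
    return q_hat(control, treatment, p) - p


# Hypothetical toy data: `control` plays the role of F, `treatment` of G.
control = [1, 2, 3, 4]
treatment = [2.5, 3.5, 4.5, 5.5]
print(q_hat(control, treatment, 0.5))      # 0.25
print(theta_hat(control, treatment, 0.5))  # -0.25
```

With these toy values, the right-continuous inverse places the control median at 3 (the upper end of the flat stretch of the empirical distribution), and one quarter of the treatment observations fall at or below it.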

Note that the percentile-percentile (P-P) plot can be represented in the functional form (1) (see, e.g., Holmgren 1995). If F and G denote the distributions of the survival times under the control and the experimental treatment, then $1 - Q(1/2)$ is the probability of survival under the experimental treatment beyond the control median, $F^{-1}(1/2)$. For a sufficiently large value of this probability, say greater than 1/2, the experimenter may either recommend use of the experimental treatment or decide to investigate its beneficial effect further. The parameter $Q(1/2)$ has been used in the control median test studied by Chakraborti and Mukerjee (1989), Gart (1963), Gastwirth (1968), Gastwirth and Wang (1988), Hettmansperger (1973), Hettmansperger and Malin (1975), and Mathisen (1943), among others. In the absence of censoring, based on samples of sizes m and n from G and F, Gastwirth (1968) and Chakraborti and Mukerjee (1989) showed that for some fixed p, $0 \le p \le 1$,

[Mathematical Expression Omitted],

where $G_m$ and $F_n$ are the empirical distributions of the two samples,

[Mathematical Expression Omitted],

$\lambda = \lim_{m,n \to \infty}(m/n)$, a fixed quantity, and

$$q(p) = \frac{dQ(p)}{dp} = \frac{g(F^{-1}(p))}{f(F^{-1}(p))},$$

with f and g denoting the density functions of F and G. For testing the null hypothesis $H_0$: $F = G$ versus $H_1$: $F \ne G$, for large m and n, the control median test rejects $H_0$ in favor of $H_1$ if

[Mathematical Expression Omitted],

where $Z(\alpha/2)$ is the upper $\alpha/2$ percentile point of a standard normal distribution and [Mathematical Expression Omitted] is a consistent estimate of $\sigma_0(p)$. (For such an estimate [Mathematical Expression Omitted], see Chakraborti and Mukerjee 1989.) Gastwirth and Wang (1988) developed the censored version of the control percentile test and obtained its asymptotic distribution using the results of Lo and Singh (1985) on the Bahadur representation of quantiles for censored data. They gave a consistent estimator of the null variance. Our results are based on a weak convergence result, not on Lo and Singh's (1985) representation, which yields only pointwise limit theorems. Furthermore, it is not clear how to extend Gastwirth and Wang's results to the multiple quantile problem that we discuss in Sections 3. …
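As a rough illustration of the uncensored control percentile test described above, the sketch below compares the fraction of treatment observations at or below the control p-th quantile with p and standardizes the difference. The displayed test statistic and variance estimate are omitted from this excerpt, so the standard error used here is an assumption: under F = G we have Q(p) = p and q(p) = 1, and a delta-method argument suggests an approximate variance of p(1 - p)(1 + m/n)/m for the plug-in estimate. The function name `control_percentile_test` and the data in the usage note are hypothetical, and this is not presented as the authors' procedure.

```python
from bisect import bisect_right
from math import sqrt
from statistics import NormalDist


def control_percentile_test(control, treatment, p=0.5, alpha=0.05):
    """Two-sided large-sample test of F = G based on the plug-in estimate of Q(p).

    `control` is a sample of size n from F and `treatment` a sample of size m
    from G, both uncensored. Returns (estimate of Q(p), z statistic, reject flag).
    """
    if not 0 < p < 1:
        raise ValueError("p must lie strictly between 0 and 1")
    xs, ys = sorted(control), sorted(treatment)
    n, m = len(xs), len(ys)

    # Control p-th quantile via the right-continuous inverse of the control ECDF.
    k = sum(1 for c in range(1, n + 1) if c / n <= p)
    xi_hat = xs[k]

    # Fraction of treatment observations at or below that quantile.
    q_est = bisect_right(ys, xi_hat) / m

    # ASSUMPTION (not taken from the excerpt): null standard error from the
    # approximate variance p(1 - p)(1 + m/n)/m, valid only under F = G.
    se0 = sqrt(p * (1 - p) * (1 + m / n) / m)
    z = (q_est - p) / se0

    # Upper alpha/2 point of the standard normal distribution.
    crit = NormalDist().inv_cdf(1 - alpha / 2)
    return q_est, z, abs(z) > crit
```

For example, with `control = list(range(1, 21))` and a treatment sample shifted far to the right, such as `[x + 100 for x in control]`, no treatment observation falls below the control median, the estimate of Q(1/2) is 0, and the sketch rejects at the 5% level. Using the fixed null standard error rather than a consistent estimate is a simplification made for this illustration.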