Nonparametric Estimation and Regression Analysis with Left-Truncated and Right-Censored Data

Article excerpt

In many prospective and retrospective studies, survival data are subject to left truncation in addition to the usual right censoring. For left-truncated data without covariates, only the conditional distribution of the survival time Y given Y ≥ τ can be estimated nonparametrically, where τ is the lower boundary of the support of the left-truncation variable T. If the data are also right censored, then the conditional distribution can be consistently estimated only at points not larger than τ*, where τ* is the upper boundary of the support of the right-censoring variable C. In this article we first consider nonparametric estimation of trimmed functionals of the conditional distribution of Y, with the trimming inside the observable range between τ and τ*. We then extend the approach to regression analysis and curve fitting in the presence of left truncation and right censoring on the response variable Y. Asymptotic normality of M estimators of the regression parameters derived from this approach is established, and the result is used to construct confidence regions for the regression parameters. We also apply our methods of nonparametric estimation, correlation analysis, and curve fitting for left-truncated and right-censored data to analyze transfusion-induced AIDS data, and present a simulation study comparing our approach with another kind of M estimator for regression analysis in the presence of left truncation and right censoring.

KEY WORDS: Bias correction; Bootstrap; Censoring; M estimator; Regression; Synthetic data; Truncation.


Recent interest has focused on survival data that are both truncated on one side and censored on the other. Left-truncated data that are also right censored arise in prospective studies of the natural history of a disease, where a sample of individuals who have experienced an initial event E₁ (such as being diagnosed as having the disease) enter the study at chronological time t′. The study is terminated at time t* > t′. For each individual, the survival time of interest is Y = t₂ - t₁, where t₂ is the chronological time of occurrence of a second event E₂ (such as death) and t₁ is the chronological time of occurrence of event E₁. Because subjects cannot be recruited into the study unless t₂ ≥ t′, Y is truncated on the left by T = t′ - t₁ (i.e., a subject is observed only if Y ≥ T), and then only the minimum Y ∧ C of Y and the right-censoring variable C = t* - t₁ is observed, along with T and the indicator I(Y ≤ C) of being uncensored. Individuals in the study population who experienced E₂ as well as E₁ prior to the initiation of the study are not observed at all in the study. Such left-truncated and right-censored (LTRC) data have been studied by Hyde (1977); Tsai, Jewell, and Wang (1987); Keiding, Holst, and Green (1989); Wang (1991); Lai and Ying (1991a,b); Gross and Huber-Carol (1992); and Andersen, Borgan, Gill, and Keiding (1993), among others.
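As a rough illustration of the sampling scheme just described, the following Python sketch simulates LTRC observations (T, Y ∧ C, I(Y ≤ C)) from a prospective study. The distributions (onset times uniform over a hypothetical 10-unit window before t′, exponential survival times), the function names, and all parameter values are assumptions made for illustration only and do not come from the article. The second function computes the standard product-limit estimate for LTRC data, in which the risk set at time u contains the subjects with T ≤ u ≤ Y ∧ C. It is included as a familiar reference point, not as the trimmed-functional or M-estimation methodology of the article.

```python
import random


def simulate_ltrc(n, t_start, t_end, mean_survival, seed=0):
    """Draw n observed subjects under the prospective design.

    Illustrative assumptions (not from the article): E1 occurs at a
    chronological time t1 uniform on [t_start - 10, t_start], and the
    survival time Y = t2 - t1 is exponential with the given mean.
    Returns a list of triples (T, X, delta) with X = min(Y, C).
    """
    rng = random.Random(seed)
    sample = []
    while len(sample) < n:
        t1 = rng.uniform(t_start - 10.0, t_start)
        y = rng.expovariate(1.0 / mean_survival)
        trunc = t_start - t1      # T = t' - t1
        cens = t_end - t1         # C = t* - t1
        if y < trunc:
            continue              # E2 happened before t': never observed
        x = min(y, cens)          # only Y ^ C is observed
        delta = 1 if y <= cens else 0
        sample.append((trunc, x, delta))
    return sample


def product_limit(sample):
    """Product-limit estimate with risk sets {j : T_j <= u <= X_j}.

    Returns (u, S(u)) at each distinct uncensored time u. The estimate
    refers to the conditional distribution of Y given Y >= the smallest
    observed truncation time.
    """
    event_times = sorted({x for (_, x, d) in sample if d == 1})
    surv = 1.0
    curve = []
    for u in event_times:
        at_risk = sum(1 for (t, x, _) in sample if t <= u <= x)
        deaths = sum(1 for (_, x, d) in sample if d == 1 and x == u)
        surv *= 1.0 - deaths / at_risk
        curve.append((u, surv))
    return curve


if __name__ == "__main__":
    data = simulate_ltrc(200, t_start=0.0, t_end=8.0, mean_survival=5.0, seed=1)
    for u, s in product_limit(data)[:5]:
        print(round(u, 3), round(s, 3))
```

The quadratic-time risk-set count keeps the sketch short; a sorted sweep would be preferable for large samples. Note that with few subjects at risk at small u, the product-limit estimate can drop sharply or reach zero early, which is one practical motivation for restricting attention to a range inside the observable interval.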

Right-truncated data occur naturally in retrospective studies of a disease when the only data available are those pertaining to individuals who have experienced E₂ prior to t*. Here recruitment of individuals into the study occurs only at the time t₂ of occurrence of the second event E₂ if t₂ ≤ t*. At time t₂, the time t₁ of occurrence of the initial event E₁ is retrospectively ascertained. The survival time of interest is Y = t₂ - t₁, which is observed only if Y ≤ T, with T = t* - t₁ being the right-truncation variable. Kalbfleisch and Lawless (1989) have given many examples of such data, including studies of AIDS incubation times. …
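The retrospective scheme can be sketched in the same way. The Python snippet below keeps a subject only when Y ≤ T with T = t* - t₁, which makes visible why the observed survival times are systematically shorter than those in the underlying population. As before, the uniform onset window, the exponential survival distribution, the function name, and the parameter values are hypothetical choices for illustration and are not taken from the article.

```python
import random


def simulate_right_truncated(n, t_end, window, mean_survival, seed=0):
    """Draw n observed (Y, T) pairs under right truncation.

    Illustrative assumptions (not from the article): E1 occurs at a time
    t1 uniform on [t_end - window, t_end], and Y is exponential with the
    given mean. A subject is recorded only if E2 occurs by t_end.
    """
    rng = random.Random(seed)
    observed = []
    while len(observed) < n:
        t1 = rng.uniform(t_end - window, t_end)
        y = rng.expovariate(1.0 / mean_survival)
        trunc = t_end - t1        # T = t* - t1
        if y <= trunc:            # t2 = t1 + Y <= t*: subject is recruited
            observed.append((y, trunc))
    return observed


if __name__ == "__main__":
    pairs = simulate_right_truncated(2000, t_end=0.0, window=10.0,
                                     mean_survival=5.0, seed=1)
    naive_mean = sum(y for y, _ in pairs) / len(pairs)
    print("naive mean of observed Y:", round(naive_mean, 3))
```

Comparing the naive mean of the observed Y with the population mean used in the simulation shows the downward selection bias that any analysis of right-truncated data has to correct for.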