Semiparametric Bayesian Analysis of Survival Data

1. INTRODUCTION

In medical studies, interest often centers on the relation between survival time (the time to a particular event or "failure" of interest, such as death or the occurrence of a disease) and explanatory variables (covariates such as treatments and design variables). Statisticians encounter several kinds of survival data in medical studies. Descriptions and examples of some of these types of survival data (often called event-time data) are given in the next section.

In this article we focus on semiparametric Bayesian models and the associated Bayesian statistical methods. The semiparametric nature of the models allows considerable generality and applicability while retaining enough structure for useful physical interpretation and understanding of particular applications in medical research. The popularity of semiparametric approaches for analyzing univariate survival data began with the seminal paper of Cox (1972a) on the proportional hazards model. Following the path of Cox and the survival analysts after him, we concentrate on semiparametric survival models based mainly on hazard functions and intensity functions. The advantages of models based on hazard or intensity functions are explained briefly in the popular book on survival analysis by Cox and Oakes (1984), as are the conventional frequentist methods for analyzing right-censored univariate survival data.
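
For reference, the proportional hazards model of Cox (1972a) is usually written in the form below. The notation (h, h_0, x, \beta) is introduced here for illustration and is not taken from the excerpt.

```latex
h(t \mid x) = h_0(t)\,\exp(x^{\top}\beta),
\qquad
S(t \mid x) = \exp\!\left\{-\exp(x^{\top}\beta)\int_0^t h_0(u)\,du\right\}
```

Here h_0 is an unspecified baseline hazard function (the nonparametric part), \beta is a finite-dimensional vector of regression coefficients (the parametric part), and S(t | x) is the implied survival function for an individual with covariate vector x.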

The motivating examples in Section 2 illustrate the challenges that typical modern survival data problems pose to statisticians: they call for complex models and for computational tools sophisticated enough to deal with incomplete data and other complex data structures. Semiparametric Bayesian methodologies in survival analysis, although comparatively new, have recently attracted considerable attention from statisticians because of their potential to fit such complex models.

Section 3 reviews recently developed semiparametric models for the complex survival data structures and designs described in Section 2. Section 4 gives a brief review of recently developed frequentist methods for analyzing these complex survival datasets, along with their shortcomings and the difficulties of making inferences in these situations.

In semiparametric Bayes methods, the nonparametric part of a semiparametric model is assumed to be a realization of a stochastic process that summarizes the available prior information about the unknown function. The parametric part is assumed to have a prior distribution with possibly unknown hyperparameters. Section 5 discusses the prior processes commonly used to model the nonparametric part (an unknown function such as a hazard or an intensity function). Section 6 briefly discusses recent advances in Bayesian computation that may be used to study the complex survival models and data structures encountered in practice. Section 7 presents suitable semiparametric Bayesian models for the examples considered here, built from the prior processes of Section 5, and analyzes the different types of survival data under these models using the computational tools of Section 6.
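
The two-part prior structure described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, none of which come from the excerpt: the nonparametric part is taken to be a cumulative baseline hazard with a gamma process prior (discretized on a grid, with an exponential-type prior mean `rate * t` and precision `c`), the parametric part is a single regression coefficient with a normal prior, and the two are combined through a proportional hazards form. All function and parameter names are hypothetical.

```python
import math
import random


def sample_gamma_process(grid, prior_mean_cumhaz, c, rng):
    """Draw one cumulative baseline hazard path on `grid`.

    Assumed prior: a gamma process with independent increments
    H0(t_k) - H0(t_{k-1}) ~ Gamma(shape = c * (M(t_k) - M(t_{k-1})), rate = c),
    where M is the prior mean cumulative hazard and c controls how tightly
    draws concentrate around M.
    """
    path, total, prev_t = [], 0.0, 0.0
    for t in grid:
        shape = c * (prior_mean_cumhaz(t) - prior_mean_cumhaz(prev_t))
        # random.gammavariate takes (shape, scale); scale = 1 / rate.
        total += rng.gammavariate(shape, 1.0 / c)
        path.append(total)
        prev_t = t
    return path


def sample_prior_survival(x, grid, rng, c=5.0, rate=0.1, beta_sd=1.0):
    """Draw (beta, survival curve) from the joint prior for covariate value x."""
    # Parametric part: regression coefficient with an assumed N(0, beta_sd^2) prior.
    beta = rng.gauss(0.0, beta_sd)
    # Nonparametric part: cumulative baseline hazard from the gamma process.
    cum_haz = sample_gamma_process(grid, lambda t: rate * t, c, rng)
    # Proportional hazards form: S(t | x) = exp(-H0(t) * exp(beta * x)).
    survival = [math.exp(-h * math.exp(beta * x)) for h in cum_haz]
    return beta, survival


rng = random.Random(0)
grid = [1.0, 2.0, 5.0, 10.0]
beta, surv = sample_prior_survival(x=1.0, grid=grid, rng=rng)
```

Each call produces one prior draw of the regression coefficient together with a survival curve evaluated on the grid. Repeating the call many times gives a rough picture of what the assumed prior implies about survival before any data are observed.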

Section 8 presents methods for eliciting the hyperparameters associated with the prior distributions and processes. Section 9 describes recently developed Bayesian methods for model assessment and model selection. It also demonstrates how these methods can be extended to the different examples and applied to verify important modeling assumptions, identify influential observations, and compare models. We also briefly discuss the advantages of these newly developed methods over other model assessment methods in the complex situations that arise in practice. Section 10 summarizes the article and concludes with a discussion of future research directions and potential new problems. …