Inference and Predictions from Poisson Point Processes Incorporating Expert Knowledge

1. INTRODUCTION

The nonhomogeneous Poisson point process (NHPP) has recently gained popularity in failure data analysis as a simple and versatile tool for assessing the growth or decay in reliability of a complex system and for predicting the life behavior of such systems. Special chapters in books such as those by Thompson (1988) and by Crowder, Kimber, Smith, and Sweeting (1991) have been devoted to this topic. But most of the existing inference and prediction procedures for the NHPP have been based on sample theory approaches, especially those involving the method of maximum likelihood and its associated large-sample theory (cf. Crow 1974; Crowder et al. 1991, pp. 167-174; Karr 1986, pp. 94-124). Bayesian approaches with meaningfully chosen priors would improve the analysis, because they can incorporate expert knowledge into the inferential mechanism. But the few Bayesian approaches that have been proposed (see, for example, Guida, Calabria, and Pulcini 1989 and Kyparisis and Singpurwalla 1984) are less than satisfactory, because their prior distributions are chosen more for mathematical convenience than on the basis of a realistically motivated argument. In this article we describe a Bayesian approach for inference and predictions from an NHPP using a formally developed framework for constructing prior distributions that incorporate expert knowledge and engineering information. Thus our proposed approach should prove superior to both the sample theory methods, which do not use expert information, and the existing Bayesian approaches, which do not fully exploit the knowledge that is available. A noteworthy feature of our approach is that the expert opinion is elicited on the mean value function of the process, irrespective of its functional form. This is preferable to eliciting opinion on the unobservable and abstract parameters of the process.

The idea of using expert knowledge and engineering information in a Bayesian analysis of time-to-failure models is not new (cf. Lindley and Singpurwalla 1986; Singpurwalla 1988). What is new here is the application of this idea to a new class of models, namely the point process models. To summarize, the methodological innovation of this article is the development of a formal mechanism for Bayesian inference in a general Poisson point process model using priors that are motivated by expert knowledge or engineering information on the mean value function of the process.

To illustrate the applicability of our approach, we consider two scenarios, one from software engineering and the other from the railroad industry. The first scenario pertains to the number of failures in a piece of software during its debugging. The predictive distribution of the number of failures is a necessary input to a model that determines the optimal length of time for which the software should be tested and debugged prior to its release (see, for example, Singpurwalla 1991). We model the failure-generating mechanism via an NHPP indexed over time. In the second scenario our focus of interest is the number of defects in a railroad track as a function of the cumulative load, in million gross tons (MGT), that it carries. The number of defects increases with the MGT carried, and the track is replaced when the number of defects becomes excessive. The optimal decision for track replacement is based on an economic analysis that balances costs and risks. An essential input to the economic analysis is the predictive distribution of the number of defects as a function of the MGT. Here we model the number of defects in a track segment via an NHPP indexed by the MGT.
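
As a reminder of why the mean value function is central to both scenarios: for an NHPP with mean value function Lambda, the number of events in an interval (s, t] is Poisson distributed with mean Lambda(t) - Lambda(s). The sketch below illustrates only this basic fact. The power-law form Lambda(t) = a * t**b, the function names, and the parameter values are assumptions chosen for illustration and are not taken from the article; the parameters are held fixed, so the result is a plug-in distribution rather than the Bayesian predictive distribution developed here.

```python
import math

def mean_value(t, a, b):
    """Hypothetical power-law mean value function Lambda(t) = a * t**b (an assumption)."""
    return a * t ** b

def count_pmf(k, s, t, a, b):
    """P(N(t) - N(s) = k) for an NHPP: Poisson with mean Lambda(t) - Lambda(s)."""
    mu = mean_value(t, a, b) - mean_value(s, a, b)
    return math.exp(-mu) * mu ** k / math.factorial(k)

# Illustrative values only: b < 1 gives a decreasing intensity (reliability growth).
a, b = 2.0, 0.5

# Lambda(9) - Lambda(4) = 6 - 4 = 2 expected events in the interval (4, 9].
probs = [count_pmf(k, 4.0, 9.0, a, b) for k in range(6)]
for k, p in enumerate(probs):
    print(f"P(N = {k}) = {p:.4f}")
```

Here the index t can be read as testing time in the software scenario or as cumulative MGT in the railroad scenario; only the interpretation of Lambda changes.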

In the two scenarios, expert knowledge or engineering information arises via the following considerations. With regard to the software engineering problem, expert information is provided by the software engineer's empirical knowledge about a particular piece of software. Such knowledge is generally based on the number of lines of code in the software, the number of modules, the data obtained when testing the individual modules, the operational profile, and other factors. …