Bayesian Prediction of Deterministic Functions, with Applications to the Design and Analysis of Computer Experiments

We are concerned here with the prediction of a function y on a domain T, given the function values at a set of "sites" $D = \{t^{(i)} \in T,\ i = 1, \ldots, n\}$, which we are at liberty to select. We shall take T to be in $\mathbb{R}^k$ and y(t) to be in $\mathbb{R}^1$, although the primary elements of the approach can be described with more general T and y. The motivating application is the design and analysis of computer experiments (Sacks, Welch, Mitchell, and Wynn 1989), where t determines the input to a computer model of a physical or behavioral system and y(t) is a response that is part of the output or is calculated from it. We consider t to be fixed during any given run of the computer model, and we assume the function y(t) is deterministic: If the program is run twice (on the same computer) with the same value of t, the same value of y will result. In this context, the experiment design consists of the sites in D; the experiment itself consists of running the computer model n times, each time with input determined by a different member of D. Knowledge of the n design sites and the corresponding responses $y_1, \ldots, y_n$ is then used to predict y(t) at any desired $t \in T$. Interest in prediction derives from the fact that complex computer models often require long running times; the number of runs that can be made is therefore limited. We are concerned here with methods of prediction given D and with the choice of D.
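
The setup above can be sketched in a few lines of code. This is only an illustration: `computer_model` is a hypothetical stand-in for an expensive simulator, and the four design sites are chosen by hand for k = 2; neither comes from the application described here.

```python
import math

def computer_model(t):
    """Hypothetical stand-in for an expensive deterministic simulator."""
    return math.sin(2.0 * math.pi * t[0]) + t[1] ** 2

def run_experiment(model, design):
    """Run the model once per design site and return the responses y_D."""
    return [model(site) for site in design]

# Assumed design D: n = 4 sites in T = [0, 1]^2 (k = 2), chosen by hand.
D = [(0.0, 0.0), (0.25, 0.5), (0.5, 1.0), (0.75, 0.25)]
y_D = run_experiment(computer_model, D)

# Determinism: repeating a run at the same t reproduces the same y.
assert run_experiment(computer_model, D) == y_D
```

The prediction problem is then to use only `D` and `y_D` to estimate the model output at sites that were never run.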

Here we use a Bayesian formulation, under which (uncertain) knowledge about the function y is expressed by means of the random function Y. This usage has previously been applied to surface estimation in several contexts, including interpolation and, more recently, image restoration (Geman and Geman 1984; Ripley 1988, chap. 5). Random functions have been studied for a long time under the heading of stochastic processes, and we borrow notation and nomenclature from that source. In particular, we shall refer to the representations of prior and posterior knowledge of y as the prior and posterior processes. The posterior mean of Y given the n-vector of responses $y_D$ can be used as the prediction function $\hat{y}$; this is clearly an interpolating function, that is, $\hat{y}(t^{(i)}) = y(t^{(i)})$, for i = 1, ..., n. Bayesian interpolation has a long history; see Diaconis (1988) for an interesting account of a method published by Poincaré in 1896. More recently, Kimeldorf and Wahba (1970a) established the connection between Bayesian interpolation and smoothing splines, and Wahba (1978) provided more general results along the same lines. (See also Wahba 1990.)
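
The interpolating property of the posterior mean can be checked numerically. The following is a minimal sketch under assumptions not made in the text above: a zero-mean Gaussian prior process on a one-dimensional T, with a Gaussian correlation function and an arbitrarily chosen parameter `theta`. Under those assumptions the posterior mean is r(t)' R^{-1} y_D, where R holds the correlations among design sites and r(t) the correlations between t and the design sites. The function names and the test function are illustrative.

```python
import math

def corr(s, t, theta=10.0):
    """Gaussian correlation exp(-theta * (s - t)^2); an assumed choice."""
    return math.exp(-theta * (s - t) ** 2)

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - tail) / M[i][i]
    return x

def posterior_mean(sites, responses, theta=10.0):
    """Return y_hat(t) = r(t)' R^{-1} y_D for a zero-mean prior process."""
    R = [[corr(s, t, theta) for t in sites] for s in sites]
    weights = solve(R, responses)  # R^{-1} y_D, computed once
    def y_hat(t):
        return sum(w * corr(t, s, theta) for w, s in zip(weights, sites))
    return y_hat

# Illustrative design and deterministic responses (assumed test function).
sites = [0.0, 0.3, 0.6, 1.0]
responses = [math.sin(2.0 * math.pi * t) for t in sites]
y_hat = posterior_mean(sites, responses)

# The predictor reproduces the observed responses at the design sites.
for s, y in zip(sites, responses):
    assert abs(y_hat(s) - y) < 1e-9
```

Away from the design sites, `y_hat(t)` gives a prediction whose quality depends on how well the assumed correlation function matches the true function.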

On another front, the use of random functions to represent knowledge about deterministic functions observed with error is central to Bayesian regression methodology. Originally, the prior process was generated simply by assigning a joint prior distribution to the p coefficients in a standard linear regression model (Raiffa and Schlaifer 1961; Tiao and Zellner 1964; Lindley and Smith 1972). Chaloner (1984) reviewed much of the corresponding work in design of experiments. Because of their finite dimensionality, however, these priors are not well suited to prediction of deterministic functions where there is no observation error. In particular, knowledge of y at p suitably chosen sites is sufficient to predict y at all t with no uncertainty whatever; this seems unrealistic, and leads to obvious difficulties if n > p. We shall not consider finite-dimensional processes further here. Infinite-dimensional processes have been used as Bayesian priors for prediction in regression settings by Blight and Ott (1975) and Wahba (1978); O'Hagan (1978) and Steinberg (1985) used them to develop design criteria as well.
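
The degeneracy of finite-dimensional priors is easy to see in a toy case. The sketch below assumes p = 2, with Y(t) = b0 + b1 t and independent standard normal priors on the two coefficients; this choice is purely illustrative and is not taken from the works cited. The prior covariance of Y(s) and Y(t) is then 1 + st. After two error-free observations the posterior variance vanishes at every t, and with n = 3 > p sites the covariance matrix of the observations is singular, which is one way the n > p difficulty shows up.

```python
def prior_cov(s, t):
    """Covariance of Y(t) = b0 + b1*t when (b0, b1) ~ N(0, I): 1 + s*t."""
    return 1.0 + s * t

def posterior_variance(t, sites):
    """Posterior variance of Y(t) after error-free runs at p = 2 sites."""
    a, b = sites
    k_aa, k_ab, k_bb = prior_cov(a, a), prior_cov(a, b), prior_cov(b, b)
    det = k_aa * k_bb - k_ab * k_ab
    k_ta, k_tb = prior_cov(t, a), prior_cov(t, b)
    # Quadratic form k(t)' K^{-1} k(t), using the explicit 2x2 inverse.
    quad = (k_bb * k_ta**2 - 2.0 * k_ab * k_ta * k_tb + k_aa * k_tb**2) / det
    return prior_cov(t, t) - quad

def det3(K):
    """Determinant of a 3x3 matrix by cofactor expansion."""
    return (K[0][0] * (K[1][1] * K[2][2] - K[1][2] * K[2][1])
            - K[0][1] * (K[1][0] * K[2][2] - K[1][2] * K[2][0])
            + K[0][2] * (K[1][0] * K[2][1] - K[1][1] * K[2][0]))

# Two observations (n = p): no uncertainty remains, even far from the data.
var_far = posterior_variance(5.0, (0.2, 0.8))

# Three observations (n > p): the 3x3 covariance matrix has rank 2.
three_sites = [0.2, 0.5, 0.8]
K3 = [[prior_cov(s, t) for t in three_sites] for s in three_sites]
det_K3 = det3(K3)
```

Both `var_far` and `det_K3` are zero up to rounding error: the first reflects the unrealistic certainty noted above, and the second means the usual conditioning formula cannot be applied unless the three responses happen to lie exactly on a line.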

Another large body of work, with a long history and a slightly different philosophy, is based on the view of y as a realization of a stochastic process, that is, Y is taken as a model for y. …