Climatic variables such as annual mean precipitation and temperature display complex and nonlinear variation with latitude, longitude, and elevation. Neural networks are universal approximators and very good at detecting and representing nonlinear relationships between dependent and independent variables. In this paper we use resilient backpropagation (Rprop) neural networks to interpolate annual mean precipitation and temperature surfaces for China. Climate surfaces are interpolated from a total of 288 long-term climate station data points using latitude, longitude, and elevation derived from a 5-kilometer resolution digital elevation model. Initial trials of Rprop suggested very fast learning, insensitivity to selection of learning parameters, and a tendency not to overtrain. Cross-validation was used to determine the best network structure and assess the error inherent in climate interpolation. With the error explicit, the final neurointerpolations of annual mean precipitation and temperature were constructed using all 288 climate station data points. Maps of residuals are also presented. The neurointerpolation of temperature was very successful and captures most of the regional trends found in established climate maps of China as well as significant topographically defined detail. For annual mean temperature the Rprop neural network was found to be an accurate and robust global spatial interpolator. However, the precipitation surface captures only the major latitudinally and continentally defined trends and misses many subregional rainfall features, probably because of the influence of other nonparameterized atmospheric and topographic factors.
Accurate climate information with a good degree of spatial detail is very important for a wide variety of human activities such as agricultural and land-use planning, and for a range of scientific purposes such as understanding the distribution patterns of wild animals and plants. The detailed study of climate and climate-related phenomena in China goes back thousands of years, but mapping of these patterns is a relatively recent endeavor.
Climatic parameters such as annual mean temperature and precipitation vary complexly in three dimensions (latitude, longitude, and elevation) through the effects of latitude, lapse rates (temperature), continentality, and orography (precipitation). Climatic mapping in the past has often involved Gestalt interpretation by expert meteorologists, climatologists, and cartographers (see Genton and Furrer 1998) using not only climate station data but also general observations on topography, land use, and land cover that are known to be influenced by climate (see Editorial Committee of the Physical Geography of China 1995; Wu 1991).
The accurate and detailed mapping of climatic parameters such as annual mean precipitation and temperature has proven to be a difficult problem. Gestalt interpretations of regional climate surfaces by cartographers and climate experts have provided useful broad approximations of climatic parameters (Genton and Furrer 1998) and capture broad latitudinal and elevation-dominated trends. However, the degree of spatial detail of many of these maps is low as the influence of topography in causing local variation in climate is generalized. In recent years, alternative, quantitative, and complementary approaches to climate mapping have become commonplace using Geographic Information Systems (GIS). Many GIS allow the interpolation of data surfaces from spatially disparate point-based data such as climate station records (Burrough and McDonnell 1998). Databases of other variables that influence climate such as elevation can also be integrated within a GIS and are commonly available. However, spatial interpolation within most GIS is currently limited to two independent variables, latitude and longitude (cf. Mitasova et al. 1995). As a result, two-dimensional spatial interpolation of climate variables offers little improvement in capturing topographically defined spatial climatic detail.
Several numerical techniques including splines and regression have been developed specifically for climatic interpolation, which can incorporate topographic information such as slope and aspect and other climatic covariates with some success. Artificial neural networks have been found to accurately model complex nonlinear functions in a variety of fields and offer significant potential as an alternative numerical technique for the multivariate interpolation of climate data. However, standard backpropagation networks have been found to have a tendency to take an excessively long time to learn, to be sensitive to the selection of learning parameters, and to have a tendency to overtrain. The resilient backpropagation (Rprop) network has been found to be much more robust.
Thus, the aim of this study is to rigorously assess the ability of the Rprop neural network to produce annual mean precipitation and temperature surfaces for China that complement Gestalt climate mapping, using latitude, longitude, and elevation data. We also aim to provide a thorough assessment of the error involved in neurointerpolation with Rprop networks and an accessible guide to some of the issues involved in neurointerpolation in the geographical realm.
THE SPATIAL COMPLEXITY OF CLIMATIC PARAMETERS
Climatic parameters vary complexly in three dimensions: latitude, longitude, and topographic elevation. On the regional scale, temperature varies with latitude, exhibiting a strong north-south gradient. Temperature also varies fairly predictably with elevation through a lapse rate (Spreen 1947), but is affected by proximity to the coast which tends to have a moderating effect.
Precipitation also varies with latitude but in a more complex way than temperature because of confounding effects of continentality, air masses, and pressure belts. In a very generalized way, high precipitation occurs at equatorial latitudes and lower precipitation occurs in the outer tropics. Precipitation is also high in temperate latitudes and low again at high latitude. Distance from the coast also affects precipitation because air masses closer to the coast tend to have gained higher moisture content from the sea. Precipitation also varies in a complex manner with elevation; it generally increases with elevation up to the permanent snow line, above which there is little precipitation (Spreen 1947).
TECHNIQUES FOR CLIMATE INTERPOLATION
Significant research attention has been devoted to the accurate interpolation of climatic parameters because of the economic and environmental usefulness of such information. However, accurate spatial interpolation of climatic parameters such as temperature and precipitation is nontrivial.
There are several techniques available for the two-dimensional spatial interpolation of surfaces from point-based data including splines, Kriging, distance weighted averaging, and trend analysis. Interpolators such as those above are commonly included in GIS packages and have been applied to the interpolation of climate surfaces from point-based climate station records (Tomczak 1998; Saveliev, Mucharamova, and Piliugin 1998; Atkinson and Lloyd 1998; Holawe and Dutter 1999; Xia, Winterhalter, and Fabian 1999; Xia et al. 1999a and b; Jones and Thornton 1999; Nalder and Wein 1998; Price et al. 2000; Goodale, Aber, and Ollinger 1998). However, the accuracy of climate information interpolated using latitude and longitude alone is questionable, especially when applied to sparsely sampled, montane regions like China, given the complex distribution of climatic parameters.
The accurate interpolation of climatic parameters has often involved the use of three-dimensional information included in digital elevation models (DEM) so that the effect of factors such as temperature lapse rates and orography can be modelled (Nix 1982, 1986; Hutchinson and Bischof 1983; Hutchinson et al. 1996; Hutchinson 1998a, 1998b; Busby 1991; Running, Nemani, and Hungerford 1987; Phillips, Dolph, and Marks 1992; Boyer 1984; Ishida and Kawashima 1993; Stillman et al. 1996). Indeed, commercial computer software has been developed specifically to interpolate climatic variables with corrections for elevation. The two most prevalent techniques are ANUSPLIN (Hutchinson and Bischof 1983) and the parameter-elevation regressions on independent slopes model (PRISM; Daly, Taylor, and Gibson 1997).
ANUSPLIN uses a thin-plate smoothing spline to interpolate climate variables in three dimensions. Elevation is included either as an independent variable (Hutchinson and Bischof 1983; Hutchinson 1998b) or as an independent covariate (Hutchinson 1991) in the interpolation process. PRISM also uses point data and a digital elevation model to interpolate climatic parameters over complex terrain. PRISM uses a weighted precipitation/elevation regression function calculated from local stations with greater weight given to stations with similar location, elevation, and topographic positioning (Daly, Neilson, and Phillips 1994). Both ANUSPLIN (Busby 1991) and PRISM (Daly, Neilson, and Phillips 1994) have been found to yield plausible and statistically similar results (Stillman et al. 1996).
There is no doubt that interpolation techniques such as splines and regression have been useful for climate surface interpolation especially when elevation data is included in the interpolation (Hutchinson 1998b). However, artificial neural networks (see Fischer 1998) offer an alternative to these techniques for climate surface interpolation in three dimensions.
NEURAL NETWORKS: RPROP VERSUS BACKPROP
Artificial neural networks are a broad family of artificial intelligence techniques akin to nonlinear regression that are capable of learning complex, nonlinear structure within data (Sarle 1994). The Rprop algorithm is a type of feed-forward, backpropagation multilayer perceptron which can perform the supervised learning of relationships between dependent and independent variables.
Backpropagation neural networks generally include several layers of neurons or processing units--an input layer (or more correctly, an input array), one or more hidden layers, and an output layer. The layers are connected by a series of weighted links and each unit has an activation function which is usually sigmoidal (Figure 1). Training data are fed forward through the network iteratively. For each sample used to train the network, if the sum of the weights incident upon a unit exceeds the activation function threshold, the unit fires. An output is predicted and the error between the predicted and the actual output is then backpropagated through the network and used to update the link weights [Weitjers and Hoppenbrouwers (1995); see Tveter (1995) for a thorough description]. In this way, relationships between the input and output data are learned and contained within the trained network. Once trained, the network can be used to predict the output of new input data.
Backpropagation is the most commonly used algorithm for supervised learning and learns by trying to minimize the error function using a gradient descent technique. Gradient descent involves the update of weights according to the size of the partial derivative (slope) of the error function (Tveter 1995). However, the gradient descent process in standard backpropagation has inherent problems including sensitivity to the selection of the learning rate. With a small learning rate the network can take a long time to reach convergence. A large learning rate can result in oscillation, thereby preventing further reductions in error [Riedmiller and Braun (1993); see our Figure 2]. In addition, backpropagation networks have a tendency to overtrain. One of the most powerful and useful abilities of a neural network is its ability to generalize or, in other words, predict the values of samples independent of those used to train the network (Prechelt 1994). Overtraining involves the network's learning an increasingly complex nonlinear relationship between input and output data as training progresses, resulting in reduced ability to generalize (Lawrence, Giles, and Tsoi 1996). Several improvements have been made to the standard backpropagation network, notably resilient backpropagation.
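The learning-rate sensitivity described above can be illustrated with a minimal sketch. It is not the networks used in this study: the one-parameter error function E(w) = w**2, the starting weight, and the three learning rates are all hypothetical values chosen only to show slow convergence and oscillation.

```python
def gradient_descent(learning_rate, w0=1.0, steps=50):
    """Minimize the toy error E(w) = w**2 by plain gradient descent."""
    w = w0
    path = [w]
    for _ in range(steps):
        grad = 2.0 * w                 # dE/dw for E(w) = w**2
        w = w - learning_rate * grad   # step size scales with the slope
        path.append(w)
    return path


slow = gradient_descent(0.01)   # small rate: still far from 0 after 50 steps
fast = gradient_descent(0.4)    # moderate rate: converges quickly
stuck = gradient_descent(1.0)   # too large: alternates between +1 and -1
```

With the largest rate every step overshoots the minimum by exactly the distance it started from, so the error never decreases.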
Rprop includes a more efficient learning scheme that tries to eliminate the harmful influence of the size of the partial derivative on the weight update by adapting link weights based on the local error function. Only the sign of the derivative is considered, which indicates the direction of the weight update. The size of the update is determined by a weight-specific update value (Riedmiller and Braun 1993; Riedmiller 1994). If the sign of the derivative stays the same after an iteration, the weight update value is increased slightly to speed convergence; if the sign changes, the algorithm has jumped over a local minimum and the update value is slightly decreased [for a technical description see IPVR (1995); Figure 2].
Standard backpropagation generally employs online learning where weights are updated after each training sample. Rprop, however, employs a batch (offline) learning strategy where error is calculated and weights updated after all cases have been parsed for each training cycle. The major practical advantages of Rprop over standard backpropagation are that learning occurs much faster and that the algorithm is not sensitive to the selection of learning parameters, as these parameters are adapted during the learning process (IPVR 1995; Riedmiller and Braun 1993; Riedmiller 1994). Because of the advantages discussed above, Rprop networks have great potential for the robust spatial interpolation of climate surfaces from point data.
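The sign-based update can be sketched as follows. This is a simplified illustration under stated assumptions, not the SNNS implementation: it omits weight backtracking and weight decay, the increase and decrease factors (1.2 and 0.5) are the values usually quoted from Riedmiller and Braun (1993) rather than values given in this text, the lower bound on the update value is assumed, and the two-weight error function is hypothetical.

```python
def sign(x):
    return (x > 0) - (x < 0)


def rprop_minimize(grad_fn, weights, cycles=200, delta0=0.1, delta_max=50.0,
                   delta_min=1e-6, eta_plus=1.2, eta_minus=0.5):
    """Simplified Rprop: only the sign of each partial derivative is used."""
    w = list(weights)
    delta = [delta0] * len(w)        # one update value per weight
    prev_grad = [0.0] * len(w)
    for _ in range(cycles):
        grad = grad_fn(w)            # batch gradient for the whole cycle
        for i in range(len(w)):
            agreement = prev_grad[i] * grad[i]
            if agreement > 0:        # same sign: lengthen the step
                delta[i] = min(delta[i] * eta_plus, delta_max)
            elif agreement < 0:      # sign flipped: a minimum was jumped
                delta[i] = max(delta[i] * eta_minus, delta_min)
            w[i] -= sign(grad[i]) * delta[i]
            prev_grad[i] = grad[i]
    return w


def toy_gradient(w):
    # Hypothetical error 1000*(w0 - 3)**2 + 0.001*(w1 + 2)**2, whose two
    # partial derivatives differ in size by six orders of magnitude.
    return [2000.0 * (w[0] - 3.0), 0.002 * (w[1] + 2.0)]


w_final = rprop_minimize(toy_gradient, [0.0, 0.0])
```

Because the step length never depends on the size of the derivative, both weights approach their minima at a similar pace even though their gradients differ enormously, which is the robustness property discussed above.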
NEUROCOMPUTATION TECHNIQUES FOR CLIMATE INTERPOLATION
Radial basis functions (RBF) are the neural network architecture most commonly used in spatial interpolation. RBFs include splines and multiquadratic interpolation and are especially suited to interpolation because they are effective at modelling the local spatial variation in two dimensions using x and y spatial coordinates (Lee, Cho, and Wong 1998).
Feed-forward, backpropagation networks are global interpolators. In other words, trained networks contain a single, nonlinear function that describes the morphology of the surface. Backpropagation networks are also approximate interpolators--output values at the data points are not the same as the original values, they are predicted using the global function contained in the network. In effect, backpropagation networks can be thought of as a type of nonlinear regression (Sarle 2000). A major advantage of backpropagation networks is that they can not only include the x and y coordinates but also include many other influential parameters as independent variables to add dimensionality to the interpolation.
Very few published studies have used feed-forward neural networks for spatial interpolation. Kanevsky et al. (1995, 1997a and b) used neural networks to interpolate the spatial distribution of fallout from the Chernobyl nuclear disaster. Demyanov et al. (1998) extended this work and interpolated climatic data in two dimensions using a combination of a backpropagation neural network and residual kriging. In a related study, Snell, Gopal, and Kaufman (2000) used a backpropagation neural network to derive air temperature data by downscaling global circulation models. In this study we use an Rprop neural network to interpolate annual mean precipitation and temperature surfaces for China in three dimensions using latitude, longitude, and elevation data.
METHODS AND RESULTS
The study area covers the entirety of mainland China as defined in the Digital Chart of the World database and includes the island of Taiwan. In this study, neural networks were used to predict the dependent variables of precipitation and temperature from the independent variables--latitude, longitude, and elevation. This section describes the data used and the interpolation of temperature and precipitation using the Rprop neural network architecture.
The point-based precipitation and temperature data included a total of 288 climate stations across China (Vose et al. 1992; Tao et al. 1997; Central Weather Bureau 1999), most with more than thirty years recording history (Figure 3). The climate database was assembled from three sources including 205 stations from Vose et al. (1992), 62 stations from Tao et al. (1997), and 21 stations across Taiwan from the Central Weather Bureau (1999). Each of the 288 climate stations had a latitude and longitude reference. This spatial reference was used to import the climate data into ArcView GIS. Climate stations were not evenly distributed across the study area. Eastern Taiwan has tightly clustered climate stations with steep climatic gradients contrasting the sparse coverage across the Tibetan Plateau. Eastern China has a relatively dense and fairly even spread of climate stations.
China has a topographically diverse landscape which affects the distribution of precipitation and temperature through processes described above. Hence, a digital elevation model (DEM) of the topography of China comprised a fundamental component of climatic interpolation. The thirty-arc-second resolution United States Geological Survey (USGS) DEM was used in this study (Figure 3). The DEM was converted to ESRI's grid format, resampled to a 5-kilometer grid cell resolution (796 rows X 977 columns) and projected into Lambert's Conformal Conic projection. Values for elevation for the 288 climate stations were derived from this DEM.
Neural Network Interpolation
The network model used throughout is the Rprop feed-forward, backpropagation neural network as implemented in SNNS 4.1 and described in detail by IPVR (1995). Network weights were initialized to random values between -1 and 1 before training and weights were updated in topological order (IPVR 1995). A logistic sigmoidal activation function [equation (1)] and an identity output function [equation (2)] were used for all hidden and output units in all networks. Data samples were shuffled randomly for input into the networks before each learning cycle. Preliminary ad hoc experimentation supported the established findings of the insensitivity of the Rprop network to variation in initial learning parameters (Riedmiller and Braun 1993; Riedmiller 1994; IPVR 1995). The default learning parameters were found to return good results and, hence, were used in all neurointerpolations:
* initial weight update value $\Delta_0 = 0.1$;
* limit for the maximum step size $\Delta_{\max} = 50$;
* weight decay exponent $\alpha = 4$.
$$a_j(t) = \frac{1}{1 + e^{-\left(\sum_i w_{ij}\, o_i(t) - \theta_j\right)}} \qquad (1)$$
where
$a_j(t)$ = activation of unit $j$ in step $t$;
$o_i(t)$ = output of unit $i$ in step $t$;
$j$ = index for some unit in the network;
$i$ = index of a predecessor of the unit $j$;
$w_{ij}$ = weight of the link from unit $i$ to unit $j$;
$\theta_j$ = threshold (bias) of unit $j$; and
$$o_j(t) = a_j(t) \qquad (2)$$
where $o_j(t)$ = output of unit $j$ in step $t$.
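As an illustration of equations (1) and (2), the following minimal sketch feeds one sample forward through a fully connected network with logistic activations and identity output functions. The layer sizes follow the 3_10_10_1 structure used later in the text and the weights are drawn from [-1, 1] as described above; drawing the thresholds from the same range, the random seed, and the input values are assumptions made only for this example.

```python
import math
import random


def logistic(net_input, theta):
    """Equation (1): logistic activation of a unit with threshold theta."""
    return 1.0 / (1.0 + math.exp(-(net_input - theta)))


def init_layer(n_in, n_out, rng):
    """Weights and thresholds drawn uniformly from [-1, 1] (thresholds assumed)."""
    weights = [[rng.uniform(-1.0, 1.0) for _ in range(n_in)]
               for _ in range(n_out)]
    thetas = [rng.uniform(-1.0, 1.0) for _ in range(n_out)]
    return weights, thetas


def forward(layers, inputs):
    """Feed one sample forward; equation (2) makes each output its activation."""
    outputs = list(inputs)
    for weights, thetas in layers:
        outputs = [logistic(sum(w * o for w, o in zip(row, outputs)), theta)
                   for row, theta in zip(weights, thetas)]
    return outputs


rng = random.Random(0)                      # arbitrary seed
sizes = [3, 10, 10, 1]                      # the 3_10_10_1 structure
layers = [init_layer(a, b, rng) for a, b in zip(sizes, sizes[1:])]
prediction = forward(layers, [0.5, 0.5, 0.2])   # hypothetical rescaled x, y, z
```

The single output always lies strictly between 0 and 1, which is why the climate values must be rescaled to that interval before training.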
The neurointerpolations of annual mean temperature and precipitation surfaces for China were performed using the following steps:
1. Preprocess input data for input into the network;
2. Conduct initial trials to assess the number of training cycles required, sensitivity to learning parameters, and the tendency to overtrain using cross-validation and stopped learning;
3. Find the network structure that gives the best overall generalization performance using k-fold cross-validation;
4. Assess the sensitivity of the selected network structure to variation in both the initial weights and the input data, and assess the overall generalization performance of the network using k-fold cross-validation;
5. Produce final neurointerpolations of temperature and precipitation using all 288 climate station data points and postprocess the neural network output for input back into the GIS for visualization and assessment of the spatial distribution of temperature, precipitation, and the residuals.
1. Data Preprocessing. In this study the Stuttgart Neural Network Simulator 4.1 (SNNS; IPVR 1995) is used for neural network modelling. Significant data manipulation was required to get the data from GIS format into SNNS and then get the results back into the GIS for presentation. Climate station data were exported from the GIS in ASCII format where each row has information about latitude, longitude, elevation (from the DEM), and either annual mean precipitation or temperature values. For input into the neural network both the input and output data were rescaled to values between 0 and 1 with equation (3), implemented in a purpose-built program called Data Converter written in Microsoft Visual Basic.
$$x'_{ij} = \frac{\left(x_{ij} - \min(x_j)\right)\left(\max(x'_j) - \min(x'_j)\right)}{\max(x_j) - \min(x_j)} + \min(x'_j) \qquad (3)$$
where
$x_{ij}$ = data value to be rescaled in row $i$, column $j$ of the ASCII GIS database;
$\min(x_j)$ = minimum data value in column $j$;
$\max(x_j)$ = maximum data value in column $j$;
$\min(x'_j)$ = new minimum data value for column $j$;
$\max(x'_j)$ = new maximum data value for column $j$;
$x'_{ij}$ = rescaled data value in row $i$, column $j$ such that $0 \le x'_{ij} \le 1$.
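A minimal sketch of the rescaling in equation (3) and of its inverse, which is needed to turn network output back into climate units, is given below. It is written in Python rather than the Visual Basic of the Data Converter program, the function names are hypothetical, and the elevation values are invented for illustration.

```python
def rescale(values, new_min=0.0, new_max=1.0):
    """Equation (3): map one data column linearly onto [new_min, new_max]."""
    old_min, old_max = min(values), max(values)
    span = old_max - old_min
    if span == 0:
        raise ValueError("column is constant and cannot be rescaled")
    return [(v - old_min) * (new_max - new_min) / span + new_min
            for v in values]


def unscale(scaled, old_min, old_max, new_min=0.0, new_max=1.0):
    """Inverse of equation (3): recover values in the original units."""
    return [(s - new_min) * (old_max - old_min) / (new_max - new_min) + old_min
            for s in scaled]


elevation_m = [0.0, 1500.0, 3000.0, 4500.0]     # hypothetical station elevations
scaled = rescale(elevation_m)
restored = unscale(scaled, min(elevation_m), max(elevation_m))
```

The original column minimum and maximum must be stored alongside the trained network, because the inverse mapping cannot be recovered from the rescaled values alone.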
2. Initial Trials. Initial trials were conducted to determine the approximate number of training cycles for the Rprop network to learn the relationships in the climate data, and to assess the tendency to overtrain. The Rprop neural network topology used in the initial trials was a fully connected four-layer network with one three-unit input layer, two ten-unit hidden layers and one one-unit output layer.
A cross-validation technique was used with stopped learning in the initial trials to assess the learning and overtraining characteristics of the Rprop network. The technique of stopped learning is a way of achieving the best possible generalization from the network and involves training the neural network using a training data set whilst checking the generalization error of the network using a separate validation data set. Network learning is stopped when the generalization error is at its lowest and the overall unbiased generalization error is checked using yet another separate test data set.
The cross-validation in the initial trials involved training networks for temperature and precipitation using five sets of training-validation-test data sets. Each training data set had 200 samples and each validation and test data set had 44 samples selected randomly from the 288-sample climate data set without replacement. The temperature and precipitation networks were trained once with each of the five training data sets using 1,000 cycles and the mean squared error (MSE, or the sum of the squared differences between the actual and predicted climate values, divided by the number of training samples) calculated for the respective validation data set after each training cycle to assess overtraining. MSE was also calculated for the test data set after training to assess overall generalization error.
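The data handling behind stopped learning can be sketched as below, assuming a simple random shuffle for the split. The set sizes (200, 44, and 44 of 288) and the MSE definition come from the text; the function names, the seed, and the validation error curve are hypothetical.

```python
import random


def split_indices(n=288, n_train=200, n_val=44, seed=0):
    """Randomly split n sample indices into training, validation, and test sets."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)            # sampling without replacement
    return (idx[:n_train],
            idx[n_train:n_train + n_val],
            idx[n_train + n_val:])


def mse(actual, predicted):
    """Sum of squared differences divided by the number of samples."""
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)


def best_stopping_cycle(validation_errors):
    """1-based training cycle at which the validation MSE is lowest."""
    best = min(range(len(validation_errors)), key=validation_errors.__getitem__)
    return best + 1


train_idx, val_idx, test_idx = split_indices()
# Hypothetical validation MSE per cycle: it falls, then rises as overtraining begins.
stop_at = best_stopping_cycle([0.50, 0.30, 0.20, 0.25, 0.40])
```

If the validation error never rises again after reaching its minimum, as reported for the Rprop trials, the stopping cycle simply marks convergence.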
The Rprop networks were found to learn very fast. All of the temperature and precipitation networks reached convergence in less than 300 cycles and most in less than 100 and none of the networks experienced overtraining in the initial trials. These results agree with other tests of Rprop performance (Riedmiller and Braun 1993) and were very important in guiding further network training in this study. Thus, in subsequent network training it was not necessary to use stopped learning with training-validation-test data sets as overtraining did not occur. All subsequent networks were trained using training data sets and stopped after 300 learning cycles. Validation data sets were not required. Network generalization was then assessed by testing the unbiased, out-of-sample error of an independent test data set.
3. Network Structure. To find the best neural network structure for climatic interpolation of China, k-fold cross-validation was used (see Lawrence, Giles, and Tsoi 1996). In k-fold cross-validation a data set of n cases is split into k pairs of training and test data sets and the neural network is trained k times. The test data are withheld from training and are used to assess the abilities of the neural network to generalize. Each training data set consists of (n - n/k) samples and each test data set consists of n/k samples. Each sample is used to train the network (k - 1) times but only once to test the network error (Sarle 2000; see also Baxt and White 1995; Tibshirani 1996). In this study twelve-fold cross-validation was used. This involves splitting the 288-sample climate station data set into twelve pairs of training and test data sets and training the network twelve times. Each training data set consists of 264 samples and each test data set 24 samples.
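The twelve-fold split can be sketched as follows, assuming that n is exactly divisible by k (as 288 is by 12) and that samples are assigned to folds after a random shuffle. The function name and seed are hypothetical.

```python
import random


def k_fold_pairs(n, k, seed=0):
    """Return k (training, test) index pairs; assumes n is divisible by k."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    fold_size = n // k
    pairs = []
    for fold in range(k):
        test = idx[fold * fold_size:(fold + 1) * fold_size]
        held_out = set(test)
        train = [i for i in idx if i not in held_out]   # n - n/k samples
        pairs.append((train, test))
    return pairs


pairs = k_fold_pairs(288, 12)
```

Every station appears in exactly one test set and in eleven training sets, so each test error is an out-of-sample error.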
A total of seventeen network structures were assessed for their ability to successfully generalize climate data using latitude, longitude, and elevation. Each network has three input units (x, y, z) and one output unit (either temperature or precipitation). The networks trialled include single hidden-layer networks with hidden layers ranging from six to sixteen units and networks with two hidden layers each ranging between six and sixteen units (Table 1).
Each of the seventeen network structures was then trained twelve times using each of the twelve pairs of training and test data sets. Each network structure was initialized with randomized weights with the same weights used for each of the twelve trainings. The MSE was calculated for the test data sets. Because it is always a positive number close to zero, the frequency distribution of the test data set MSE displayed a significant positive skew. Therefore, the median MSE (MMSE) and range of the MSE (RMSE) were calculated for each of the seventeen network structures over the twelve test data sets to give a better representation of the central tendency and variability of MSE.
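A short sketch shows why the median is preferred for positively skewed errors. The twelve MSE values are invented for illustration and are not results from this study. Note that RMSE here denotes the range of the MSE, as defined above, not a root mean squared error.

```python
import statistics

# Hypothetical test-set MSE for the twelve folds; one fold is much worse.
fold_mse = [0.004, 0.005, 0.005, 0.006, 0.006, 0.007,
            0.007, 0.008, 0.009, 0.010, 0.015, 0.040]

mmse = statistics.median(fold_mse)          # median MSE (MMSE)
rmse = max(fold_mse) - min(fold_mse)        # range of the MSE (RMSE)
mean_mse = statistics.mean(fold_mse)        # pulled upward by the worst fold
```

In this example the single poor fold raises the mean well above the median, while the range records how far that fold departs from the best one.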
For precipitation, the MMSE was substantially lower when two hidden layers were used (Figure 4); however, the RMSE was similar for all network structures. Thus, whilst the ability of the networks to generalize precipitation increased with two hidden layers, the variability in error did not improve. For temperature, the MMSE and RMSE both decreased significantly with two hidden layers. Thus, the Rprop neural networks were able to more accurately and consistently generalize temperature with two hidden layers. The number of units in each layer tended not to affect generalization ability. The 3_10_10_1 network structure consistently provided good results and was selected for further use in climate interpolation.
4. Network Robustness and Generalization. The selected 3_10_10_1 Rprop neural network structure was then tested for its sensitivity to the selection of initial weights and training data, and its ability to generalize annual mean precipitation and temperature. A 3_10_10_1 network was trained twelve times using the same random initial weights and the generalization error (MSE of test data sets) calculated in a process of twelve-fold cross-validation for both precipitation and temperature. This was repeated ten times each for precipitation and temperature, each time using different initial random weights.
To test the sensitivity of the networks to the initial weights, differences in MSE between the ten different networks trained and tested using the same data sets but initialized using different random initial weights were assessed by analysis of variance (ANOVA) as the values were normally distributed. To test the sensitivity of the networks to input data, differences in MSE between the ten different networks trained and tested on different data sets and initialized using the same random initial weights were assessed using the Kruskal-Wallis rank sum test (nonparametric equivalent of one-way ANOVA) due to positively skewed distributions.
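For readers unfamiliar with the rank sum test, the Kruskal-Wallis H statistic can be computed as in the sketch below. This is an illustration only: it averages tied ranks but applies no tie correction to H, it does not compute a P value, and the three groups are invented data rather than MSE values from this study.

```python
def average_ranks(values):
    """1-based ranks of values, with tied values sharing their mean rank."""
    order = sorted(range(len(values)), key=values.__getitem__)
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        while (end + 1 < len(order)
               and values[order[end + 1]] == values[order[pos]]):
            end += 1
        shared = (pos + end) / 2.0 + 1.0    # mean of ranks pos+1 .. end+1
        for j in range(pos, end + 1):
            ranks[order[j]] = shared
        pos = end + 1
    return ranks


def kruskal_wallis_h(groups):
    """H = 12 / (N (N + 1)) * sum(R_g**2 / n_g) - 3 (N + 1), without tie correction."""
    pooled = [v for group in groups for v in group]
    ranks = average_ranks(pooled)
    n_total = len(pooled)
    total, start = 0.0, 0
    for group in groups:
        rank_sum = sum(ranks[start:start + len(group)])
        total += rank_sum ** 2 / len(group)
        start += len(group)
    return 12.0 / (n_total * (n_total + 1)) * total - 3.0 * (n_total + 1)


h_stat = kruskal_wallis_h([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
```

H is compared with a chi-square distribution with (number of groups - 1) degrees of freedom. For three groups the 0.05 critical value is 5.991, so the fully separated groups in this example would be judged significantly different.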
ANOVA revealed that using different random initial weights did not cause significant differences in the MSE for either annual mean precipitation or temperature (P < 0.05). However, the Kruskal-Wallis test revealed a significant difference in MSE when different training and test data sets were used for both precipitation and temperature (P < 0.05).
The assumption of normality was satisfied for the distributions of the MMSE and RMSE values from the twelve-fold cross-validation over the ten repeats. Hence, the mean, standard deviation and 95 percent confidence limits for the population mean were calculated to provide a measure of the overall ability to generalize precipitation and temperature, and the variability of generalization. The square root of these was also calculated and linearly rescaled to reflect real climate units using the minimum and maximum values of the original precipitation and temperature data sets in equation (3) (Table 2).
5. Final Climate Interpolations and Post-Processing Neural Network Output. With the generalization error made explicit, 3_10_10_1 Rprop neural networks were trained to predict climate data from all 288 climate station data points with the parameters described above. The trained networks were then used to predict annual mean precipitation (Figure 5) and temperature (Figure 6) values from the latitude, longitude, and elevation information for each 5km X 5km grid cell from the DEM of China. The Visual Basic Data Converter was programmed to rescale the output of the neural network back to real climate values, using equation (3) and the original climate data ranges, and to format the output for input into the ArcView GIS. Residuals were calculated within the GIS as the actual climate value for each climate station minus the climate value predicted by the final neurointerpolations. The spatial distributions of residuals for precipitation and temperature are mapped (Figures 7 and 8, respectively).
Rprop neural networks were used successfully to interpolate annual mean precipitation and temperature over the topographically diverse landscape of China. However, the networks were able to predict temperature much more accurately than precipitation. These issues are discussed below.
Precipitation. The neurointerpolated precipitation surface attempts to model the complex spatial distribution of precipitation for China. Significant nonlinear features have been captured by the network. The precipitation surface displays a strong NW-SE trend reflecting the broad influences of continentality and latitude on precipitation. The influence of elevation on the distribution of precipitation, as predicted by the neural network, is secondary to the major NW-SE trend (Figure 5).
On a broad scale the neurointerpolated precipitation surface is plausible and follows generally the patterns in the climate station data. The southeast of China, especially the island of Taiwan, displays higher annual mean precipitation with some topographic variation. This grades through reasonably uniformly to the montane Tibetan plateau and the cold deserts of the northwest. The network does, however, capture an extraordinary amount of topographically defined variation in precipitation over Taiwan which is present in the input climate station data (Figure 5). The input precipitation data for Taiwan includes some very steep rainfall gradients especially in the north of the island. However, the aim of this study was not to detect broad scale patterns but to improve existing maps by capturing increased spatial and topographically defined detail in the distribution of climatic parameters.
Quantification of the generalization error for precipitation (mean MMSE = 218.6 mm/yr) suggests that the accuracy of the neurointerpolated precipitation surface is only moderate for areas that are not sampled by a climate station and that it is fairly variable (mean RMSE = 551 mm/yr). Visualization of the spatial distribution of residuals suggests that errors in interpolation are spatially autocorrelated and, hence, systematic rather than random. The largest residuals occur over Taiwan and extend over to the mainland from the island. It is suspected that the steep precipitation gradients in this area contributed to the large residuals and had an appreciable and deleterious effect on the neurointerpolation of precipitation. Precipitation was underestimated around the center of China (around the 30N parallel) and in the northeast (Figure 7). In the area between 30N-45N, precipitation is overestimated for many climate stations; however, several underestimated stations also occur in this area.
Thus, whilst the neurointerpolator detected broad-scale precipitation trends, much of the fine-scale topographically defined detail was missed. Comparison with published precipitation maps for China (Editorial Committee of the Physical Geography of China 1995; Arakawa 1969) reveals that significant regional precipitation features are not present in the neurointerpolations. These omitted features tend to coincide with areas of larger residuals. Bryan and Adams (2001) provide a detailed comparison of the neurointerpolations with published climate maps.
Temperature. The neurointerpolated annual mean temperature surface includes complex nonlinear spatial variation in temperature over China (Figure 6). This captures the major latitudinal, topographic, and continental trends. In addition, some very fine spatial, topographically defined detail is also evident. In particular, the detail present for Taiwan and southern China is remarkable, as temperature increases closely follow the steep elevation gradient in these areas. The complex interactions of elevation and latitude upon temperature are well represented in the major river valleys in the southeastern Tibetan plateau. In addition, to the northwest the temperatures of the Tarim Basin are also accurately represented in the neurointerpolation and exhibit much finer spatial detail compared to published maps. Bryan and Adams (2001) provide a more detailed comparison of the neurointerpolated temperature surface with published maps.
The ability of the neural network to generalize temperature is much greater than that for precipitation. Assessment of the mean MMSE (1.12 °C) suggests that the Rprop neurointerpolator can accurately predict annual mean temperature for areas that are not sampled by climate stations. This generalization ability is also fairly consistent (mean RMSE = 1.68 °C). Although the temperature residuals are much smaller than those for precipitation, they also appear to be spatially autocorrelated (Figure 8), suggesting systematic errors in interpolation. Temperature values for the coastal areas of Taiwan, central China, and the area around 40N, 120E were underestimated. Conversely, temperature values in the area between 25-30N and 110-115E were slightly overestimated. In other areas temperature residuals appear to be fairly randomly distributed.
The Rprop neural network was able to accurately predict the complex spatial distribution of annual mean temperature over the topographically complex landscape of China. The neurointerpolated surface also displays good agreement with published maps of annual mean temperature (see Editorial Committee of the Physical Geography of China 1995; Arakawa 1969; Bryan and Adams 2001).
Success of Rprop Climate Neurointerpolation. The Rprop neural network architecture performed significantly better at interpolating temperature than precipitation. We suggest two reasons for this. First, the independent variables (latitude, longitude, and elevation) may have a more direct effect on temperature variation than on precipitation. Temperature is known to vary with elevation fairly consistently via a lapse rate, whereas precipitation varies in a complex way with elevation, as discussed above. Second, we suspect that there is a significant effect of unparameterized variables such as aspect, orography, and rainshadows on the distribution of precipitation. Clearly, the ability of any model to capture the relationship between dependent and independent variables depends upon the predictive ability of the independent variables. It is suspected that improved ability to interpolate precipitation may result from the inclusion of some of these unparameterized variables in a higher-dimensional interpolation. Alternatively, precipitation may be better modelled using local interpolators such as a spline or RBF network.
Rprop as a Spatial Interpolator
Training. The Rprop neural network architecture has potential for the spatial interpolation of complex phenomena. A major advantage of the Rprop network is that it learns very fast. The Rprop network was also resilient against overtraining. This characteristic greatly enhances the usability of the network architecture because the user does not have to employ early stopping to cease training at the optimum number of cycles before the ability of the network to generalize decreases. Resilience against overtraining enables the user to use a constant number of learning cycles with confidence and, together with fast learning, enables the user to perform numerous training runs and to conduct extensive error testing such as k-fold cross-validation.
The lack of sensitivity to user-specified learning parameters is another major advantage. This reduces the potential for neural network training to turn into a tedious search in parameter space for suitable combinations. Choice of the three learning parameters necessary to run Rprop in SNNS is not critical. The initial update value is adapted as learning proceeds, the maximum weight step is an arbitrary maximum, and the weight decay exponent is also fairly flexible (IPVR 1995). These characteristics reduce greatly the number of variables involved in Rprop neural network training and free up the user to focus on rigorous evaluation of error.
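The way the update value adapts during learning can be illustrated with a minimal sketch of a sign-based Rprop step. This is not the SNNS implementation used in the study: it is a simplified variant without weight backtracking (often called iRprop-), it omits weight decay, and the function names, the toy error function, and the default values (increase factor 1.2, decrease factor 0.5, initial update value 0.1, maximum step 50) are assumptions chosen for illustration.

```python
def rprop_step(weights, grads, prev_grads, steps,
               eta_plus=1.2, eta_minus=0.5,
               step_min=1e-6, step_max=50.0):
    """Apply one Rprop update in place (simplified iRprop- style sketch).

    Only the sign of each gradient is used; each weight has its own step size.
    """
    for i, g in enumerate(grads):
        product = g * prev_grads[i]
        if product > 0:
            # Same sign as last epoch: accelerate, capped at the maximum step.
            steps[i] = min(steps[i] * eta_plus, step_max)
        elif product < 0:
            # Sign change: the last step jumped over a minimum, so shrink it
            # and skip the update for this weight in this epoch.
            steps[i] = max(steps[i] * eta_minus, step_min)
            g = 0.0
        if g > 0:
            weights[i] -= steps[i]
        elif g < 0:
            weights[i] += steps[i]
        prev_grads[i] = g
    return weights


def minimise_quadratic(n_epochs=200, initial_step=0.1):
    """Toy example (hypothetical): minimise E(w) = sum((w - target)^2)."""
    targets = [3.0, -2.0]
    weights = [0.0, 0.0]
    prev_grads = [0.0, 0.0]
    steps = [initial_step, initial_step]
    for _ in range(n_epochs):
        grads = [2.0 * (w - t) for w, t in zip(weights, targets)]
        rprop_step(weights, grads, prev_grads, steps)
    return weights
```

Because each step size is rescaled every epoch according to the gradient sign, the initial update value mainly affects the first few epochs, which is consistent with the low sensitivity to learning parameters described above.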
Network Structure, Robustness, and Error. The best network structure for climate interpolation was assessed using the MSE for seventeen different network structures. The appreciable improvement in error achieved by including a second hidden layer (Figure 4) is probably a result of the increased capacity of the network to model nonlinearity. Little further improvement was made by increasing the number of units in the hidden layers. A rule of thumb for choosing network structure may be simply to use one hidden layer for modelling features with less complex spatial variation and two hidden layers for features with more complex variation such as climate.
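As a small illustration of how the structure names in Table 1 relate to network size, the sketch below parses a name such as 3_10_10_1 and counts units and connections. The total unit count reproduces the "Total # Units" column of Table 1; the connection and bias counts assume fully connected adjacent layers with one bias per non-input unit, which is an assumption and is not stated in the text. The function name is hypothetical.

```python
def describe_structure(name):
    """Summarise a network structure name such as '3_10_10_1'."""
    layers = [int(n) for n in name.split("_")]
    hidden = layers[1:-1]
    return {
        "hidden_layers": len(hidden),
        "units_per_hidden_layer": hidden,
        "total_units": sum(layers),
        # Assumption: every unit connects to every unit in the next layer.
        "connections": sum(a * b for a, b in zip(layers, layers[1:])),
        # Assumption: one bias term per hidden and output unit.
        "biases": sum(layers[1:]),
    }


if __name__ == "__main__":
    for name in ("3_16_1", "3_8_8_1", "3_10_10_1"):
        print(name, describe_structure(name))
```

Under these assumptions, 3_16_1 and 3_8_8_1 both have 20 units, but the two-layer network has more connections (96 versus 64), which is one way to picture the added capacity of a second hidden layer.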
Analysis of the MSE using ten repeats of twelve-fold cross-validation revealed that the Rprop neural network is robust to variation in random initial weights. This characteristic also reduces the number of parameters to control when using the Rprop neural network in spatial interpolation. The Rprop network was, however, sensitive to the selection of samples in the training and test data sets for both precipitation and temperature. This means that certain combinations of samples used in training provide better generalization ability than others. The effects of this on interpolation are unclear and should be the topic of future research. The process of k-fold cross-validation used in this study provides a useful and rigorous way to quantify and make explicit the error involved in neurointerpolation.
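The repeated cross-validation scheme can be sketched as follows. With 288 stations and twelve folds, each test fold holds 24 stations and each training set 264. The summary function follows the Table 2 notes, which describe MMSE and RMSE as the median and the range of the mean squared error; computing these per repeat and then averaging over repeats is an assumption about the procedure, and all function names, the seed, and the shuffling method are hypothetical.

```python
import random
import statistics


def repeated_kfold(n_samples, k, repeats, seed=0):
    """Yield (repeat, fold, train_indices, test_indices) for repeated k-fold CV."""
    rng = random.Random(seed)
    for r in range(repeats):
        indices = list(range(n_samples))
        rng.shuffle(indices)  # a fresh random partition for every repeat
        folds = [indices[i::k] for i in range(k)]
        for f, test in enumerate(folds):
            held_out = set(test)
            train = [i for i in indices if i not in held_out]
            yield r, f, train, test


def summarise(fold_mse_by_repeat):
    """Summarise per-fold MSE values, given as one list of fold MSEs per repeat."""
    medians = [statistics.median(m) for m in fold_mse_by_repeat]
    ranges = [max(m) - min(m) for m in fold_mse_by_repeat]
    return {
        "mean_median_mse": statistics.mean(medians),
        "sd_median_mse": statistics.stdev(medians),
        "mean_range_mse": statistics.mean(ranges),
        "sd_range_mse": statistics.stdev(ranges),
    }
```

In use, a network would be trained on each `train` index set and its MSE recorded on the matching `test` set, giving twelve MSE values per repeat to pass to `summarise`.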
Climate variables such as annual mean precipitation and temperature often exhibit complex spatial distribution over large and topographically complex areas such as China. Several different techniques have been used to try to reliably interpolate this distribution. This study assessed the potential of the Rprop neural network architecture as a spatial interpolator by applying it to the difficult problem of climate surface interpolation.
The Rprop neurointerpolator very successfully modelled the complex, nonlinear relationship between latitude, longitude, elevation, and annual mean temperature for climate station data points in China. However, the neurointerpolator was not able to successfully model or predict the distribution of annual mean precipitation across China to the same degree of accuracy. Rather, many subregional precipitation features were omitted and only broad latitudinally and continentally defined trends were captured. It is suggested that this is because of the influence of unparameterized topographic and atmospheric variables on precipitation patterns.
Resilient propagation was found to have potential as a robust spatial interpolator. In agreement with previous work and as its name suggests, Rprop was found to be resilient to the selection of learning parameters and initial weights, and learned relationships in data very fast without overtraining. However, the Rprop network was found to be sensitive to the selection of training and validation data, the effects of which on interpolation are unclear.
[FIGURE 1 OMITTED]
[FIGURE 2 OMITTED]
[FIGURE 3 OMITTED]
[FIGURE 4 OMITTED]
[FIGURE 5 OMITTED]
[FIGURE 6 OMITTED]
[FIGURE 7 OMITTED]
[FIGURE 8 OMITTED]
TABLE 1
The Structure of the Seventeen Neural Networks Assessed for Their Ability to Generalize Climate Data for China

Network Structure   # Hidden Layers   # Units in Each Hidden Layer   Total # Units
3_6_1               1                 6                              10
3_8_1               1                 8                              12
3_10_1              1                 10                             14
3_12_1              1                 12                             16
3_14_1              1                 14                             18
3_16_1              1                 16                             20
3_6_6_1             2                 6                              16
3_7_7_1             2                 7                              18
3_8_8_1             2                 8                              20
3_9_9_1             2                 9                              22
3_10_10_1           2                 10                             24
3_11_11_1           2                 11                             26
3_12_12_1           2                 12                             28
3_13_13_1           2                 13                             30
3_14_14_1           2                 14                             32
3_15_15_1           2                 15                             34
3_16_16_1           2                 16                             36

NOTES: Each network has a single input layer with three units (latitude, longitude, elevation) and a single output layer with one unit (precipitation/temperature).

TABLE 2
Generalization Error Summary for the 3_10_10_1 Rprop Neural Network Used to Interpolate Annual Mean Precipitation and Temperature

Climate Variable   Mean MMSE   SD MMSE   LCL MMSE   UCL MMSE   Mean RMSE   SD RMSE   LCL RMSE   UCL RMSE
Prec.              0.00198     0.00020   0.00185    0.00210    0.01258     0.00079   0.01209    0.01307
Temp.              0.00142     0.00019   0.00131    0.00154    0.00317     0.00097   0.00257    0.00377
Prec. (mm/yr)      218.6       69.5      211.3      225.1      551.0       138.1     540.2      561.6
Temp. (°C)         1.124       0.411     1.080      1.171      1.680       0.929     1.513      1.832

NOTES: The top two rows present the mean, standard deviation, and lower and upper 95 percent confidence limits of both the median and the range of mean squared error for both precipitation and temperature. The bottom two rows are the descriptive statistics in the top two rows, square rooted and rescaled to real precipitation and temperature values using the original minimum and maximum climate values in equation (3).
Geographical Analysis, Vol. 34, No. 2 (April 2002) The Ohio State University Submitted: 1/4/01. Revised version accepted: 9/24/01
Arakawa, H. (1969). Climates of Northern and Eastern Asia. Amsterdam: Elsevier.
Atkinson, P. M., and C. D. Lloyd (1998). "Mapping Precipitation in Switzerland with Ordinary and Indicator Kriging." Journal of Geographic Information and Decision Analysis 2(2), 72-86.
Baxt, W. G., and H. White (1995). "Bootstrapping Confidence Intervals for Clinical Input Variable Effects in a Network Trained to Identify the Presence of Acute Myocardial Infarction." Neural Computation 7, 624-38.
Boyer, D. G. (1984). "Estimation of Daily Temperature Means Using Elevation and Latitude in Mountainous Terrain." Water Resources Bulletin 4, 583-88.
Bryan, B. A., and J. M. Adams (2001). "Quantitative and Qualitative Assessment of the Accuracy of Neurointerpolated Annual Mean Precipitation and Temperature Surfaces for China." Cartography 30(2).
Burrough, P., and Rachael McDonnell (1998). Principles of Geographical Information Systems. Oxford University Press.
Busby, J. R. (1991). "BIOCLIM--A Bioclimate Analysis and Prediction System." In Nature Conservation: Cost Effective Biological Surveys and Data Analysis, edited by C. R. Margules and M. P. Austin, pp. 64-68. CSIRO: Australia.
Central Weather Bureau (1999). "Climatological Data Annual Report 1999, Weather Station of Central Weather Bureau, Taiwan." A report by the Central Weather Bureau of Taiwan.
Daly, C., R. P. Neilson, and D. L. Phillips (1994). "A Statistical-Topographic Model for Mapping Climatological Precipitation over Mountainous Terrain." Journal of Applied Meteorology 33, 140-58.
Daly, C., G. Taylor, and W. Gibson (1997). "The PRISM Approach to Mapping Precipitation and Temperature." In 10th Conference on Applied Climatology, Reno, NV. American Meteorological Society, pp.10-12.
Demyanov, V., M. Kanevski, S. Chernov, E. Saveliev, and V. Timonin (1998). "Neural Network Residual Kriging Application for Climatic Data." Journal of Geographic Information and Decision Analysis 2(2), 234-52.
Editorial Committee of the Physical Geography of China (1995). "Annual Mean Precipitation and Temperature Maps for China."
Fischer, M. M. (1998). "Computational Neural Networks: A New Paradigm For Spatial Analysis." Environment and Planning A 30(10), 1873-91.
Genton, M. G., and R. Furrer (1998). "Analysis of Rainfall Data by Simple Good Sense: Is Spatial Statistics Worth the Trouble?" Journal of Geographic Information and Decision Analysis 2(2), 234-52.
Goodale, C. L., J. D. Aber, and S. V. Ollinger (1998). "Mapping Monthly Precipitation, Temperature, and Solar Radiation for Ireland with Polynomial Regression and a Digital Elevation Model." Climate Research 10(1), 35-49.
Holawe, F., and R. Dutter (1999). "Geostatistical Study of Precipitation Series in Austria: Time and Space." Journal of Hydrology 219(1-2), 70-82.
Hutchinson, M. F. (1991). "The Application of Thin Plate Smoothing Splines to Continent-wide Data Assimilation." In BMRC Research Report No.27, Data Assimilation Systems, edited by J. D. Jasper, pp. 104-13. Melbourne: Bureau of Meteorology.
_____ (1998a). "Interpolation of Rainfall Data with Thin Plate Smoothing Splines: I. Two Dimensional Smoothing of Data with Short-Range Correlation." Journal of Geographic Information and Decision Analysis 2(2), 152-67.
_____ (1998b). "Interpolation of Rainfall Data with Thin Plate Smoothing Splines: II. Analysis of Topographic Dependence." Journal of Geographic Information and Decision Analysis 2(2), 168-85.
Hutchinson, M. F., and R. J. Bischof (1983). "A New Method for Estimating the Spatial Distribution of Mean Seasonal and Annual Rainfall Applied to the Hunter Valley, New South Wales." Australian Meteorological Magazine 31(3), 179-84.
Hutchinson, M. F., H. A. Nix, J. P. McMahon, and K. D. Ord (1996). "The Development of a Topographic and Climate Database for Africa." In Proceedings of the Third International Conference/Workshop on Integrating GIS and Environmental Modeling. Santa Barbara, Calif.: NCGIA.
IPVR (1995). "SNNS: Stuttgart Neural Network Simulator, User Manual Version 4.1." Institute for Parallel and High Performance Systems, University of Stuttgart. http://www-ra.informatik.unituebingen.de/SNNS/
Ishida, T., and S. Kawashima (1993). "Use of Cokriging to Estimate Surface Air Temperature from Elevation." Theoretical and Applied Climatology 47, 147-57.
Jones, P. G., and P. K. Thornton (1999). "Fitting a Third-Order Markov Rainfall Model to Interpolated Climate Surfaces." Agricultural and Forest Meteorology 97(3), 213-31.
Kanevsky, M., R. Arutyunyan, L. Bolshov, V. Demyanov, and M. Maignan (1995). "Artificial Neural Networks and Spatial Estimations of Chernobyl Fallout." Geoinformatics 7(1-2), 5-11.
Kanevsky, M., R. Arutyunyan, L. Bolshov, V. Demyanov, S. Chernov, I. Linge, N. Koptelova, E. Savelieva, T. Haas, and M. Maignan (1997a). "Chernobyl Fallouts: Review of Advanced Spatial Data Analysis." In geoENV I--Geostatistics for Environmental Applications, edited by A. Soares, J. Gomez-Hernandes, and R. Froidvaux, pp. 389-400. Kluwer Academic Publishers.
Kanevsky, M., V. Demyanov, and M. Maignan (1997b). "Spatial Estimations and Simulations of Environmental Data by Using Geostatistics and Artificial Neural Networks." In IAMG'97 Proceedings of The Third Annual Conference of the International Association for Mathematical Geology, Barcelona, Spain, CIMNE, vol. 2, edited by V. Pawlowsky, p. 527.
Lawrence, S., C. L. Giles, and A. C. Tsoi (1996). "What Size Neural Network Gives Optimal Generalization? Convergence Properties of Backpropagation." Technical Report UMIACS-TR-96-22 and CS-TR3617. Institute for Advanced Computer Studies, University of Maryland.
Lee, S., S. Cho, and P. M. Wong (1998). "Rainfall Prediction Using Artificial Neural Networks." Journal of Geographic Information and Decision Analysis 2(2), 253-64.
Mitasova, H., L. Mitas, W. H. Brown, D. P. Gerdes, I. Kosinovsky, and T. Baker (1995). "Modelling Spatially and Temporally Distributed Phenomena: New Methods for GRASS GIS." International Journal of Geographic Information Systems 9(4), 433-46.
Nalder, I. A., and R. W. Wein (1998). "Spatial Interpolation of Climatic Normals: Test of a New Method in the Canadian Boreal Forest." Agricultural and Forest Meteorology 92(4), 211-25.
Nix, H. A. (1982). "Environmental Determinants of Biogeography and Evolution in Terra Australis." In Evolution of the Flora and Fauna of Arid Australia, edited by W.R. Barker and P.J.M. Greenslade, pp. 47-66. Peacock Publications.
_____ (1986). "A Biogeographic Analysis of the Australian Elapid Snakes." In Atlas of Elapid Snakes 7, edited by R. Longmore, pp. 4-15. Canberra: Australian Government Publishing Service.
Phillips, D. L., J. Dolph, and D. Marks (1992). "A Comparison of Geostatistical Procedures for Spatial Analysis of Precipitation in Mountainous Terrain." Agricultural and Forest Meteorology 58, 119-41.
Prechelt, L. (1994). "Proben1--A Set of Neural Network Benchmark Problems and Benchmarking Rules." Technical report, University of Karlsruhe, Germany.
Price, D. T., D. W. McKenney, I. A. Nalder, M. F. Hutchinson, and J. L. Kesteven (2000). "A Comparison of Two Statistical Methods for Spatial Interpolation of Canadian Monthly Mean Climate Data." Agricultural and Forest Meteorology 101(2-3), 81-94.
Riedmiller, M. (1994). "Advanced Supervised Learning in Multilayer Perceptrons: From Backpropagation to Adaptive Learning Techniques." International Journal of Computer Standards and Interfaces 16.
Riedmiller, M., and H. Braun (1993). "A Direct Adaptive Method for Faster Backpropagation Learning: The RPROP Algorithm." Proceedings of the IEEE International Conference on Neural Networks 1993. San Francisco: IEEE.
Running, S. W., R. R. Nemani, and R. D. Hungerford (1987). "Extrapolation of Synoptic Meteorological Data in Mountainous Terrain and Its Use Simulating Forest Evapotranspiration Rate and Photosynthesis." Canadian Journal of Forest Research 17, 472-83.
Sarle, W. S. (1994). "Neural Networks and Statistical Models." In Proceedings of the 19th Annual SAS Users Group International Conference.
_____ (2000). "AI-FAQ/Neural-Nets." ftp://ftp.sas.com/pub/neural/FAQ.html
Saveliev, A., S. S. Mucharamova, and C. A. Piliugin (1998). "Modeling of the Daily Rainfall Values Using Surface under Tension and Kriging." Journal of Geographic Information and Decision Analysis 2(2), 58-71.
Snell, S. E., S. Gopal, and R. K. Kaufmann (2000). "Spatial Interpolation of Surface Air Temperatures Using Artificial Neural Networks: Evaluating Their Use for Downscaling GCMs." Journal of Climate 13(5), 886-95.
Spreen, W. C. (1947). "A Determination of the Effect of Topography upon Precipitation." Transactions of the American Geophysical Union 28, 285-90.
Stillman, S. T., J. P. Wilson, C. Daly, M. Hutchinson, and P. E. Thornton (1996). "Comparison of ANUSPLIN, MTCLIM-3D, and PRISM Precipitation Estimates." In Third International Conference/Workshop on Integrating GIS and Environmental Modeling CD-ROM. January 21-25, 1996, Santa Fe, New Mexico, USA.
Tao, S., C. Fu, Z. Zeng, and Q. Zhang (1997). "Two Long-Term Instrumental Climatic Data Bases of the People's Republic of China." Institute of Atmospheric Physics, Chinese Academy of Sciences, Beijing, China. http://cdiac.esd.ornl.gov/epubs/ndp/ndp039/ndp039.html
Tibshirani, R. (1996). "A Comparison of Some Error Estimates for Neural Network Models." Neural Computation 8, 152-63.
Tomczak, M. (1998). "Spatial Interpolation and Its Uncertainty Using Automated Anisotropic Inverse Distance Weighting (IDW): Cross-Validation/Jackknife Approach." Journal of Geographic Information and Decision Analysis 2(2), 18-33.
Tveter, D. (1995). "The Backprop Algorithm. Backpropagator's Review." http://www.dontveter.com/bpr/bpr.html
Vose, R. S., R. L. Schmoyer, P. M. Steurer, T. C. Peterson, R. Heim, T. R. Karl, and J. K. Eischeid (1992). "The Global Historical Climatology Network: Long-Term Monthly Temperature, Precipitation, Sea Level Pressure, and Station Pressure Data." http://cdiac.esd.ornl.gov/epubs/ndp/ndp041/ndp041.html
Weitjers, A.J.M.M., and G.A.J. Hoppenbrouwers (1995). "Backpropagation Networks for Grapheme-Phoneme Conversion: A Nontechnical Introduction." In Artificial Neural Networks: An Introduction to ANN Theory and Practice. Lecture Notes in Computer Science, vol. 931, edited by P.J. Braspenning, F. Thuijsman, and A.J.M.M. Weitjers, pp. 11-36.
Wu, C. (1991). "Land Use Map of China." Institute of Geography, Chinese Academy of Sciences. Surveying and Mapping Publishing House, Beijing, China.
Xia, Y. L., M. Winterhalter, and P. Fabian (1999a). "A Model to Interpolate Monthly Mean Climatological Data at Bavarian Forest Climate Stations." Theoretical and Applied Climatology 64(1-2), 27-38.
Xia, Y. L., P. Fabian, A. Stohl, and M. Winterhalter (1999b). "Forest Climatology: Reconstruction of Mean Climatological Data for Bavaria, Germany." Agricultural and Forest Meteorology 96(1-3), 117-29.
The authors gratefully acknowledge the assistance of Dale Kaiser at Oak Ridge National Laboratories and Ping-Mei Liew of the Department of Meteorology, National Taiwan University with the climate data. The authors are also grateful to Li Xiubin of the Institute of Geographic Sciences and Natural Resources Research, Chinese Academy of Sciences, for supply of the vegetation data.
Brett A. Bryan is a lecturer and education coordinator at GISCA, a National Centre for Social Applications of GIS, University of Adelaide. Jonathan M. Adams is in the Department of Geographical and Environmental Studies, University of Adelaide.