Geography, Spatial Data Analysis, and Geostatistics: An Overview

Article excerpt

Geostatistics is a distinctive methodology within the field of spatial statistics. In the past, it has been linked to particular problems (e.g., spatial interpolation by kriging) and types of spatial data (attributes defined on continuous space). It has been used more by physical than human geographers because of the nature of their data. The approach taken by geostatisticians has several features, discussed here, that distinguish it from the methods typically used by human geographers for analyzing spatial variation associated with regional data. Geostatisticians attach much importance to estimating and modeling the variogram to explore and analyze spatial variation because of the insight it provides. This article identifies the benefits of geostatistics, reviews its uses, and examines some of the recent developments that make it valuable for the analysis of data on areal supports across a wide range of problems.

Introduction

As an introduction to this special issue, this article provides an overview of the core concepts and techniques of geostatistics, together with a short literature review of its application in the environmental sciences and in geography. Geostatistics has a long history of application in the environmental sciences, where data are on a point or small regular area support, but it is now being applied to regional data, where data are on an areal support that might be large and either regular or irregular. We describe the new tools associated with the latter type of data and contrast them with techniques of spatial data analysis with which geographers, especially human geographers, are familiar. These techniques have descended more or less directly from work begun in the 1960s by Dacey (1968) and Cliff and Ord (1969), work that was also the subject of a recent special issue of Geographical Analysis (2009, issue 4). Geostatistics, by contrast, has a different lineage and uses a different set of tools and techniques.

The use of the term spatial analysis in geography can be traced back to the 1950s (see, e.g., Berry and Marble 1968). It includes several distinctive elements (Haining 2003, pp. 4-5), but the statistical analysis of spatial data is the focus here, referred to by statisticians as spatial statistics (Ripley 1981) or statistics for spatial data (Cressie 1993). Geographers often refer to these as methods for spatial data analysis (Haining 1993), and many of these models and techniques figure prominently in geographic information science (Goodchild and Haining 2004) and spatial econometrics (Anselin 1988).

The roots of spatial statistics can be traced back to analyses of agricultural field trial data by statisticians in the early part of the twentieth century. Geostatistics is a component of spatial statistics, although its evolution has been led principally by applied scientists and mathematicians rather than by classically trained statisticians. This historical context may explain why little cross-fertilization occurred with other branches of spatial statistics until quite recently (Cressie 1993; Diggle and Ribeiro 2007) and why geostatistics is distinctive.

Any methodology for analyzing spatial data needs to recognize that such data have the fundamental property of spatial dependence or spatial autocorrelation. For many attributes, values recorded at locations close together in space are correlated (autocorrelated); as the separating distance increases, autocorrelation weakens. (1) The autocorrelation structure in a region may be complex, with several scales of variation nested within, or superimposed on, one another, varying with direction (anisotropic) and between subareas (spatially heterogeneous). Quantifying spatial dependence matters, whether the purpose of an analysis is to interpolate, to fit a regression model, or to test a hypothesis (Haining 2003, pp. 33-36, 40-41). Different branches of spatial statistics model spatial dependence in different ways. …
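To make concrete the idea that dependence weakens as separating distance increases, the following is a minimal sketch of the classical method-of-moments estimator of the semivariogram, in which the value for a given lag is half the mean squared difference between all pairs of observations whose separation falls in that lag bin. It is an illustration under stated assumptions rather than a method taken from this article: the function name empirical_semivariogram, the bin edges, and the four-point transect data are all hypothetical, and the sketch assumes isotropy (direction is ignored) and Python 3.8 or later for math.dist.

```python
import math
from itertools import combinations


def empirical_semivariogram(coords, values, bin_edges):
    """Method-of-moments semivariogram estimate per distance bin.

    For each bin [bin_edges[b], bin_edges[b + 1]), returns half the mean
    squared difference between values of all point pairs whose separation
    distance falls in that bin, plus the number of pairs used.
    Bins with no pairs get None.
    """
    n_bins = len(bin_edges) - 1
    sums = [0.0] * n_bins   # running sum of squared differences per bin
    counts = [0] * n_bins   # number of point pairs per bin

    # Visit every unordered pair of (location, value) observations once.
    for (p, zp), (q, zq) in combinations(zip(coords, values), 2):
        h = math.dist(p, q)  # Euclidean separation distance
        for b in range(n_bins):
            if bin_edges[b] <= h < bin_edges[b + 1]:
                sums[b] += (zp - zq) ** 2
                counts[b] += 1
                break

    gamma = [0.5 * s / c if c else None for s, c in zip(sums, counts)]
    return gamma, counts


# Hypothetical toy data: four observations along a transect, unit spacing.
coords = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0), (3.0, 0.0)]
values = [1.0, 2.0, 4.0, 7.0]
bin_edges = [0.5, 1.5, 2.5, 3.5]  # bins centered on lags 1, 2, and 3

gamma, counts = empirical_semivariogram(coords, values, bin_edges)
for lag, g, n in zip((1, 2, 3), gamma, counts):
    print(f"lag {lag}: semivariance = {g:.3f} from {n} pairs")
```

In this toy transect the estimated semivariance rises with lag, which is the pattern expected when nearby values are more alike than distant ones. With real data, the number of pairs per bin matters: estimates at long lags rest on few pairs and are correspondingly unreliable.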