Reconstructing Population Density Surfaces from Areal Data: A Comparison of Tobler's Pycnophylactic Interpolation Method and Area-to-Point Kriging
Yoo, Eun-Hye, Kyriakidis, Phaedon C., Tobler, Waldo, Geographical Analysis
We compare Tobler's pycnophylactic interpolation method with the geostatistical approach of area-to-point kriging for distributing population data collected by areal unit in 18 census tracts in Ann Arbor for 1970 to reconstruct a population density surface. In both methods, (1) the areal data are reproduced when the predicted population density is upscaled; (2) physical boundary conditions are accounted for, if they exist; and (3) inequality constraints, such as the requirement of non-negative point predictions, are satisfied. The results show that when a certain variogram model, namely the de Wijsian model corresponding to the free-space Green's function of Laplace's equation, is used in the geostatistical approach under the same boundary condition and constraints as Tobler's approach, the predicted population density surfaces are almost identical (up to numerical errors and discretization discrepancies). The implications of these findings are twofold: (1) multiple attribute surfaces can be constructed from areal data using the geostatistical approach, depending on the particular point variogram model adopted (that variogram model need not be the one associated with Tobler's solution); and (2) it is the analyst's responsibility to justify whether the smoothness criterion employed in Tobler's approach is relevant to the particular application at hand. A notable advantage of the geostatistical approach over Tobler's is that it allows reporting the uncertainty or reliability of the interpolated values, with critical implications for uncertainty propagation in spatial analysis operations.
Population data are used extensively in decision-making processes in a wide range of social and economic applications, including but not limited to housing, regional policy, and health provision (Bracken 1994). Despite the increasing availability of socioeconomic data at the very high spatial resolution required in urban analysis and modeling, population microdata are still either suppressed or incomplete due to confidentiality and the high cost of data collection (Ryan, Maoh, and Kanaroglou 2009). Perhaps the most commonly available population data are summary statistics reported and mapped in irregular geographic areas (e.g., census tracts and enumeration districts), which often need to be transformed into spatial units compatible with other data sources (Thurstain-Goodwin and Unwin 2000). A particular case of such a transformation or estimation is the construction of a continuous population density surface, which is often favored in the literature (Tobler 1975, 1979; Goodchild, Anselin, and Deichmann 1993; Langford and Unwin 1994; Martin, Tate, and Langford 2000; Thurstain-Goodwin and Unwin 2000).
Starting from the conventional approach based on choropleth mapping, various areal interpolation methods have been used to construct a density surface from population data. Kernel smoothing, for example, produces a smooth surface that avoids both abrupt discontinuities along boundaries and the strict assumption of within-area homogeneity of the source data (Martin 1989, 1996). In this approach, however, the support differences between the source data and the prediction surface are not properly taken into account, because the areal data are collapsed into their corresponding representative points (e.g., polygon centroids). This transformation implicitly assumes that the areal unit or support of the population data is identical to that of the target surface, that is, a point.
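The consequence of collapsing areal data to centroids can be illustrated with a minimal sketch. The example below is not the procedure of Martin (1989, 1996); it assumes a one-dimensional grid of unit cells, a Gaussian kernel, and hypothetical zone labels and totals chosen purely for illustration. The function names `centroid_kernel_surface` and `reaggregate` are likewise our own. Each zone total is spread around its centroid, so the grand total is preserved, but the per-zone totals are generally not recovered when the surface is reaggregated.

```python
import numpy as np

def reaggregate(surface, labels, n_zones):
    """Sum the cell values of a surface back onto the source zones."""
    return np.bincount(labels, weights=surface, minlength=n_zones)

def centroid_kernel_surface(labels, totals, bandwidth):
    """Spread each zone total around the zone centroid with a Gaussian kernel.

    Assumptions (illustrative only): 1-D grid of unit cells, every zone
    owns at least one cell, and each kernel is renormalized on the grid so
    that the grand total is preserved.
    """
    labels = np.asarray(labels)
    totals = np.asarray(totals, dtype=float)
    n_zones = totals.size
    x = np.arange(labels.size) + 0.5                      # cell midpoints
    counts = np.bincount(labels, minlength=n_zones)
    centroids = np.bincount(labels, weights=x, minlength=n_zones) / counts
    z = (x[None, :] - centroids[:, None]) / bandwidth     # zones x cells
    weights = np.exp(-0.5 * z**2)
    weights /= weights.sum(axis=1, keepdims=True)         # each row sums to 1
    return totals @ weights                               # surface per cell

# Hypothetical data: three zones of unequal size and their totals.
labels = np.array([0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 2, 2])
totals = np.array([100.0, 30.0, 70.0])

surface = centroid_kernel_surface(labels, totals, bandwidth=1.5)
print("source zone totals:      ", totals)
print("reaggregated zone totals:", reaggregate(surface, labels, totals.size))
```

Because the kernel ignores zone boundaries, population leaks from each zone into its neighbors, which is one way to see why the point-support assumption matters.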
In demographic applications of areal interpolation, the absence of empirical data concerning the actual distribution of population at the target spatial resolution is a major obstacle for evaluating any model of population density surface (Martin, Tate, and Langford 2000). Owing to a lack of point-level data, original data reproduction (that is, whether a surface model reproduces the original areal data when the predicted population surface is reaggregated over the spatial units used to collect the population data) becomes an essential requirement for accurate interpolation (Tobler 1979; Lam 1983). …
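The data-reproduction requirement can be made concrete with a simplified, mass-preserving smoothing loop in the spirit of pycnophylactic interpolation. This is a sketch under stated assumptions, not Tobler's published algorithm: it works on a one-dimensional grid, uses a three-cell moving average as the smoother, imposes a zero-gradient boundary by edge padding, and restores zone totals by multiplicative rescaling. The function names and the example data are hypothetical.

```python
import numpy as np

def reaggregate(surface, labels, n_zones):
    """Sum the cell values of a surface back onto the source zones."""
    return np.bincount(labels, weights=surface, minlength=n_zones)

def mass_preserving_smooth(labels, totals, n_iter=200):
    """Iteratively smooth a zone-based surface while reproducing zone totals.

    Assumptions (illustrative only): 1-D grid, every zone owns at least one
    cell, zero-gradient boundary, multiplicative correction per zone.
    """
    labels = np.asarray(labels)
    totals = np.asarray(totals, dtype=float)
    n_zones = totals.size
    counts = np.bincount(labels, minlength=n_zones)
    surface = (totals / counts)[labels]            # choropleth starting surface
    for _ in range(n_iter):
        # Smooth: three-cell moving average with edge values repeated.
        padded = np.pad(surface, 1, mode="edge")
        surface = (padded[:-2] + padded[1:-1] + padded[2:]) / 3.0
        # Inequality constraint: no negative densities.
        surface = np.clip(surface, 0.0, None)
        # Data reproduction: rescale each zone back to its source total.
        zone_sums = reaggregate(surface, labels, n_zones)
        scale = np.divide(totals, zone_sums,
                          out=np.zeros_like(totals), where=zone_sums > 0)
        surface = surface * scale[labels]
    return surface

# Hypothetical data: three zones of unequal size and their totals.
labels = np.array([0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 2, 2])
totals = np.array([100.0, 30.0, 70.0])

surface = mass_preserving_smooth(labels, totals)
print("reaggregated zone totals:", reaggregate(surface, labels, totals.size))
```

Because the rescaling step is applied last in every iteration, the reaggregated surface matches the source totals up to floating-point error, whatever smoother is chosen. The smoothness criterion itself remains a modeling choice.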