Neural Network Hedonic Pricing Models in Mass Real Estate Appraisal
Peterson, Steven; Flanagan, Albert B. The Journal of Real Estate Research
Using a large sample of 46,467 residential properties spanning 1999-2005, we demonstrate using matched pairs that, relative to linear hedonic pricing models, artificial neural networks (ANN) generate significantly lower dollar pricing errors, have greater pricing precision out-of-sample, and extrapolate better from more volatile pricing environments. While a single-layer ANN is functionally equivalent to OLS, multilayer ANNs are capable of modeling complex nonlinearities. Moreover, because parameter estimation in ANN does not depend on the rank of the regressor matrix, ANN is better suited to hedonic models that typically utilize large numbers of dummy variables.
Relative illiquidity, low turnover, and irregularly timed (or absent) cash flows confound the application of standard asset pricing models to real estate. In general, non-exchange traded assets such as private residential real estate are characterized by a lack of fundamentals and, thus, valuation is less a function of discounted present value than one of finding recently traded assets of comparable value.
It is the absence of asset fundamentals that gives rise to hedonic valuation models that extrapolate from means in large samples. These models essentially predict value by projecting a sample of known market values on their respective property characteristics (such as heated area, square footage, age, acreage), and then using the estimated parameters in conjunction with a vector of characteristics for a property of unknown value to imply a price. Large-scale implementation of linear hedonic models can be found, for instance, in automated valuation systems adopted in the mortgage finance industry (Rossini, Kershaw, and Kooymans, 1992, 1993; Detweiler and Radigan, 1996; and Baen and Guttery, 1997). Because these models are easy to specify, estimate, and extrapolate, they tend to be popular with end users.
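The two-step procedure described above (estimate parameters from properties with known values, then apply them to the characteristics of a property of unknown value) can be sketched as follows. This is only a minimal illustration under stated assumptions: the data are synthetic, the coefficient values are made up, and the helper name `implied_price` is hypothetical rather than anything taken from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic sample (assumption): heated area (sq ft), age (years), acreage.
n = 200
X = np.column_stack([
    rng.uniform(800, 4000, n),
    rng.uniform(0, 60, n),
    rng.uniform(0.1, 2.0, n),
])

# Made-up "true" parameters: intercept followed by three slopes.
true_beta = np.array([50_000.0, 90.0, -800.0, 20_000.0])
X1 = np.column_stack([np.ones(n), X])            # prepend an intercept column
price = X1 @ true_beta + rng.normal(0, 10_000, n)

# Step 1: project known market values on property characteristics (OLS).
beta_hat, *_ = np.linalg.lstsq(X1, price, rcond=None)

def implied_price(characteristics, beta):
    """Step 2: apply estimated parameters to a property of unknown value."""
    return float(beta[0] + np.dot(characteristics, beta[1:]))

# Hypothetical subject property: 2,000 sq ft, 15 years old, half an acre.
subject = np.array([2000.0, 15.0, 0.5])
estimate = implied_price(subject, beta_hat)
print(round(estimate))
```

The estimate inherits both the sampling error in `beta_hat` and any error from assuming that value is linear in the characteristics.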
Nevertheless, usefulness depends on how well models minimize pricing errors. Pricing errors are responsible in part for denial of credit to otherwise creditworthy parties, resulting in type I error and goodwill loss, as well as extension of credit to parties with underestimated risk, a form of type II error (Shiller and Weiss, 1999). Hedonic models are exposed to pricing errors simply because they extrapolate means from large samples and, as such, will always be exposed to sampling error. Specification error is also unavoidable in ad hoc specifications and, to the extent that value does not map linearly onto property characteristics, so too are errors due to neglected nonlinearities.
To the extent that nonlinear models nest linear forms, nonlinear models would be the preferred choice. However, the exact nonlinear form is not apparent, nor are there necessarily practical steps one could take to find it. Artificial neural networks (ANN) do, however, provide a practical alternative to conventional least squares forms (including nonlinear least squares) that is easily implemented and that efficiently models nonlinearities in the underlying relationships (including the parameters).
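As a rough illustration of how a network can nest the linear form, the sketch below trains a single-layer network with a linear output by plain gradient descent on mean squared error and compares its weights with the OLS coefficients. The data are synthetic, and the training rule, learning rate, and iteration count are assumptions chosen for this example; they are not the authors' estimation procedure.

```python
import numpy as np

rng = np.random.default_rng(1)

# Synthetic, standardized features and a linear target (assumption).
n, k = 300, 3
X = rng.normal(size=(n, k))
y = 2.0 + X @ np.array([1.5, -0.7, 0.3]) + rng.normal(scale=0.1, size=n)
X1 = np.column_stack([np.ones(n), X])   # bias term as a column of ones

# Benchmark: ordinary least squares.
beta_ols, *_ = np.linalg.lstsq(X1, y, rcond=None)

# Single-layer "network": one linear unit, weights w, trained iteratively.
w = np.zeros(k + 1)
learning_rate = 0.1
for _ in range(2000):
    residual = X1 @ w - y
    grad = 2.0 * X1.T @ residual / n    # gradient of mean squared error
    w -= learning_rate * grad

# Largest absolute difference between network weights and OLS coefficients.
max_gap = float(np.max(np.abs(w - beta_ols)))
print(max_gap)
```

Note that the iterative update never forms or inverts the cross-product matrix of the regressors. Adding hidden units with nonlinear activations would extend the same training loop beyond the linear case.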
We argue that neural networks are more robust to model misspecification and especially to peculiarities in how explanatory variables are measured. Hedonic models rely heavily on property attributes and therefore many of the explanatory variables are categoricals or counts. Categoricals, such as location, zoning, or type of construction material, have no ordinal rankings and therefore do not belong in the regression function except in the form of dummy variables. The regressor matrix is often dominated by dummies and, as we show below, this increases the likelihood of rank failure. Linear models deal with this problem by aggregating cases within categories to produce fewer dummies but at a cost of discarding useful information that helps discriminate between properties. Still other variables, such as number of baths or story height, while ordinal, are quite limited in their ranges. …
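One simple way a dummy-heavy regressor matrix can lose rank is sketched below: with an intercept and a full set of category dummies, the dummy columns sum to the intercept column, so the matrix has fewer independent columns than parameters. The zoning codes and variable names are hypothetical, and this small example is not the paper's own demonstration.

```python
import numpy as np

# Hypothetical categorical attribute: zoning code for six properties.
zoning = np.array(["R1", "R2", "C1", "R1", "C1", "R2"])
categories = np.unique(zoning)           # sorted unique codes: C1, R1, R2

# One 0/1 dummy column per category.
dummies = (zoning[:, None] == categories[None, :]).astype(float)
intercept = np.ones((len(zoning), 1))

# Intercept plus every dummy: the dummy columns sum to the intercept column.
X_full = np.hstack([intercept, dummies])
# Intercept plus all but one dummy (the omitted category is the reference).
X_drop = np.hstack([intercept, dummies[:, 1:]])

rank_full = np.linalg.matrix_rank(X_full)
rank_drop = np.linalg.matrix_rank(X_drop)

print(X_full.shape[1], rank_full)   # 4 columns, rank 3: rank-deficient
print(X_drop.shape[1], rank_drop)   # 3 columns, rank 3: full column rank
```

A similar failure arises when a category has no observations in the estimation sample, since its dummy column is then all zeros.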