Neural Network Hedonic Pricing Models in Mass Real Estate Appraisal
Steven Peterson and Albert B. Flanagan, The Journal of Real Estate Research
Using a large sample of 46,467 residential properties spanning 1999-2005, we demonstrate using matched pairs that, relative to linear hedonic pricing models, artificial neural networks (ANN) generate significantly lower dollar pricing errors, have greater pricing precision out-of-sample, and extrapolate better from more volatile pricing environments. While a single layer ANN is functionally equivalent to OLS, multiple layered ANNs are capable of modeling complex nonlinearities. Moreover, because parameter estimation in ANN does not depend on the rank of the regressor matrix, ANN is better suited to hedonic models that typically utilize large numbers of dummy variables.
Relative illiquidity, low turnover, and irregularly timed (or absent) cash flows confound the application of standard asset pricing models to real estate. In general, non-exchange traded assets such as private residential real estate are characterized by a lack of fundamentals and, thus, valuation is less a function of discounted present value than one of finding recently traded assets of comparable value.
It is the absence of asset fundamentals that gives rise to hedonic valuation models that extrapolate from means in large samples. These models essentially predict value by projecting a sample of known market values on their respective property characteristics (such as heated area, square footage, age, acreage), using the estimated parameters in conjunction with a vector of characteristics for a property of unknown value to imply a price. Large-scale implementation of linear hedonic models can be found, for instance, in automated valuation systems adopted in the mortgage finance industry (Rossini, Kershaw, and Kooymans, 1992, 1993; Detweiler and Radigan, 1996; and Baen and Guttery, 1997). Because these models are easy to specify, estimate, and extrapolate, they tend to be popular with end-users.
Nevertheless, usefulness depends on how well models minimize pricing errors. Pricing errors are responsible in part for denial of credit to otherwise creditworthy parties, resulting in type I error and goodwill loss, as well as extension of credit to parties with underestimated risk, a form of type II error (Shiller and Weiss, 1999). Hedonic models are exposed to pricing errors simply because they extrapolate means from large samples and, as such, will always be exposed to sampling error. Specification error is also unavoidable in ad hoc specifications and, to the extent that value does not map linearly onto property characteristics, so too are errors due to neglected nonlinearities.
To the extent that nonlinear models nest linear forms, then nonlinear models would be the preferred choice. However, the exact nonlinear form is neither apparent nor are there necessarily practical steps one could take to find the correct form. Artificial neural networks (ANN) do, however, provide a practical alternative to conventional least squares forms (including nonlinear least squares) that is easily implementable and which efficiently models nonlinearities in the underlying relationships (including the parameters).
We argue that neural networks are more robust to model misspecification and especially to various peculiarities in how explanatory variables are measured. Hedonic models rely heavily on property attributes and therefore many of the explanatory variables are categoricals or counts. Categoricals, such as location, zoning, or type of construction material, have no ordinal rankings and therefore do not belong in the regression function except in the form of dummy variables. The regressor matrix is often dominated by dummies and, as we show below, this increases the likelihood of rank failure. Linear models deal with this problem by aggregating cases within categories to produce fewer dummies, but at the cost of discarding useful information that helps discriminate between properties. Still other variables, such as number of baths or story height, while ordinal, are quite limited in their ranges. Nevertheless, these are often incorporated (perhaps with their squares as proxies for nonlinearities) as if they were continuously measured regressors.1 In fact, the model studied in this paper is restricted in the number and type of usable explanatory variables because OLS estimates were impossible to obtain due to rank failure in the matrix of regressors, a problem not inherent in feed forward networks, which do not require inverting the matrix of inputs. We will comment further on this point below.2
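The rank-failure problem is easy to reproduce. The sketch below (an illustrative stdlib Python example, not the authors' Matlab code) builds a regressor matrix containing an intercept plus a full set of exterior-composition dummies; because the dummies sum to the intercept column, the matrix has fewer independent columns than regressors and X'X cannot be inverted for OLS. The rank routine is a generic Gaussian elimination, included only to keep the example self-contained.

```python
def matrix_rank(rows, tol=1e-9):
    """Rank of a matrix (list of rows) via Gaussian elimination with partial pivoting."""
    m = [list(r) for r in rows]
    nrows, ncols = len(m), len(m[0])
    rank = 0
    for col in range(ncols):
        # Choose the largest pivot among the unreduced rows.
        pivot = max(range(rank, nrows), key=lambda r: abs(m[r][col]), default=None)
        if pivot is None or abs(m[pivot][col]) < tol:
            continue  # column is linearly dependent on earlier ones
        m[rank], m[pivot] = m[pivot], m[rank]
        pv = m[rank][col]
        for r in range(nrows):
            if r != rank:
                f = m[r][col] / pv
                m[r] = [a - f * b for a, b in zip(m[r], m[rank])]
        rank += 1
        if rank == nrows:
            break
    return rank

# Columns: intercept, d_wood, d_masonry, d_vinyl.  The three dummies are
# exhaustive, so they sum to the intercept column (the "dummy variable trap").
X = [
    [1, 1, 0, 0],
    [1, 0, 1, 0],
    [1, 0, 0, 1],
    [1, 1, 0, 0],
    [1, 0, 1, 0],
]
print(matrix_rank(X))  # 3, not 4: one redundant source of information
```

With 4 columns but rank 3, OLS has no unique solution until a dummy is dropped or categories are aggregated; a feed-forward network trained by backpropagation never inverts this matrix and so trains regardless.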
Despite this, mass appraisal and automated valuation systems tend to rely on linear models, partly due to convenience and to some degree because the costs of pricing errors are not fully understood. In this paper, we show, using a sample of 46,467 residential property sales from 1999 to 2005, that convenience can be expensive. We illustrate, using matched-pair t-tests of property valuations from ANN against linear hedonic pricing models, that the latter generate statistically significantly greater pricing error, that the magnitude of this error has become larger over time, and that ANN has greater relative pricing precision in-sample as well as out-of-sample. Finally, we measure the relative dollar savings as ANN minimizes pricing error as a mass appraisal tool.
The paper proceeds as follows. Section two contains a brief review of the literature and discussion of the underlying models. We note beforehand that while there is no consensus in the literature, many of the criticisms of neural networks in real estate valuation are based on small samples. One of our contributions to this literature is the large size of our database and the introduction of robust statistical tests as both our training and hold-out samples number in the thousands for each year of the study. Section three discusses the model specifications and the test methodology. Section four presents the results and section five offers some concluding remarks and suggestions for further research.
Artificially Intelligent Hedonic Pricing Models
There are conflicting views on the general relative performance of multiple regression based hedonic pricing models and neural networks. Studies by Tsukuda and Baba (1994), Do and Grudnitski (1992), Tay and Ho (1992), and Huang, Dorsey, and Boose (1994) all found neural networks to be superior to multiple regression. Allen and Zumwalt (1994) and Worzala, Lenk, and Silva (1995) suggested otherwise. In a more recent study, Guan, Zurada, and Levitan (2008) combine fuzzy set theory with neural network architecture to assess property values, extending the seminal work of Bagnoli, Smith, and Halbert (1998), who originally applied fuzzy logic to real estate valuation.
Nguyen and Cripps (2001) compared neural networks to multiple regression models based on a dataset of single-family houses and found that neural networks outperformed multiple regression models when the dataset was large. In their study, neural network models tended to overcome functional form misspecification as the sample size increased: while multiple regression performance was relatively independent of sample size, neural network performance improved. Our analysis supports these conclusions.
Other studies find significant nonlinearities between home value and age (Grether and Mieszkowski, 1974; and Do and Grudnitski, 1993) and home value and square footage (Goodman and Thibodeau, 1995). Although the presence of nonlinear mappings from factors such as age, size, and distance to home value is generally accepted, the cost of excluding them from the valuation model remains a topic of debate. Our findings suggest that the cost is significant; pricing errors in our linear models are significantly greater than those for our neural networks.
Worzala, Lenk, and Silva (1995) compared two neural networks to multiple regression models in the application of real estate appraisal. Their study, which was based on a small set of transactions within one town,3 concluded that neural networks were not superior to multiple regressions in residential real estate appraisal and warned appraisers who wish to use neural networks to do so with caution, citing inconsistent results between software packages, between runs of the same software package, and long run-times. It is indeed true that neural networks can be over-trained, resulting in good training runs but poor out-of-sample performance, that there is some sensitivity to outliers in training, and that results may be inconsistent in small samples. Results may also appear inconsistent for the simple reason that network weights are often randomly initialized; thus, gradient descent algorithms produce different solutions for the network's weight vectors simply because they begin iterating from different points on the loss surface. Ensemble averaging (Haykin, 2003) can be used effectively in dealing with this issue. On the other hand, software packages can produce different results simply because they utilize different learning algorithms and performance criteria for training. Many programming languages (we use Matlab) allow the user complete control of network design, training criteria, and simulation design, and long run-times have been eliminated with advances in processing power. As such, these criticisms, though not without merit, are hardly binding.
To the contrary, Guan, Zurada, and Levitan (2008) argue that neural networks better replicate agents' heuristic thought processes, as well as the imprecision in their decision calculus. Given the volume of research devoted to quasi-rational thought processes, such as various mental accounting rules (Thaler, 1985) and representative heuristics (Kahneman and Tversky, 1972), then neural-based models that incorporate fuzzy rules of logic are exciting developments in the field of property value assessment.
Critics of the neural networks also cite the relative ease of interpretation of hedonic multiple regression models; in particular, partial differentiation of linear models easily isolates each explanatory variable's contribution to value. Although differentiation of neural networks is more difficult given variable interdependencies, it is relatively straightforward to uncover individual variable attributions (Garson, 1991; and Intrator and Intrator, 2001). Thus, while the contribution of, say, square footage, to home value in a neural network cannot be reduced to a single beta, it can nevertheless be assessed by other means (e.g., simulation methods). At any rate, mass appraisal has relied primarily on a multiple regression framework despite the problems associated with nonlinearities, nonnormality of inputs, and multicollinearity.
Hidden nodes serve as "feature detectors" (Haykin, 2003); the signal sent to the output layer to be compared to the target (dependent variable) is a weighting of the hidden nodal outputs, themselves weighted functions of the inputs. Because hidden nodes weight inputs independently of one another, they present contrasting representations of the relationships between inputs and targets.
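The forward pass described above can be sketched in a few lines. This is an illustrative stdlib Python fragment (the paper's networks were built in Matlab, and all weights here are arbitrary): each hidden node applies its own weight vector to the inputs and squashes the result, and the output layer then weights the hidden activations.

```python
import math

def forward(x, W_hidden, b_hidden, w_out, b_out):
    """Single-hidden-layer feed-forward pass.

    Each hidden node j computes tanh(W_hidden[j] . x + b_hidden[j]),
    weighting the inputs independently of the other hidden nodes; the
    output is a linear combination of these hidden activations.
    """
    hidden = [math.tanh(sum(w * xi for w, xi in zip(row, x)) + b)
              for row, b in zip(W_hidden, b_hidden)]
    return sum(w * h for w, h in zip(w_out, hidden)) + b_out

# Two inputs, two hidden nodes, illustrative weights.
y = forward([0.5, -1.0],
            W_hidden=[[1.0, 0.2], [-0.3, 0.8]],
            b_hidden=[0.0, 0.1],
            w_out=[0.7, -0.4],
            b_out=0.05)
```

Note that if the tanh squashing function is replaced by the identity, the whole composition collapses to a single linear combination of the inputs, which is the sense in which a single-layer network is functionally equivalent to OLS.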
Experimental Design and Tests
We study a sample of 46,467 properties transacted over the period 1999-2005. These observations were taken from a database of over 180,000 observations of residential sales in Wake County, NC, part of the Raleigh-Cary Metropolitan Statistical Area. The data are part of the real estate master file, which is county government data used in property valuation and taxation on real estate. The database includes commercial and residential sales data in 20 townships, over 80 specific property types and over 18 property characteristics.
Our primary objective is to compare appraisal performance for linear hedonic models relative to ANNs. To that end, we test for statistical significance in relative pricing errors using matched-pair t-tests. Both models share the same inputs: age, number of units, lot size (acreage), number of stories, heated area, number of baths, and a dummy variable indicating exterior composition (wood, masonry, vinyl or aluminum siding). The dependent variable is the observed sale price.4 Summary statistics for these variables are presented in Exhibit 2.
Sampling a proportion p of N homes generates a training group of size pN and a hold-out of size (1-p)N. The experimental design consists of drawing 100 such randomly selected training samples of size pN in each year, reserving the remaining (1-p)N home sales for that year as a hold-out sample. Each training sample was used to train the neural network and, separately, to estimate the parameters of the linear model (using OLS).
Absolute pricing errors are computed for the training sample (these are equivalent to regression's "in-sample" results) and the estimated models are used to forecast the prices in the hold-out sample. In this fashion, we generate a random sample of paired pricing errors at the property level. The pricing error differential is the statistic of interest.
That is the basic experiment. This design is extended to accommodate various hold-out and training sample sizes. Specifically, we replicate the experiment for randomly selected training samples of sizes p equal to 10%, 25%, 50%, and 75% of the population of properties in each year. (The hold-out samples were the complementary samples.) This extension permits us to locate possible bias in model performance, especially as it relates to the relative ability of each model to extrapolate out of sample from various training sample sizes.
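The replication scheme can be sketched as follows. This is a stdlib Python illustration of the sampling design only; the function name, interface, and fixed seed are ours (the paper does not describe its random number generation).

```python
import random

def make_splits(properties, p, replications=100, seed=0):
    """Draw `replications` random training samples of proportion p.

    Each draw partitions the year's properties into a training sample of
    size int(p*N) and a complementary hold-out sample of the remainder,
    mirroring the paper's 100-replication design per year and per p.
    """
    rng = random.Random(seed)  # fixed seed for reproducibility (our choice)
    n_train = int(p * len(properties))
    splits = []
    for _ in range(replications):
        idx = list(range(len(properties)))
        rng.shuffle(idx)
        train = [properties[i] for i in idx[:n_train]]
        holdout = [properties[i] for i in idx[n_train:]]
        splits.append((train, holdout))
    return splits

# e.g., 100 draws at p = 0.25 for one year's sales:
splits = make_splits(list(range(1000)), p=0.25)
```

Each (train, holdout) pair would then be used to fit both models on the training sample and forecast prices on the hold-out sample.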
We also test in-sample and hold-out-sample pricing performance and present differences in root mean squared errors (RMSE), mean absolute pricing errors (MAPE), and the linear model's R-squared statistics.
The pricing error differential is the difference between the linear and neural network absolute forecast errors on a property-by-property basis. Positive differentials favor the neural network. The general pattern in Exhibit 3, Panels A and B, clearly favors the neural network regardless of sample size, and in general, this pattern holds both in-sample and for the hold-out-sample. The size of the pricing error tends to increase over time. To place these results in the proper context, Panel B summarizes mean home values and pricing errors as a percentage of mean value, both in-sample and out-of-sample, as well as for the various holdout sample sizes. Clearly, OLS performance is inversely related to the size of the training sample and suffers significantly as it is required to generalize to ever larger hold-out samples. Moreover, pricing errors are larger more recently, which may suggest either, or both, increasing price volatility6 or a shift in the relationship between house price and explanatory variables that is not captured by the linear model; note in Exhibit 4 that R-squared values are virtually constant over time (i.e., the linear model fails to capture the increased volatility in home prices).
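The matched-pair statistic behind these comparisons can be sketched directly. This is a stdlib Python illustration of a paired t-test on absolute pricing error differentials; the function name and interface are ours, and the toy numbers are not from the paper's data.

```python
import math
import statistics

def paired_t(errors_linear, errors_ann):
    """Matched-pair t-test on absolute pricing error differentials.

    d_i = |linear error_i| - |ANN error_i| for the same property i;
    a positive mean differential favors the neural network.
    Returns (t statistic, degrees of freedom).
    """
    d = [el - ea for el, ea in zip(errors_linear, errors_ann)]
    n = len(d)
    mean_d = statistics.mean(d)
    se = statistics.stdev(d) / math.sqrt(n)  # sample std error of the mean
    return mean_d / se, n - 1

# Toy absolute errors (in $000s) for four matched properties:
t, df = paired_t([3.0, 5.0, 4.0, 6.0], [1.0, 2.0, 3.0, 4.0])
```

The resulting t statistic is compared against a two-tailed critical value with n - 1 degrees of freedom, matching the degrees of freedom given in note 5.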
It is also interesting that the neural network extrapolates better from larger training sets, e.g., the performance of the linear model is somewhat flat as indicated by the R-squared statistics while the pricing error continues to increase with training sample size. In many cases, especially in more recent years, this error differential easily exceeds 1.5% of property value per year ($15 million on a $1 billion portfolio). That may seem small, but it is nevertheless statistically significant, and for mass appraisals (say, by mortgage lenders) this represents a considerable annual dead-weight loss potential, e.g., pricing errors lead to default losses, denial of credit, as well as LTV errors.
Additional statistical results on root mean squared error (RMSE) differentials and mean absolute pricing errors (MAPE) are given in Exhibit 4.7 Both statistics are error metrics expressed in dollar amounts. The story here reinforces that for the relative pricing error told above. Relative differences in RMSE, in particular, are on the same magnitude as pricing errors and get larger over time (absolutely and proportionately) and with the size of the training group. The same is true for MAPE, which for the linear model tells the same story as R-squared, i.e., R-squared values in the neighborhood of 75% suggest mean absolute pricing errors in the 20%-25% range.
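Both error metrics are straightforward to compute. The sketch below is a stdlib Python illustration; the MAPE function follows the percentage definition in note 7 (the exhibits also report dollar-denominated versions, which would simply omit the division by actual price). Function names and the toy numbers are ours.

```python
import math

def rmse(actual, predicted):
    """Root mean squared error, in the units of the prices."""
    n = len(actual)
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / n)

def mape(actual, predicted):
    """Mean absolute percentage error: (1/N) * sum_i |P_i - Phat_i| / P_i."""
    n = len(actual)
    return sum(abs(a - p) / a for a, p in zip(actual, predicted)) / n

# Toy prices (in $000s):
actual = [100.0, 200.0]
predicted = [90.0, 220.0]
print(rmse(actual, predicted))   # about 15.81 ($000s)
print(mape(actual, predicted))   # 0.10, i.e., 10%
```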
We have argued above that there are relevant differences in the manner in which ANN and OLS handle potential data problems, especially as these relate to rank failure in the matrix of regressors. Our point is that the presence of collinearity (or potential for rank failure) is a driving force in model specification. Even near-collinearity influences specification to the extent that standard errors are driven down, which often leads to respecification (we see few studies with insignificant regressors). When the rank condition does not fail, we still observe very high condition numbers for the covariance matrices of regressors (near-zero eigenvalues), which suggest instability in the OLS estimates. To illustrate, we constructed a set of dummies covering number of units (truncating these at five or more units), exterior composition (three dummies as noted above), story height (truncating at four or more stories), and number of baths (baths do not exceed ten in number). This data set had a total of 22 dummies but with rank equal to 19. There were therefore three redundant sources of information. While we could not estimate the model with OLS without further restrictions, we could with ANN because backpropagation does not invert the matrix of regressors.
We have shown that linear appraisal methods generate significant mispricing errors relative to a basic feed forward nonlinear artificial neural network. These results are robust; the sample of roughly 46,000 property sales spanning the seven-year period 1999-2005 produces ample degrees of freedom regarding our statistical tests and the randomization scheme for selecting hold-out and training groups reduces any effects due to sampling error. Our major conclusion is that linear hedonic valuation models produce avoidable valuation costs and that these costs are due primarily to nonlinearities in the relationships between property characteristics and value. And, while artificial neural networks may be one of several nonlinear methodologies, methods such as nonlinear least squares are impractical primarily because there is little guidance directing functional form.
Much of the data on property characteristics that hedonic models rely upon are discretely valued, as either simple counts (number of baths or stories) or categoricals (location code). In such instances, the matrix of explanatory variables consists primarily of a large number of dummy variables, often with a failed rank condition. Thus, the covariance matrix of explanatory variables cannot be inverted (without first reducing the number of variables in the regression) or, if invertible, produces very imprecise estimates of the pricing model's coefficients. Backpropagation, on the other hand, splits the influence of redundant information across the nodes and, because there is no information matrix to invert, specification searches are not as important to ANN.
In sum, research into hedonic pricing models should argue in favor of nonlinear modeling strategies - artificial neural networks are but one such easily implemented method that is relatively neutral to the many data problems that plague OLS. The bottom line in this paper is that pricing errors in linear models are significant, avoidable, and therefore costly, and that data problems exacerbate these costs.
1 See, for example, Do and Grudnitski (1992), Worzala, Lenk, and Silva (1995), Goodman and Thibodeau (1998), and Nguyen and Cripps (2001), for which price is linear in the number of bedrooms and baths and where property age enters the regression up to the fourth power.
2 Our data set spans 20 different townships and 19 different planning jurisdictions. Incorporating this information without aggregating townships, for example, would often lead to rank failure.
3 Their training set consisted of 217 properties with a hold-out sample of 71 properties. Do and Grudnitski (1993) studied 105 properties while Tay and Ho (1992) trained on 833 properties and tested their network on a hold-out sample of 222 properties. Our data base consists of 46,467 properties spanning 1999-2005.
4 We also tested the semi-log form in which the dependent variable was the natural log of sales price. This model uniformly underperformed the linear model. Semi-log forms are equivalent to y = exp(Xβ + ε), which is a restriction on the functional form, in this case a particular nonlinear specification.
5 We test a two-tailed alternative with degrees of freedom pN - 1 or (1 - p)N - 1.
6 Volatility increased 35% from 1999 to 2005.
7 MAPE is formally defined as (Σ_{i=1}^{N} |P_i − P̂_i| / P_i) / N, where P_i is the actual price and P̂_i is the predicted price.
8 We borrow from the notation in Hagan, Demuth, and Beale (1997).
Allen, W.C. and J.K. Zumwalt. Neural Networks: A Word of Caution. Unpublished Working Paper, Colorado State University, 1994.
Bagnoli, C, B. Smith, and C. Halbert. The Theory of Fuzzy Logic and its Application to Real Estate Valuation. Journal of Real Estate Research, 1998, 16:2, 169-200.
Baen, J. S. and R.S. Guttery. The Coming Downsizing of Real Estate: The Implications of Technology. Journal of Real Estate Portfolio Management, 1997, 3:1, 1-18.
Detweiler, J.H. and R.E. Radigan. Computer-Assisted Real Estate Appraisal: A Tool for the Practicing Appraiser. The Appraisal Journal, 1996, 91-101.
Do, A.Q. and G. Grudnitski. A Neural Network Approach to Residential Property Appraisal. The Real Estate Appraiser, 1992, 58, 38-45.
_____. A Neural Network Analysis of the Effect of Age on Housing Values. Journal of Real Estate Research, 1993, 8:2, 253-64.
Garson, G.D. Interpreting Neural-Network Connection Weights. Artificial Intelligence Expert. 1991, 6, 47-51.
Goodman, A.C. and T.G. Thibodeau. Age-Related Heteroskedasticity in Hedonic House Price Equations. Journal of Housing Research, 1995, 6, 25-42.
Grether, D. and P. Mieszkowski. Determinants of Real Values. Journal of Urban Economics, 1974, 1:2, 127-45.
Guan, J., J. Zurada, and A.S. Levitan. An Adaptive Neuro-Fuzzy Inference System Based Approach to Real Estate Property Assessment. Journal of Real Estate Research, 2008, 30: 4, 395-422.
Hagan, M.T., H.B. Demuth, and M. Beale. Neural Network Design. First edition. Massachusetts: PWS Publishing Co., 1997.
Haykin, S. Neural Networks: A Comprehensive Foundation. Second edition. New Jersey: Prentice Hall, 2003.
Huang, C-S., R.E. Dorsey, and M.A. Boose. Life Insurer Financial Distress Prediction: Neural Network Model. Journal of Insurance Regulation, 1994, 13:2, 131-67.
Intrator, O. and N. Intrator. Interpreting Neural-Network Results: A Simulation Study. Computational Statistics and Data Analysis, 2001, 37:3, 373-93.
Kahneman, D. and A. Tversky. Subjective Probability: A Judgment of Representativeness. Cognitive Psychology, 1972, 3, 430-54.
Nguyen, N. and A. Cripps. Predicting Housing Value: A Comparison of Multiple Regression Analysis and Artificial Neural Networks. Journal of Real Estate Research, 2001, 22:3, 313-36.
Rossini, P.A., P.J. Kershaw, and R.R. Kooymans. Microcomputer Based Real Estate Decision Making and Information Management - An Integrated Approach. Paper presented at the Second Australasian Real Estate Educators Conference, 1992.
Rossini, P.A., P.J. Kershaw, and R.R. Kooymans. Direct Real Estate Analysis - The UPmarket(TM) Approach to Real Estate Decision Making. Paper presented at the Third Australasian Real Estate Educators Conference, 1993.
Shiller, R.J. and A.N. Weiss. Evaluating Real Estate Valuation Systems. Journal of Real Estate Finance and Economics, 1999, 18:2, 147-61.
Thaler, R.H. Mental Accounting and Consumer Choice. Marketing Science, 1985, 4, 199-214.
Tsukuda, J. and S.I. Baba. Predicting Japanese Corporate Bankruptcy in Terms of Financial Data Using Neural Networks. Computers & Industrial Engineering, 1994, 27:1-4, 445-48.
Tay, D.P.H. and D.K.H. Ho. Artificial Intelligence and the Mass Appraisal of Residential Apartments. Journal of Property Valuation and Investment, 1992, 10:2, 525-39.
Worzala, E., M. Lenk, and A. Silva. An Exploration of Neural Networks and Its Application to Real Estate Valuation. Journal of Real Estate Research, 1995, 10:2, 185-201.
Steven Peterson, Virginia Commonwealth University and Virginia Retirement System, Richmond, VA 23218-2500 or email@example.com.
Albert B. Flanagan, Williams Appraisers, Inc., Raleigh, NC 27607 or benji@williamsappraisers.com.
Publication information: Neural Network Hedonic Pricing Models in Mass Real Estate Appraisal, by Steven Peterson and Albert B. Flanagan. The Journal of Real Estate Research, Volume 31, Issue 2, April-June 2009, 147+. © American Real Estate Society.