Academic journal article Journal of International Technology and Information Management

An Innovative Clustering Approach to Market Segmentation for Improved Price Prediction

Article excerpt


A main obstacle to accurate prediction is often the heterogeneous nature of data. Existing studies have pointed to data clustering as a potential way to reduce heterogeneity and thereby increase prediction accuracy. This paper describes an innovative clustering approach based on a novel adaptation of the Fuzzy C-Means algorithm and its application to market segmentation in real estate. Over 15,000 actual home sales transactions were used to evaluate our approach. The test results demonstrate that price prediction accuracy improves notably for some clustered market segments. In comparison with existing methods, our approach is simple to implement. It does not require additional data collection or costly development of models to incorporate socioeconomic factors into segmentation. Finally, our approach is not market specific and can be easily applied across different housing markets.

Keywords: Fuzzy c-means clustering, mass assessment, k-means clustering, real estate submarkets, cluster homogeneity, ANFIS


There is general agreement that a housing market consists of a set of submarkets. Various methods have been proposed for reliable detection of submarkets. Common methods for determining submarkets are mostly geographic, administratively determined, or statistics-based. Administratively determined boundaries, such as those used by local government assessment offices, can be ineffective (Zurada, Levitan, & Guan, 2011). Though geographic boundaries can be more effective (Fik, Ling, & Mulligan, 2003), such boundaries also restrict segmentation to spatial considerations alone. As Goodman and Thibodeau (2007) point out, consumers do not necessarily limit their housing search to spatially contiguous areas. Statistical methods often use specially developed neighborhood characteristics such as quality of schools and level of public safety. However, the development of such neighborhood characteristics can be challenging and costly (Goodman & Thibodeau, 2007).

This paper explores the feasibility of an untested approach to constructing submarkets, with the objective of better disaggregating the properties in a given market into more homogeneous clusters. The approach is based on an adaptation of the fuzzy C-means (FCM) algorithm. It builds on a proven and common human practice, uses data features readily available to local governments or appraisal offices, and is simple to implement without requiring complicated model development. As is common in such studies, we evaluated the validity of this newly introduced housing segmentation approach by measuring the accuracy of price prediction in the resulting submarkets. A new input variable, based on the prices of comparable properties in the same cluster, is added to the typical set of housing characteristics within the newly formed submarkets. Two different methods were used to predict the prices of properties in the clusters: the adaptive neuro-fuzzy inference system (ANFIS) and traditional multiple regression analysis (MRA). The test results, based on actual sales transactions in Louisville, KY, show that price prediction accuracy noticeably improves for some of the resulting clusters.
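
For readers unfamiliar with FCM, the following is a minimal sketch of the standard (unadapted) fuzzy C-means algorithm in Python with NumPy. It is not the adaptation proposed in this paper, whose details are not given in this excerpt. The function name `fuzzy_c_means`, its parameters, and the synthetic two-feature data are assumptions made purely for illustration; real housing features would normally be scaled to comparable ranges first.

```python
import numpy as np

def fuzzy_c_means(X, n_clusters, m=2.0, max_iter=300, tol=1e-5, seed=0):
    """Standard fuzzy C-means (illustrative; not the paper's adaptation).

    X: (n_samples, n_features) array. m > 1 is the fuzzifier.
    Returns (centers, U), where U[i, k] is the degree to which
    sample i belongs to cluster k (each row of U sums to 1).
    """
    rng = np.random.default_rng(seed)
    n_samples = X.shape[0]
    # Random initial memberships, normalized so each row sums to 1.
    U = rng.random((n_samples, n_clusters))
    U /= U.sum(axis=1, keepdims=True)
    for _ in range(max_iter):
        Um = U ** m
        # Cluster centers are membership-weighted means of the samples.
        centers = (Um.T @ X) / Um.sum(axis=0)[:, None]
        # Euclidean distance from every sample to every center.
        dist = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        dist = np.fmax(dist, 1e-12)  # guard against division by zero
        # Membership update: u_ik proportional to d_ik^(-2/(m-1)).
        inv = dist ** (-2.0 / (m - 1.0))
        U_new = inv / inv.sum(axis=1, keepdims=True)
        converged = np.abs(U_new - U).max() < tol
        U = U_new
        if converged:
            break
    return centers, U

# Hypothetical data: two well-separated groups of "properties"
# described by two already-scaled features.
data_rng = np.random.default_rng(42)
X = np.vstack([
    data_rng.normal(0.0, 0.5, size=(50, 2)),
    data_rng.normal(5.0, 0.5, size=(50, 2)),
])
centers, U = fuzzy_c_means(X, n_clusters=2)
labels = U.argmax(axis=1)  # hard assignment, if one is needed
```

Unlike k-means, each property receives a degree of membership in every cluster rather than a single label, so a property near a boundary can contribute to more than one submarket. Taking `U.argmax(axis=1)` is one simple way to obtain hard segments when a downstream model, such as a per-cluster regression, needs them.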

The paper is organized as follows. The next section provides a brief literature review. Section 3 presents the adapted FCM algorithm and its application to creating homogeneous clusters of properties in a given market. Section 4 describes the data set used in this study. Section 5 presents and discusses the results of testing using both MRA and ANFIS. Finally, Section 6 provides a summary and conclusions.


A main obstacle to efforts to obtain an accurate valuation of real estate properties is the heterogeneous nature of real estate data (Goodman & Thibodeau, 2007; Mark & Goldberg, 1988). …
