The Bias of the RSR Estimator and the Accuracy of Some Alternatives

William N. Goetzmann (*)

Liang Peng (**)

This paper analyzes the implications of cross-sectional heteroskedasticity in the repeat sales regression (RSR). RSR estimators are essentially geometric averages of individual asset returns because of the logarithmic transformation of price relatives. We show that the cross-sectional variance of asset returns in a given period affects the magnitude of the bias in the average return estimate for that period, while reducing the bias for the surrounding periods. Correcting the bias with an approximation method is not straightforward. We suggest an unbiased maximum likelihood alternative to the RSR that directly estimates index returns, which we term MLRSR. The unbiased MLRSR estimators are analogous to the RSR estimators but are arithmetic averages of individual asset returns. Simulations show that these estimators are robust to time-varying cross-sectional variance and that the MLRSR may be more accurate than RSR and some alternative methods.

The repeat sales regression (RSR), first described by Bailey, Muth and Nourse (1963), is widely used to infer returns of equal-weighted portfolios of assets through time. (1) Most applications of RSR have been in the area of home price index estimation. Indeed, local home indices constructed with the RSR are becoming the benchmarks for home appraisal: the RSR allows web-based home price estimates that can be used for quick mortgage assessment and approval. Although it is now becoming a pervasive tool for credit analysis, the RSR has some well-known econometric flaws. (2) One of them is that RSR estimators are biased downwards relative to actual portfolio returns.

This bias is undesirable because the most common use of an index may be to estimate the current value of its underlying portfolio or of an asset in the portfolio. While equal-weighted portfolios of assets have returns that are arithmetic averages of cross-sectional individual asset returns, the repeat sales estimators are essentially cross-sectional geometric averages. Because of Jensen's inequality, the logarithmic transformation of the price relatives used as the dependent variable in the repeat sales regression results in a bias: the RSR averages logs rather than taking the log of an average. Thus, after exponentiation, the RSR estimators are geometric averages instead of arithmetic averages.
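
The gap between averaging logs and taking the log of an average can be seen in a small numerical sketch. The four price relatives below are hypothetical values chosen only for illustration; they do not come from the paper.

```python
import math

# Hypothetical gross returns (price relatives) of four assets in one period.
gross_returns = [0.90, 1.00, 1.10, 1.40]

# Equal-weighted portfolio return: arithmetic average of the price relatives.
arithmetic_mean = sum(gross_returns) / len(gross_returns)

# Averaging the logs and then exponentiating gives the geometric average.
mean_log = sum(math.log(r) for r in gross_returns) / len(gross_returns)
geometric_mean = math.exp(mean_log)

# Jensen's inequality: mean(log R) <= log(mean R), so geometric <= arithmetic.
bias = geometric_mean - arithmetic_mean

print(f"arithmetic mean: {arithmetic_mean:.4f}")
print(f"geometric mean:  {geometric_mean:.4f}")
print(f"difference:      {bias:.4f}")
```

The difference is negative whenever the returns are not all equal, and it grows with their cross-sectional dispersion.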

Three methods have been suggested to address the bias problem. Shiller (1991) proposes arithmetic-average price estimators for equal-weighted and value-weighted portfolios. The estimators are analogous to the RSR estimators and easy to calculate. Goetzmann (1992) proposes a method that approximates the arithmetic means given RSR estimators, under the assumption that asset returns in each period are lognormally distributed and the cross-sectional variance is constant over time. In another attempt toward unbiased estimators, Geltner and Goetzmann (2000) propose a nonlinear method that minimizes the sum of squared residuals directly without taking logs first.
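
The lognormal assumption behind the approximation approach can be illustrated with a short simulation. This is a sketch only and is not the procedure of Goetzmann (1992), whose details are not given in this excerpt. It relies on the standard identity that a lognormal price relative with log mean mu and log variance sigma^2 has arithmetic mean exp(mu + sigma^2 / 2). The values of `mu`, `sigma` and `n` are hypothetical.

```python
import math
import random

random.seed(0)

mu, sigma = 0.02, 0.30   # hypothetical mean and std. dev. of log returns
n = 50_000               # hypothetical number of assets in the cross-section

# Simulated single-period log returns for the cross-section.
log_returns = [random.gauss(mu, sigma) for _ in range(n)]

mean_log = sum(log_returns) / n
var_log = sum((x - mean_log) ** 2 for x in log_returns) / (n - 1)

# Geometric average: what exponentiating an average of logs recovers.
geometric_mean = math.exp(mean_log)

# Arithmetic average of the price relatives: the equal-weighted portfolio return.
arithmetic_mean = sum(math.exp(x) for x in log_returns) / n

# Lognormal identity: E[R] = exp(mu + sigma^2 / 2).
corrected = math.exp(mean_log + 0.5 * var_log)

print(f"geometric:  {geometric_mean:.4f}")
print(f"arithmetic: {arithmetic_mean:.4f}")
print(f"corrected:  {corrected:.4f}")
```

In this setting the corrected value lies much closer to the arithmetic mean than the geometric mean does. The correction depends on knowing the cross-sectional variance, and here that variance is constant by construction.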

Though the bias problem of RSR is well known, its source and magnitude may not be well understood. In this paper, we interpret RSR estimators as sample statistics and show how they are simultaneously determined in the regression and how they actually mimic cross-sectional geometric sample means. Specifically, we interpret each RSR estimator as a geometric average of proxies of individual single-period asset returns. As a result, we are able to explicitly decompose the bias of RSR estimators into two components and study them separately.
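
A minimal simulation can illustrate how RSR coefficients are determined jointly and how they track geometric rather than arithmetic averages. The sketch assumes the textbook repeat sales setup, which this excerpt does not spell out: the log price relative of each repeat-sale pair is regressed on dummy variables equal to one for every period the asset was held. All parameter values, the uniform holding-period design and the helper `solve` are hypothetical, and the code does not reproduce the paper's decomposition of the bias. Only the standard library is used.

```python
import math
import random

random.seed(1)

T = 5                                  # number of periods (hypothetical)
n_pairs = 40_000                       # number of repeat-sale pairs (hypothetical)
sigma = 0.3                            # cross-sectional std. dev. of log returns
mu = [0.01, 0.03, -0.02, 0.02, 0.00]   # mean log return per period

xtx = [[0.0] * T for _ in range(T)]    # accumulates X'X
xty = [0.0] * T                        # accumulates X'y
sum_gross = [0.0] * T                  # accumulates single-period price relatives

for _ in range(n_pairs):
    # Each asset has a return in every period, but only the price relative
    # between its two sale dates enters the regression.
    r = [random.gauss(m, sigma) for m in mu]
    buy = random.randrange(0, T)             # purchase time, 0 .. T-1
    sell = random.randrange(buy + 1, T + 1)  # sale time, buy+1 .. T
    held = range(buy, sell)                  # 0-based periods held
    y = sum(r[t] for t in held)              # log price relative of the pair
    for t in range(T):
        sum_gross[t] += math.exp(r[t])
    for s in held:
        xty[s] += y
        for t in held:
            xtx[s][t] += 1.0


def solve(a, b):
    """Solve a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda i: abs(m[i][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for i in range(n):
            if i != col:
                f = m[i][col] / m[col][col]
                m[i] = [mi - f * mc for mi, mc in zip(m[i], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


beta = solve(xtx, xty)                                # OLS coefficients (log returns)
rsr_estimate = [math.exp(b) for b in beta]            # exponentiated RSR estimates
arithmetic_mean = [g / n_pairs for g in sum_gross]    # equal-weighted portfolio returns

for t in range(T):
    print(f"period {t + 1}: RSR {rsr_estimate[t]:.4f}  arithmetic {arithmetic_mean[t]:.4f}")
```

Under these assumptions the exponentiated coefficients should fall below the arithmetic averages in every period, since the regression recovers the mean of the log returns rather than the log of the mean return.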

Our analysis shows that the two components of the bias arise from two distinct impacts of the logarithmic transformation of the price relatives: the direct impact and the serial impact. These two impacts push RSR coefficients in opposite directions: the direct impact biases the coefficients downwards, while the serial impact biases them upwards. …