Income Distribution in the Third World: Its Estimation Via Proxy Data
Saltz, Ira S., The American Journal of Economics and Sociology
The distribution of income in the Third World has always been an important concern of social scientists in many fields, but it remains difficult to measure. Since reliable data on the distribution of income exist for only about one-third of all developing countries, a serviceable means of estimating it would be useful. The purpose of this paper is to construct a reasonable means of estimating the distribution of income by expanding the familiar Kuznets (1955) relationship to include additional variables that are correlated with the distribution of income.(1) The general form of the Kuznets curve can be written:
$$\mathrm{SHARE} = a_0 + a_1 Y + a_2 Y^2 + u$$
where SHARE is the share of total income of either the richest or poorest households, Y is real GDP per capita, and u is a stochastic error term. The aim here is to expand this equation to include variables that help predict SHARE more precisely than the basic Kuznets specification does.
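As a minimal sketch of how the quadratic form behaves, the Python snippet below evaluates the systematic part of the Kuznets curve and locates its turning point, the income level at which SHARE stops rising and starts falling (or the reverse). The function names and the coefficient values are hypothetical and chosen only for illustration; they are not estimates from this paper.

```python
def kuznets_share(y, a0, a1, a2):
    """Systematic part of the Kuznets curve: a0 + a1*Y + a2*Y^2."""
    return a0 + a1 * y + a2 * y ** 2


def turning_point(a1, a2):
    """Income level where the quadratic is flat: d(SHARE)/dY = a1 + 2*a2*Y = 0."""
    if a2 == 0:
        raise ValueError("a2 must be nonzero for the curve to have a turning point")
    return -a1 / (2 * a2)


# Hypothetical coefficients (assumed for illustration only).
A0, A1, A2 = 50.0, 0.01, -2.5e-6

peak_income = turning_point(A1, A2)
peak_share = kuznets_share(peak_income, A0, A1, A2)
print(f"turning point at Y = {peak_income:.1f}, SHARE = {peak_share:.2f}")
```

A negative a_2 makes the turning point a peak (an inverted U), while a positive a_2 makes it a trough.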
Many variables that one could include are highly correlated with SHARE. The variables chosen in this paper are per capita ownership of automobiles (CARS), per capita caloric consumption (CAL), and the infant mortality rate (INF). Although substituting other variables for the ones chosen here yields nearly the same results, these three variables were found to best fit the criterion described later in the paper.
The construction of the approximation for the distribution of income in this paper involves a two-step process. First, using country data for which the distribution of income is known, we construct an equation to predict the distribution of income using two-stage least squares (2SLS). 2SLS is used to avoid possible simultaneous equation bias. Then, using the parameters estimated, we calculate predicted values of SHARE for a group of countries for which SHARE is not known.
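The two-step procedure can be sketched in Python as follows. This is an illustrative outline under stated assumptions, not the paper's actual specification: this section does not say which regressors are treated as endogenous or which instruments are used, so the sketch assumes a single endogenous regressor and a single excluded instrument, and all function names and data values are hypothetical. Step one estimates the equation by 2SLS on countries with known SHARE; step two applies the estimated parameters to a country whose SHARE is unknown.

```python
def solve(a, b):
    """Solve the square linear system a x = b by Gauss-Jordan elimination."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        # Partial pivoting: bring the largest remaining entry onto the diagonal.
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [x - factor * p for x, p in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def ols(rows, y):
    """Least-squares coefficients from the normal equations (X'X) b = X'y."""
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * t for r, t in zip(rows, y)) for i in range(k)]
    return solve(xtx, xty)


def predict(coefs, row):
    """Predicted value for one observation: the dot product of coefs and row."""
    return sum(c * v for c, v in zip(coefs, row))


def two_stage_least_squares(exog, endog, instruments, y):
    """2SLS with one endogenous regressor (an assumption of this sketch).

    exog and instruments are lists of rows (lists); endog and y are flat lists.
    Returns the coefficients on the exog columns followed by the endog coefficient.
    """
    z = [e + i for e, i in zip(exog, instruments)]      # all exogenous variables
    gamma = ols(z, endog)                               # first stage
    endog_hat = [predict(gamma, row) for row in z]      # fitted endogenous values
    x_hat = [e + [f] for e, f in zip(exog, endog_hat)]  # second-stage regressors
    return ols(x_hat, y)                                # second stage


# Hypothetical sample of countries with known SHARE (values made up).
exog = [[1.0] for _ in range(6)]                 # constant term only
instruments = [[1.0], [2.0], [3.0], [4.0], [5.0], [6.0]]
endog = [1.5, 1.9, 3.2, 3.8, 5.1, 6.3]
share = [2.0 + 3.0 * x for x in endog]           # noise-free, for checking

coefs = two_stage_least_squares(exog, endog, instruments, share)

# Step two: predicted SHARE for a country where SHARE is not observed.
predicted_share = predict(coefs, [1.0, 4.0])
```

Because the illustrative data are noise-free, the second stage recovers the generating coefficients (2 and 3) up to rounding. With real data, the second-stage standard errors would also need the usual 2SLS correction, which this sketch omits.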
This paper uses income distribution data compiled by the World Bank for 23 developing countries, each observed at some point during 1970-80. Singling out a particular year would shrink the data set to so few countries (perhaps three or four) as to render any empirical analysis impossible. This paper therefore operates on the necessary assumption that the distribution of income does not shift suddenly but evolves slowly over time.(2) Using data for the distribution of income within a few years of the midpoint of this decade thus allows for a suitably large data set without creating serious bias. Accordingly, all other variables used in the empirical analysis to follow are reported for 1975.
The measures for the distribution of income used are LOW20 and HIGH20, the income share of the poorest 20% and richest 20% of households, respectively. The data for LOW20 and HIGH20 are compiled by the World Bank and reported in various issues of the World Development Report.
Using OLS to estimate the Kuznets relation above yields:
$$\mathrm{LOW20} = 717 - 0.0032\,Y + 6.83 \times 10^{-7}\,Y^2$$
R-squared = .301
s.e.e. = 1.39
N = 23 
$$\mathrm{HIGH20} = 45.6 + 0.011\,Y - 2.95 \times 10^{-6}\,Y^2$$
R-squared = .160
s.e.e. = 5.89
N = 23 
where the terms in parentheses below the coefficients are the heteroskedasticity-consistent t-ratios (see White 1980), s.e.e. is the standard error of the estimate, and N is the number of observations. (See Appendix A for the data and countries used to estimate the parameters of the two equations above.) The Kuznets relationship explains only 30% of the variation in LOW20 and only 16% of the variation in HIGH20. More importantly from a policy perspective, it is the deviations from the values predicted by these two equations that are of interest. Thus, the purpose of this paper is to derive a more efficient and unbiased estimate of LOW20 and HIGH20. …
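To make the reported fit statistics concrete, the sketch below computes the deviations from predicted values, R-squared, and the standard error of the estimate from a set of actual and fitted shares. The data are made up for illustration, the function name is hypothetical, and the degrees-of-freedom convention (N minus the number of estimated parameters) is an assumption, since the text does not state which convention underlies the reported s.e.e.

```python
import math


def fit_statistics(actual, fitted, n_params):
    """Return (residuals, R-squared, standard error of the estimate).

    R-squared is computed as 1 - SSR/SST, which equals the share of variation
    explained when the fitted values come from OLS with an intercept.
    The s.e.e. divides SSR by N - n_params (an assumed convention).
    """
    residuals = [a - f for a, f in zip(actual, fitted)]
    ssr = sum(e * e for e in residuals)                # unexplained variation
    mean = sum(actual) / len(actual)
    sst = sum((a - mean) ** 2 for a in actual)         # total variation
    r_squared = 1.0 - ssr / sst
    see = math.sqrt(ssr / (len(actual) - n_params))
    return residuals, r_squared, see


# Hypothetical income shares (percent) and fitted values from a quadratic
# with three estimated parameters (a0, a1, a2).
actual = [5.0, 6.0, 7.0, 8.0, 9.0]
fitted = [5.5, 5.5, 7.0, 8.5, 8.5]

residuals, r_squared, see = fit_statistics(actual, fitted, n_params=3)
```

The residuals are the country-level deviations from the curve, and the s.e.e. summarizes their typical size in the same units as the income share.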