State-Level Opinions from National Surveys:
Poststratification Using Multilevel Logistic Regression
DAVID K. PARK, ANDREW GELMAN, JOSEPH BAFUMI
One of the first projects to simulate state-level opinions using national data was undertaken by Pool, Abelson, and Popkin (1965). When the three MIT scholars began their work, computer simulation was only about 20 years old. It was primarily used by engineers who tackled military problems, bridge designs, flight characteristics of new aircraft, and work assignment rules in factories.1 Pool et al. aggregated 64 survey data sets of national respondents from 1952 to 1960. They then used poll, voting, and census data to design 480 voter types based on such factors as income, religion, party, population density, region, race, and sex.2 Differences across states were attributed not to state-level factors but to differences in the proportion of voter types. With this assumption and the data in hand, they determined the percentage of each voter type that held an opinion of interest, weighted it according to the number of voters of each type in each state, and aggregated to state-level results. Their “best-fit” simulation did quite well with respect to vote choice in 1960. Their estimates differed by only 2.5 percent (in median state error) from the state-by-state election results.
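The weighting step described above can be sketched in a few lines of Python. This is only a minimal illustration of the general idea, not a reconstruction of Pool et al.'s procedure: the function name `poststratify`, the two voter types, the support rates, and the state counts are all hypothetical values invented for the example. Each state's estimate is the average of type-level support, weighted by how many voters of each type live in that state.

```python
def poststratify(support_by_type, counts_by_state):
    """Return a state-level opinion estimate for each state.

    support_by_type: share of each voter type holding the opinion
                     (estimated from pooled national surveys).
    counts_by_state: number of voters of each type in each state
                     (taken from census-style counts).
    """
    estimates = {}
    for state, counts in counts_by_state.items():
        total = sum(counts.values())
        # Weight each type's support by that type's count in the state.
        weighted = sum(support_by_type[t] * n for t, n in counts.items())
        estimates[state] = weighted / total
    return estimates


# Hypothetical inputs: two voter types and two states (illustrative only).
support_by_type = {"urban_catholic": 0.70, "rural_protestant": 0.40}
counts_by_state = {
    "State A": {"urban_catholic": 600, "rural_protestant": 400},
    "State B": {"urban_catholic": 200, "rural_protestant": 800},
}

estimates = poststratify(support_by_type, counts_by_state)
for state, share in estimates.items():
    print(f"{state}: {share:.2f}")
```

In this toy example the two states differ only in their mix of voter types, so the gap between their estimates (about 0.58 versus 0.46) comes entirely from composition, which is exactly the assumption the simulation relied on.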
Weber, Hopkins, Mezey, and Munger (1972–1973) undertook a similar project. Their paper took issue with research that emphasized socioeconomic variables as determinants of policy output in the American states. Weber et al. attributed those results to invalid measures of state opinion and instead proposed to estimate state-level opinion in much the