Post Reporting on Statistics Flawed

A social science colleague once quipped that nobody he knew ever went to journalism school because they loved mathematics and statistics. The comment was triggered by an article in that day's local newspaper that had, to put it gently, misinterpreted some public opinion polling data.

An August 30 Post-Dispatch article on the quantitative methods used to assess residential property in St. Louis County was a dismal reminder of how badly journalists can butcher elementary statistics. The story was largely about a Town and Country homeowner appealing his assessment with a manager from the county's assessment division. In classic journalistic style, it employed a conflict narrative: the unhappy citizen versus the apply-the-rule-evenly bureaucrat.

The story was an opportunity for the Post to educate citizens about how St. Louis County (like most other large jurisdictions) uses statistical analysis to make assessment estimates. But it soon became evident that the reporter and his editor(s) were quantitatively challenged.

For example, the article stated that "the county employed a computer program called Multiple Regression Analysis." Well, multiple regression analysis is a statistical procedure, not a software package. Capitalizing it would be like capitalizing addition or subtraction.

Beyond that, the article could have explained multiple regression analysis so that readers would understand its role in the process. The technique uses two or more factors (e.g., recent sales prices of nearby homes, building square footage, and lot size) to estimate a single factor (e.g., current market value of a home). St. Louis County has more than 300,000 single-family residences. If it spent, say, $150 for a separate professional assessment for each one, the price tag would be over $45,000,000. Hence the need for an estimation tool.
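For readers who want to see the idea in action, here is a minimal sketch in Python of what a multiple regression does. It is not the county's actual model: the five sample homes, their values, and the function names are invented for illustration, and only two of the factors mentioned above (square footage and lot size) are used. The sample values were deliberately generated from a simple linear rule so that the fit is easy to check by hand. The last lines repeat the cost arithmetic from the paragraph above.

```python
# Illustrative sketch only. The homes, values, and function names below are
# hypothetical and are NOT St. Louis County's data or method.

def fit_multiple_regression(rows, targets):
    """Ordinary least squares via the normal equations (X'X) b = X'y."""
    # Prepend a constant 1.0 to every row so the model includes an intercept.
    X = [[1.0] + list(r) for r in rows]
    k = len(X[0])
    # Build the augmented matrix [X'X | X'y].
    A = [
        [sum(x[i] * x[j] for x in X) for j in range(k)]
        + [sum(x[i] * y for x, y in zip(X, targets))]
        for i in range(k)
    ]
    # Solve by Gauss-Jordan elimination with partial pivoting.
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        for r in range(k):
            if r != col:
                factor = A[r][col] / A[col][col]
                A[r] = [a - factor * b for a, b in zip(A[r], A[col])]
    # Each row now holds one coefficient: intercept first, then one per factor.
    return [A[i][k] / A[i][i] for i in range(k)]


def predict(coeffs, row):
    """Estimate the target for one home from its factor values."""
    return coeffs[0] + sum(c * v for c, v in zip(coeffs[1:], row))


# Hypothetical homes: (square footage in thousands, lot size in acres).
homes = [(1.5, 0.25), (2.0, 0.5), (2.5, 0.3), (3.0, 1.0), (1.8, 0.4)]
# Hypothetical market values in thousands of dollars, generated from
# value = 50 + 100 * sqft + 40 * lot so the result can be verified by hand.
values = [210.0, 270.0, 312.0, 390.0, 246.0]

coeffs = fit_multiple_regression(homes, values)
estimate = predict(coeffs, (2.2, 0.5))  # a home that was not in the sample
print("coefficients:", [round(c, 3) for c in coeffs])
print("estimated value (thousands):", round(estimate, 3))

# Cost of appraising every home individually, using the article's figures.
residences = 300_000
cost_per_assessment = 150
total_cost = residences * cost_per_assessment
print("individual assessments would cost:", total_cost)
```

With real sales data the points would not fall exactly on a line, and the fitted coefficients would be the ones that minimize the total squared error of the estimates rather than an exact match.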

The article quoted the homeowner, described as "a chemist with a knack for math," asking the bureaucrat to "show me the correlation of coefficient." For starters, there is no "of" between correlation and coefficient. The correct term is "correlation coefficient." …