Our Primitive Art of Measurement
Trudy Banta, Peer Review
Every week brings news of another state legislature enacting "reforms" of K-12 education, based on students' test scores. Standardized test scores are being used to evaluate, compare, and fail teachers and schools. Will the "press to assess with a test" be focused next on postsecondary education? Will test scores be used to compare colleges and universities? Will public criticism, loss of public funds, and loss of students follow for some institutions? Will faculty feel pressure to raise students' test scores, perhaps narrowing the curriculum to focus on the tested skills?
Certainly the 2006 report of the Commission on the Future of Higher Education, A Test of Leadership: Charting the Future of U.S. Higher Education, prompted moves in this direction with its calls for a simple way to compare institutions and for public reporting of the results of learning assessments, including value-added measures. Today some one thousand colleges and universities are using one of three standardized measures of generic skills such as writing and critical thinking to test first-year students and seniors; now value added can be measured, reported publicly, and compared among institutions.
Unfortunately, very little is being written about what these tests of generic skills are actually measuring and with what accuracy. Virtually nothing has been published about the validity of the value-added measure. We do know that the institution-level correlation between students' scores on the tests of generic skills and their entering SAT/ACT scores is so high that prior learning accounts for at least two-thirds of the variance in institutions' scores. From the remaining one-third, we must subtract the effects of age, gender, socioeconomic status, race/ethnicity, college major, sampling error, measurement error, test anxiety, and students' motivation to perform conscientiously before we can examine the effects on learning of attending a particular college.
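The arithmetic behind the "two-thirds" claim is simply the squared correlation coefficient: if the institution-level correlation between entering SAT/ACT scores and test scores is r, then r-squared is the proportion of variance attributable to prior learning. A minimal sketch, using a hypothetical correlation value chosen only to match the article's figure:

```python
# Illustrative only: the share of variance in institutions' mean scores
# that is statistically "explained" by entering SAT/ACT scores is the
# squared institution-level correlation (r ** 2). The value of r below
# is hypothetical, picked to match the "at least two-thirds" claim.
r = 0.82  # hypothetical institution-level correlation

variance_explained = r ** 2              # share attributable to prior learning
variance_remaining = 1 - variance_explained  # everything else, including error

print(f"variance explained by entering scores: {variance_explained:.0%}")
print(f"variance left for all other factors:   {variance_remaining:.0%}")
```

A correlation of about 0.82 already implies that roughly 67 percent of the variance reflects what students brought with them, before any of the other confounding factors listed above is subtracted.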
Inevitably, institutional comparisons will be made on the basis of the negligible (perhaps 1-2 percent) share of variance that can be attributed to the contribution of any given college to students' scores on these standardized tests of generic skills. …