Truth in Testing: Measuring Up: What Educational Testing Really Tells Us
by Daniel Koretz. Cambridge, MA: Harvard University Press, 2008, 353 pp.
Paul W. Holland
It has been said that "few wish to be assessed, fewer still wish to assess others, but everyone wants to see the scores." Throughout the world, tests are both extolled and disparaged, but they are very unlikely to go away because they provide information that few other sources can, at least at the same price. Tests were administered more than 5,000 years ago for civil service positions in China, and educational testing has been prevalent across the globe since the beginning of the 20th century. And although objections to testing are almost as common as the tests themselves, there seems to be no end to the ways that tests, for better or worse, affect our lives.
In Measuring Up, Daniel Koretz of Harvard's Graduate School of Education gives a sustained and insightful explanation of testing practices ranging from sensible to senseless. Neither an attack on nor a defense of tests, the book is a balanced, accurate, and jargon-free discussion of how to understand the major issues that arise in educational testing.
This book grew out of courses Koretz teaches to graduate students who do not have strong mathematics backgrounds but who do need to know about testing as an instrument of public policy. He has succeeded admirably in producing a volume that will help such students and others who want to see through the current rhetoric and posturing that surround testing. Moreover, the book has a wealth of useful information for more technically trained readers who may know more about the formulas of psychometrics than about the realities of how testing is used in practice.
The century-old science of testing has a conceptual core that has matured and developed along with the mathematical structures needed to implement these concepts for the many types of educational tests. This book focuses on these key concepts (not the math) and how they should be used to guide informed decisions about the use and misuse of educational tests.
Koretz's solution to the problem of technical jargon is his effective use of familiar nontesting examples to clarify testing concepts and ideas. For example, political polls, which have become ubiquitous, use the results from a small sample of likely voters to predict actual voting outcomes. Most readers will have some feeling about what a poll can do and what is meant by its margin of error. Likewise, a test is a sample of information collected from a test taker that, although incomplete, can be reasonably representative of a larger domain of knowledge. Using the test/poll analogy in a variety of ways, Koretz gives a clear account of how to interpret the reliability and uncertainty that are properly attached to test scores, as well as an explanation of the limitations of the test/poll analogy. Most important, Koretz emphasizes that just as a poll has value only to the extent that it represents the totality of relevant voters, performance on a test has value only to the extent that it accurately represents a test taker's knowledge of a larger domain.
Koretz reminds us that many important and timely principles of testing were summarized by E. F. Lindquist in 1951, but these principles have been forgotten by many current advocates of accountability testing. Lindquist regarded the goals of education as diverse and noted that only some of them were amenable to standardized testing. He also realized that although the standardization of tests is important for the clarity of what is being measured, standardization also limits what can be measured. For this reason, Lindquist warned that test results should be used in conjunction with other, often less-standardized information about students in order to make good educational decisions.
Koretz provides a very clear discussion of the pros and cons of norm-referenced tests, criterion-referenced tests, minimum competency tests, measurement-driven instruction, performance assessment, performance standards, and most of the types of assessments and their rationales that are now part of the testing landscape. …