A Short History of Performance Assessment: Lessons Learned
Madaus, George F., O'Dwyer, Laura M., Phi Delta Kappan
The authors offer a historical perspective on performance assessment to help arm people "against surrendering to the panaceas peddled by too many myth makers."
THE PAST EIGHT years have witnessed a sea change in the field of educational measurement. In much of the popular and professional literature, standardized multiple-choice testing is out. Performance assessment (a.k.a. "authentic" or "new" assessment, or the "3 P's": performance, portfolios, and products) is in. Performance assessment has captured the linguistic high ground, just as the term "minimum competency testing" did in the 1970s. Both slogans have a sensible ring to them, and there is a marvelous litany of claims by devotees of performance assessment that make it difficult to question proponents' assertions.1 Nonetheless, the positive connotations of the words "new" and "authentic" and the beneficial claims made for performance assessment mask its many functions and side effects.
We do not deal directly with these various claims about the benefits of performance assessment but instead offer a historical perspective on performance assessment to help arm people "against surrendering to the panaceas peddled by too many myth makers."2 We show that performance assessment is by no stretch of the imagination a "new" technology. However, the domain of attainments/skills historically assessed by performance testing is different from the domain purportedly assessed by it today.
We begin by placing performance assessment in the context of the uses to which it is put, narrowing our focus to high-stakes uses. Next, we offer a short description of the underlying technology of testing/assessment in order to clarify what we mean by the term performance assessment. We then outline the history of performance testing, dividing it into three periods: premodern (from 210 B.C.E. to 1900 C.E.), modern (1900 to 1980), and postmodern (1980 to the present).3 We argue that changes in assessment technology over the last two centuries (from oral to written, from qualitative to quantitative, from short answer to multiple choice) were all geared toward increasing efficiency and making the assessment system more manageable, standardized, easily administered, objective, reliable, comparable, and inexpensive, particularly as the numbers of examinees increased. We close with an analysis of the role of performance testing today, arguing that historical issues of fairness, efficiency, cost, and infrastructure continue to cast their shadow on contemporary efforts to use performance assessments in large-scale, high-stakes testing programs.
How Performance Assessment Is Used
There seems to be little argument that performance assessments can affect the curriculum; that statement is almost an educational truism. Indeed, the power of an examination to shape what is taught and learned was noted at least as far back as the 16th century. Foreshadowing many contemporary claims about the power of tests, Philip Melancthon, a Protestant German teacher, wrote in De Studiis Adolescentum, "No academical exercise can be more useful than that of examination. It whets the desire for learning, it enhances the solicitude of study while it animates the attention to whatever is taught."4
The context in which performance assessments are used, however, is critical in evaluating their potential and impact. It is one thing to consider performance assessment as an instructional tool in the hands of classroom teachers. It is quite another when the technology is employed as part of a high-stakes testing program.5 And it is the large-scale deployment of performance testing as a high-stakes policy tool to drive reform and make important decisions about individuals, schools, or systems that is the focus of this article. With regard to classroom uses, suffice it to say that performance-based assessments in the hands of teachers might give them valuable information about what to teach and how to teach it and help them to individualize instruction. …