State and federal law assume that the quality of public education can be gauged by the number of students who reach the "proficiency" mark on a standardized test. Indeed, the federal No Child Left Behind (NCLB) law provides serious penalties for schools that fail to make sufficient annual gains in these numbers. It is a terribly misguided policy.
But the problem is not, as some critics argue, that all tests are invalid. Standardized tests can do a good job of indicating, though not with perfect certainty, whether students have mastered basic skills, can identify facts they should know or can apply formulas they have memorized. Such tests have a place in evaluating schools, as they do in evaluating students. However, they are of little use in assessing creativity, insight, reasoning and the application of skills to unrehearsed situations--each an important part of what a high-quality school should be teaching. Such things can be assessed, but not easily and not in a standardized fashion.
To judge schools exclusively by their test results is, therefore, to miss much of what matters in education. Relying on proficiency benchmarks makes things even worse. NCLB requires that every public-school child in grades three through eight be tested annually in reading and math (and within a few years, periodically in science). The law requires every school to report the percentage of students at each grade level who achieve proficiency and, separately, the percentage of each racial and ethnic minority group and the percentage of low-income children who achieve it. If every grade and subgroup does not make steady progress toward the national goal--proficiency for all students in each subject by 2014--the penalties kick in.
But what exactly is "proficiency"? The new testing law models its definition on the one used by the National Assessment of Educational Progress (NAEP), a set of federal exams in a variety of subjects given to a sample of students nationwide. The NAEP tests such a broad span of skills that each test-taker can be asked only a small share of its questions, and the test results must be aggregated to generate average performance numbers. The NAEP then describes these group averages as either "below basic," "basic," "proficient" or "advanced." Panels of citizens decide where the lines between those categories should be drawn.
Proficiency, in other words, is not an objective fact but a subjective judgment. And the NAEP judgments have not been very credible. The NAEP finds, for example, that only 32 percent of eighth graders are proficient in reading, and only 29 percent are proficient in math--seemingly a national calamity. But international tests show that no country in the world has high proportions of its students close to proficiency as defined by the NAEP. If most students in the United States or elsewhere in the world have never been proficient in this sense, how meaningful is it that fewer than a third of American students are now meeting this target?
In 1993, shortly after the federal government first began reporting scores in terms of proficiency, the General Accounting Office (GAO) charged that the government had adopted this method for political reasons--to send a dire message about school achievement--notwithstanding its questionable technical validity. Confirming the GAO's conclusions, a National Academy of Education report found that the NAEP's definitions of achievement levels were "fundamentally flawed" and "subject to large biases," and that U.S. students had been condemned as deficient using "unreasonably high" standards. A National Academy of Sciences panel rendered a similar judgment.
Nevertheless, under the new federal law, each state must now set its own proficiency standards, and the states are using methodologies similar to the NAEP's. The consequences have often been ludicrous. New York state had to cancel the results of its high-school math exam when only 37 percent of test-takers passed, down from 61 percent the previous year when the curriculum and instructional methods were similar and proficiency was supposed to be defined in the same way. …