The Record of Aggregate Testing
Progress in the debate over rational deterrence theory, in both its conventional and nuclear variants, has always depended on the ability of scholars to identify a body of evidence appropriate for testing a wide range of propositions derived from the theory. Unfortunately, notwithstanding the tremendous amount of time and energy spent on producing a suitable list of cases, and notwithstanding several noteworthy surveys of the literature (Jervis 1979; Jervis, Lebow, and Stein 1985; Levy 1988; Pages 1991; Huth, Gelpi, and Bennett 1993; Harvey and James 1992; Harvey 1995), cumulative knowledge about deterrence, both as a theory and as a strategy, remains elusive. It is still unclear whether decision makers have acted according to the logic derived from standard applications of the theory. Moreover, the most prominent aggregate testing strategy, originally designed by Huth and Russett (1984, 1988, 1990) and later criticised and revised by Lebow and Stein (1987, 1989a, 1989b, 1990), continues to be plagued by ongoing debates over methods and case listings. Lingering divisions over the coding of deterrence successes and failures have become counterproductive, primarily because each side can produce evidence to support its own interpretation of events (Harvey 1995). Although debates over the accuracy of historical accounts have been constructive, very little effort has been directed towards developing alternative testing strategies that lie outside the success/failure framework or towards examining a wider range of propositions derived from the theory.
This chapter unfolds in four stages. First, major problems common to the aggregate research program on nuclear and conventional deterrence are described. Next, criteria of reliability and validity are used