MEASURING CONCEPTS: RELIABILITY
Chapter 1 discussed the inevitable imperfection of any single measure as an operationalization of a theoretical concept and the consequent need for multiple operationism. The translation between an abstract theoretical conception and its operational realization is always incomplete. Even so, although all translations (i.e., measures) are imperfect, individual measures vary in the adequacy with which they characterize the underlying conceptual variable of interest. Some measures come closer than others to representing the true value of the concept, in part because they are less susceptible to sources of systematic error or random fluctuation. The quality of a given measure is expressed in terms of its reliability and validity. Briefly, reliability is the consistency with which a measure assesses a given concept; validity refers to the degree of relationship, or the overlap, between an instrument and the construct it is intended to measure.
The concept of reliability derives from classical measurement theory, which assumes that the score obtained on any single measurement occasion represents a combination of the true score of the object being measured and random errors that lead to fluctuations in the measure obtained on the same object at different occasions (Gulliksen, 1950). The standard classical test theory formula, modified slightly to fit with our particular position on reliability, is expressed as follows:
$$O = T + \sum e_{r+s}$$
where $O$ = observed score (for example, a score on a math test, a behavioral checklist, or an attitude scale), $T$ = true score, and $\sum e_{r+s}$ = the sum of random and systematic errors that combine with the true score to produce the observed score. The standard formula usually lists only random error; it does not take account of systematic error, or combines it with