Reliability Generalization: An HLM Approach

By Wang, Jianjun | Journal of Instructional Psychology, September 2002

Hierarchical data structures have been identified in educational and psychological measurement, and statistical approaches have been developed to partition score variance across the levels of the data hierarchy. On the basis of classical test theory and the current hierarchical linear modeling (HLM) literature, a reliability index is constructed for unconditional and conditional models to facilitate generalization of reliability computation across test settings.


Reliability is an important index in educational and psychological measurement. According to a joint committee of the American Educational Research Association, the American Psychological Association, and the National Council on Measurement in Education (1985), "Reliability refers to the degree to which test scores are free from errors of measurement" (p. 19). Because a decrease in measurement error is often associated with an increase in measurement consistency across circumstances, "reliability generalization may provide an important tool for characterizing score equality" (Vacha-Haase, 1998, p. 16). The purpose of this study is to discuss conditional and unconditional hierarchical models that account for measurement errors at different levels and thereby support the generalization of reliability. Because test score reliability depends on many conditions of the test and the subjects, empirical factors need to be introduced at multiple levels to describe these conditions and to facilitate generalization of reliability computation across settings.

Literature Review

Classical test theory is one of the cornerstones of educational and psychological measurement (Lord & Novick, 1968; Pedhazur & Schmelkin, 1991). Pedhazur and Schmelkin (1991) observed: "Since it was proposed by Spearman (1904), the true-score model, or what has come to be known as classical test theory, has been the dominant theory guiding estimation of reliability" (p. 83). Specifically, Novick, Jackson, and Thayer (1971) elaborated,

   In the classical test theory model, the observed score X on a person is
   taken to have expectation $\tau$, the true score for that person. The error
   score is defined by $e = x - \tau$. The corresponding random variables
   defined over persons are related by the equation
   (1.1) $X = T + E$
   with $\mathcal{E}(E \mid \tau) = 0$ (p. 261)

Regarding the computation of reliability, Novick, Jackson, and Thayer (1971) added,

   The reliability (intraclass correlation) of a test is defined as
   (1.3) $\rho^2_{XT} = \sigma^2_T / \sigma^2_X = \sigma^2_T / (\sigma^2_T + \sigma^2_E) = \rho_{XX'}$
   where X and X' are parallel measurements. (p. 261)
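
The two readings of reliability in this definition, the ratio of true-score variance to observed-score variance and the correlation between parallel measurements, can be checked numerically. The following is a minimal sketch in Python, not part of the original article. It assumes normally distributed true scores and errors, and the function name, sample size, and standard deviations are hypothetical choices made only for illustration.

```python
import random
import statistics


def simulate_reliability(n_persons=20000, sd_true=10.0, sd_error=5.0, seed=0):
    """Simulate X = T + E and a parallel form X' = T + E' (assumed normal)."""
    rng = random.Random(seed)
    # True scores T, one per person (the mean of 50 is an arbitrary assumption).
    true_scores = [rng.gauss(50.0, sd_true) for _ in range(n_persons)]
    # Two parallel measurements share T but have independent errors.
    x = [t + rng.gauss(0.0, sd_error) for t in true_scores]
    x_parallel = [t + rng.gauss(0.0, sd_error) for t in true_scores]

    # Reliability as a variance ratio: var(T) / var(X).
    var_ratio = statistics.pvariance(true_scores) / statistics.pvariance(x)

    # Reliability as the Pearson correlation between X and X'.
    mean_x = statistics.fmean(x)
    mean_xp = statistics.fmean(x_parallel)
    cov = sum((a - mean_x) * (b - mean_xp)
              for a, b in zip(x, x_parallel)) / n_persons
    corr = cov / (statistics.pstdev(x) * statistics.pstdev(x_parallel))
    return var_ratio, corr


if __name__ == "__main__":
    ratio, corr = simulate_reliability()
    theoretical = 10.0 ** 2 / (10.0 ** 2 + 5.0 ** 2)
    print(f"theoretical reliability: {theoretical:.3f}")
    print(f"var(T) / var(X):         {ratio:.3f}")
    print(f"corr(X, X'):             {corr:.3f}")
```

Under these assumed values the theoretical reliability is 100 / (100 + 25) = 0.8, and both sample estimates should fall close to it. In practice T is unobserved, so only the parallel-forms correlation can be computed from real data.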

In a test containing multiple items, student responses to each test item can be treated as an indicator of the true score. Thus, the responses to a set of items comprise multiple indicators of individual performance. The data structure is hierarchical because the item responses are nested within each student. In addition, factors at the student level can be employed to reflect different test conditions, such as differences in student demographics, past experiences, and the instructional coverage of the test content. Hence, consideration of multilevel factors is essential to a proper generalization of reliability assessment across learning and testing environments.
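
The nesting of item responses within students can be illustrated with a simple variance partition. The sketch below is not the author's HLM estimator. It substitutes a balanced one-way random-effects ANOVA decomposition, assumes continuous item scores with normal student effects and item-level errors, and uses hypothetical names and parameter values throughout.

```python
import random
import statistics


def nested_reliability(n_students=2000, n_items=10,
                       sd_student=1.0, sd_item=2.0, seed=1):
    """Partition score variance for items nested within students (a sketch)."""
    rng = random.Random(seed)
    scores = []
    for _ in range(n_students):
        student_effect = rng.gauss(0.0, sd_student)  # student-level deviation
        scores.append([student_effect + rng.gauss(0.0, sd_item)
                       for _ in range(n_items)])     # item-level responses

    student_means = [statistics.fmean(row) for row in scores]
    grand_mean = statistics.fmean(student_means)     # valid for balanced data

    # Within-student (item-level) mean square.
    ss_within = sum((y - m) ** 2
                    for row, m in zip(scores, student_means) for y in row)
    ms_within = ss_within / (n_students * (n_items - 1))

    # Between-student mean square.
    ms_between = (n_items * sum((m - grand_mean) ** 2 for m in student_means)
                  / (n_students - 1))

    # Method-of-moments variance components, truncated at zero.
    var_within = ms_within
    var_between = max((ms_between - ms_within) / n_items, 0.0)

    # Intraclass correlation: reliability of a single item response.
    icc = var_between / (var_between + var_within)
    # Reliability of the student mean over n_items responses.
    mean_reliability = var_between / (var_between + var_within / n_items)
    return icc, mean_reliability


if __name__ == "__main__":
    icc, rel = nested_reliability()
    print(f"intraclass correlation:      {icc:.3f}")
    print(f"reliability of student mean: {rel:.3f}")
```

With the assumed variances of 1 (between students) and 4 (within students) and 10 items per student, the population values are 1 / 5 = 0.2 for the intraclass correlation and 1 / 1.4, about 0.71, for the reliability of the student mean. Student-level covariates of the kind described above would enter as predictors in a conditional model, which this unconditional sketch does not attempt.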

Vacha-Haase (1998) searched the PsycINFO database for articles published from 1984 to July 1997 and conducted a meta-analysis on issues of reliability generalization. She noted,

   Of the articles reviewed for the present study, 65.76% made absolutely no 
   reference to reliability. At the other extreme, authors of only 13.06% of 
   the articles reported reliability coefficients for the data analyzed in the 
   respective studies. 
