Psychological Test and Assessment Modeling

Reliability and Interpretation of Total Scores from Multidimensional Cognitive Measures – Evaluating the GIK 4-6 Using Bifactor Analysis

Article excerpt

In research, latent variable modeling and item response modeling are often used to assess psychological constructs. In applied settings, however, it is common when administering a test battery or questionnaire to assess a construct of interest by summing all item scores or subscale scores to form a total score. The appropriateness of this procedure depends on several factors. Focusing on the domain of cognitive ability tests, this article will outline how to interpret total scores and assess their reliability when the underlying set of items is not unidimensional. The approach based on bifactor analysis (Green & Yang, 2015; Reise, 2012) will then be applied to evaluate the cognitive ability section of the Gifted Identification Kit 4-6 (GIK 4-6; Ziegler & Stoeger, 2016), a test battery newly developed by the Hamdan bin Rashid Al Maktoum Foundation for Distinguished Academic Performance to identify gifted students in the United Arab Emirates. In addition to evaluating the factor structure and reliability of the GIK 4-6, the article has two further goals. First, we would like to clarify reliability estimation and interpretation of total scores from multidimensional measures without using formulas or mathematical details. Second, we want to highlight why the problem is ubiquitous in cognitive ability testing and why bifactor analysis is particularly well suited to address it for cognitive ability tests.

Currently, the Cattell-Horn-Carroll theory (CHC theory) is the most widely accepted model describing the structure of human cognitive abilities (McGrew, 2009). The model is based on hundreds of factor analyses of cognitive ability tests. Several correlated group factors were found to characterize the broad scope of human cognitive abilities. Because the group factors are correlated, a second-order g is usually also postulated. Accordingly, cognitive ability test batteries typically consist of different subscales that either correspond to group factors of the CHC theory or are a blend of some of these factors. The subscales are normally summed to a total score; given correlated subscales, a certain percentage of the reliable variance of the total score should therefore be due to the group factors and another percentage due to g (Brunner & Süß, 2005). When using cognitive ability test batteries, the focus is often on the total score, which is then transformed into an IQ score. This raises two questions: first, how should the reliability of such a total score be calculated, and second, how should the total score be interpreted?
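
To give a rough sense of this decomposition (the numbers are not taken from the article), the following Python/NumPy sketch assumes made-up standardized loadings for six subtests on a general factor and two orthogonal group factors and reports how the variance of the unit-weighted total score splits into parts due to g, the group factors, and unique (unreliable) variance.

```python
import numpy as np

# Hypothetical standardized loadings for six subtests on a general factor g and
# two orthogonal group factors (illustrative values only, not GIK 4-6 estimates).
g  = np.array([0.60, 0.60, 0.60, 0.50, 0.50, 0.50])
f1 = np.array([0.40, 0.40, 0.40, 0.00, 0.00, 0.00])
f2 = np.array([0.00, 0.00, 0.00, 0.45, 0.45, 0.45])
unique = 1.0 - g**2 - f1**2 - f2**2          # unique (unreliable) variances

# With orthogonal factors, the variance of the unit-weighted total score
# splits into additive parts:
var_g      = g.sum() ** 2                    # variance due to g
var_groups = f1.sum() ** 2 + f2.sum() ** 2   # variance due to the group factors
var_unique = unique.sum()                    # unreliable variance
total_var  = var_g + var_groups + var_unique

for label, v in [("g", var_g), ("group factors", var_groups), ("unique", var_unique)]:
    print(f"{label:>13}: {100 * v / total_var:5.1f}% of total-score variance")
```

With these illustrative loadings, roughly 63% of the total-score variance is due to g, about 19% to the group factors, and about 18% is unreliable.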

The most commonly used reliability coefficient, although not well understood by many users, is coefficient alpha (Cortina, 1993; Dunn, Baguley, & Brunsden, 2014; Slocum-Gori & Zumbo, 2011). For coefficient alpha to properly assess reliability, essential tau-equivalence has to hold, which includes the assumption of unidimensionality (Green & Yang, 2015; Slocum-Gori & Zumbo, 2011). Even though unidimensionality is, as discussed above, clearly not given in cognitive test batteries, coefficient alpha is commonly reported. A much better option for assessing reliability when unidimensionality does not hold is coefficient omega (McDonald, 1999). Within a factor-analytic framework, coefficient omega estimates the proportion of the total score variance that is true score variance. Depending on which model fits the data best, the true score variance estimate is based either on a single factor or on several factors; coefficient omega is then the proportion of variance of the total score due to all reliable factors (Green & Yang, 2015).
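
To make the contrast between alpha and omega concrete, this minimal Python/NumPy sketch computes both from the model-implied covariance matrix of an assumed bifactor model; the loading values are purely illustrative and are not estimates from the GIK 4-6.

```python
import numpy as np

# Hypothetical bifactor loading matrix for six subtests: column 0 = general
# factor g, columns 1-2 = orthogonal group factors (illustrative values only).
lam = np.array([
    [0.60, 0.40, 0.00],
    [0.60, 0.40, 0.00],
    [0.60, 0.40, 0.00],
    [0.50, 0.00, 0.45],
    [0.50, 0.00, 0.45],
    [0.50, 0.00, 0.45],
])
theta = np.diag(1.0 - (lam ** 2).sum(axis=1))  # unique variances
sigma = lam @ lam.T + theta                    # model-implied covariance matrix

k = lam.shape[0]
total_var = sigma.sum()                        # variance of the unit-weighted total score

# Coefficient alpha, computed directly from the covariance matrix
alpha = k / (k - 1) * (1.0 - np.trace(sigma) / total_var)

# Coefficient omega (total): share of total-score variance due to all common factors
omega_total = (lam.sum(axis=0) ** 2).sum() / total_var

print(f"alpha       = {alpha:.3f}")        # about .78 for these loadings
print(f"omega total = {omega_total:.3f}")  # about .82 for these loadings
```

With these assumed loadings, alpha (about .78) falls short of omega total (about .82) because essential tau-equivalence is violated.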

However, the proportion of reliable variance of a total score alone does not inform us as to how it should be interpreted. Although the set of items or subtests underlying a total score of a cognitive ability test is rarely unidimensional, this is not automatically a problem for a meaningful interpretation. For an easy interpretation, it is sufficient that a total score primarily measures the target construct (Reise, Moore, & Haviland, 2010). …
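
One way to quantify whether a total score primarily measures the target construct is omega hierarchical, the proportion of total-score variance attributable to the general factor alone (Reise, 2012). A minimal sketch, again using the same assumed, illustrative loadings rather than GIK 4-6 values:

```python
import numpy as np

# Hypothetical bifactor loadings as before (illustrative only, not GIK 4-6 values).
lam = np.array([
    [0.60, 0.40, 0.00],
    [0.60, 0.40, 0.00],
    [0.60, 0.40, 0.00],
    [0.50, 0.00, 0.45],
    [0.50, 0.00, 0.45],
    [0.50, 0.00, 0.45],
])
theta = np.diag(1.0 - (lam ** 2).sum(axis=1))
total_var = (lam @ lam.T + theta).sum()

# Omega hierarchical: proportion of total-score variance attributable to the
# general factor alone (column 0). The closer it is to omega total, the more
# the total score can be read as a measure of the general (target) construct.
omega_h = lam[:, 0].sum() ** 2 / total_var
print(f"omega hierarchical = {omega_h:.3f}")  # about .63 for these loadings
```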
