Academic journal article School Psychology Review

The Generalizability of Externalizing Behavior Composites and Subscale Scores across Time, Rater, and Instrument

Children's externalizing behaviors (e.g., disruptive, aggressive, overactive, and antisocial actions) cause problems for their teachers and their peers in school settings. Although most children exhibit externalizing behaviors occasionally, the frequency and severity of these behaviors vary widely across children (Hinshaw & Lee, 2003; Kamphaus, Huberty, DiStefano, & Petoskey, 1997). Children who display frequent and severe externalizing behaviors may not only experience the negative consequences of these behaviors, such as poor school adjustment and impaired peer relationships, but also demonstrate a variety of associated problems, such as academic underachievement and increased risk of substance use disorders (Barkley, Fischer, Smallish, & Fletcher, 2006; Barriga et al., 2002; Nigg et al., 2006).

Assessments of children's externalizing behaviors may be completed for a number of reasons. They may be conducted to determine the presence of a disorder outlined in the Diagnostic and Statistical Manual of Mental Disorders (4th ed., text revision; DSM-IV-TR; American Psychiatric Association, 2000), to determine whether the child meets eligibility criteria for special education services specified by law (e.g., the Individuals with Disabilities Education Improvement Act, 2004), or to match DSM-IV-TR disorders or behavioral constellations to empirically supported interventions (Kazdin & Weisz, 2003; Kratochwill & Stoiber, 2002). Best practice in behavioral assessment requires practitioners to obtain information using multiple assessment methods completed by multiple sources describing behaviors across multiple settings (McConaughy & Ritter, 2002). School psychologists frequently include behavior rating scales as part of multisource, multimethod assessments because these scales provide a time- and cost-effective way to obtain parent and teacher perceptions of the presence and severity of a child's behaviors in a broad range of problem areas (Kamphaus, Petoskey, & Rowe, 2000; Power & Ikeda, 1996). However, as with most assessment methods, there are basic measurement problems associated with the use of behavior rating scales--namely response bias and error variance (Merrell, 2003). Whereas response bias (or response style) concerns the way that raters approach the task of completing the rating scale (e.g., falling prey to the halo effect or severity effect), error variance stems from the nature of behavior rating scale assessment. Martin, Hooper, and Snow (1986) and Merrell (2003) describe four types of error variance that are particularly applicable to child behavior rating scales: temporal variance, instrument variance, source variance, and setting variance.

Temporal Variance

Temporal variance refers to inconsistencies in behavior ratings over time. Temporal variance may arise from actual changes in a child's behavior (e.g., improvement after intervention) or from inconsistencies in a rater's responses to the same items over time. Thus, variance over time may stem from changes in both the construct (i.e., behavior being rated) and the rater. Based on psychometric standards described in the Standards for Educational and Psychological Testing (American Educational Research Association, American Psychological Association, & National Council on Measurement in Education, 1999) and on criteria described by previous researchers' application of such standards to evaluate behavior rating scales (e.g., Bracken, Keith, & Walker, 1998; Floyd & Bose, 2003), short-term test-retest reliability coefficients are expected to be strong to very strong. (1) Many studies examining temporal variance of externalizing composites and related subscale scores report coefficients between ratings administered from 2 to 4 weeks apart that meet or exceed .90 for composite scores and .80 for subscale scores (DuPaul, Power, McGoey, Ikeda, & Anastopoulos, 1998; Mattison, Gadow, Sprafkin, Nolan, & Schneider, 2003). …
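To illustrate how a short-term test-retest coefficient of the kind discussed above can be obtained, the following minimal sketch computes a Pearson correlation between two administrations of the same rating scale. It is offered only as an illustration: the scores are invented, the function name `pearson_r` is hypothetical, and the studies cited above may have used other estimation procedures.

```python
from math import sqrt


def pearson_r(x, y):
    """Return the Pearson correlation between two equal-length score lists."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two lists of equal length with at least 2 scores")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sums of cross-products and squared deviations from each mean.
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined when one list has no variance")
    return sxy / sqrt(sxx * syy)


# Hypothetical scores for eight children rated by the same rater
# on the same scale at two time points (invented data, not from any study).
time1 = [52, 61, 70, 48, 66, 75, 58, 63]
time2 = [54, 60, 72, 50, 64, 77, 57, 65]

r = pearson_r(time1, time2)
print(f"test-retest r = {r:.2f}")
```

A coefficient computed in this way reflects both sources of temporal variance described above; on its own, it cannot separate actual change in a child's behavior from inconsistency in the rater's responses.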
