Academic journal article: School Psychology Review

Posttest Probabilities: An Empirical Demonstration of Their Use in Evaluating the Performance of Universal Screening Measures across Settings

Article excerpt

As part of universal screening for academic concerns, educators collect and interpret data from all students in a school system. The purpose of universal screening is twofold: schools use screening data to (a) assess whether core instruction and curricula are meeting the needs of students and (b) identify individual students who are at risk for later difficulties (Albers & Kettler, 2014). Thus, educators often use screening data to make both group-level and student-level decisions. The ability to formatively assess and modify core instruction is vital for the successful implementation of tiered support systems in schools (Stoiber, 2014). Further, and related to the purpose of this article, the accurate identification of individual students in need of additional support, particularly for academic concerns, is critical for the effective implementation of those same systems (Catts, Petscher, Schatschneider, Bridges, & Mendoza, 2009; Fuchs, Fuchs, & Compton, 2012). If a school system cannot accurately identify which students are in need, or meaningfully assess the effectiveness of core practices, subsequent attempts to prevent or remediate student problems are hindered.

Given the purpose of universal screening for academic problems in schools, screening tests are typically expected to be brief yet yield results that are reliable and predictive of future performance (Kettler, Glover, Albers, & Feeney-Kettler, 2014). Beyond those basic expectations, the tests and procedures for screening in schools differ somewhat across grades and content areas, with little consensus regarding the best type of instrument to use (Mellard, McKnight, & Woods, 2009). For reading, many screening tests have been evaluated by researchers and test developers. The types of behaviors measured by those tests mirror the developmental pattern of reading. Among younger students, there is some evidence that tests of early reading skills are effective at predicting proficiency on high-stakes tests (Clemens, Shapiro, & Thoemmes, 2011; Hintze, Ryan, & Stoner, 2003; January, Ardoin, Christ, Eckert, & White, 2016), but, as students enter third grade, oral reading fluency is frequently used to identify students at risk for failing to meet proficiency on the state test (Hintze & Silberglitt, 2005; Klingbeil, McComas, Burns, & Helman, 2015). There is nonetheless a lack of consensus regarding which type of reading assessment should be adopted for universal screening within multitiered support system frameworks, for younger or older students. For instance, computer adaptive tests are sometimes used with both younger and older students instead of measures of early reading skills or oral reading fluency (January & Ardoin, 2015; Mellard et al., 2009).

Compared with reading, there is even less consensus regarding optimal screening instruments for mathematics. Some evidence supports the utility of brief measures of numeracy skills for younger students (Lembke & Foegen, 2009; Martinez, Missall, Graney, Aricak, & Clarke, 2009; Methe, Hintze, & Floyd, 2008) and of curriculum-based measures for older students (Shapiro, Keller, Lutz, Santoro, & Hintze, 2006; VanDerHeyden, Codding, & Martin, 2017) in predicting future performance on high-stakes tests. As with reading, computer adaptive tests have produced promising diagnostic accuracy evidence (Center on Response to Intervention [CRTI], 2016; Shapiro & Gebhardt, 2012). Across reading and math, there is also increasing evidence that existing data (e.g., prior state test performance) may be repurposed by schools for screening without reductions in diagnostic accuracy (Nelson, Van Norman, & VanDerHeyden, 2016; VanDerHeyden et al., 2017; Vaughn & Fletcher, 2010). Thus, any evaluation of optimal universal screening practices for reading and math should compare diagnostic accuracy outcomes from multiple types of tests or sources of information. …
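To make the statistic named in the title concrete, the following is a minimal worked example. A posttest probability updates a school's pretest probability (the local base rate of difficulty) with a screening test's positive likelihood ratio. The values used here (base rate = .20, sensitivity = .80, specificity = .70) are hypothetical, chosen only to illustrate the arithmetic, and are not drawn from the studies cited above:

\[
\mathrm{LR}^{+} = \frac{\text{sensitivity}}{1 - \text{specificity}}, \qquad
\text{posttest odds} = \frac{\text{base rate}}{1 - \text{base rate}} \times \mathrm{LR}^{+}, \qquad
\text{posttest probability} = \frac{\text{posttest odds}}{1 + \text{posttest odds}}
\]

\[
\mathrm{LR}^{+} = \frac{.80}{1 - .70} \approx 2.67, \qquad
\text{posttest odds} = \frac{.20}{.80} \times 2.67 \approx .67, \qquad
\text{posttest probability} \approx \frac{.67}{1.67} \approx .40
\]

In words, a student flagged as at risk by this hypothetical screener would have roughly a 40% probability of later difficulty, up from the 20% base rate. Because the posttest probability depends on the base rate as well as on sensitivity and specificity, the same screening test can yield quite different posttest probabilities across settings, which is the comparison at issue in this article.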
