Academic journal article Psychological Test and Assessment Modeling

Test Order Effects in an Online Self-Assessment: An Experimental Study

Article excerpt

Computer-based online self-assessments, in which testees can assess their abilities, proficiencies, interests or attitudes without the supervision of trained test administrators, are becoming increasingly popular in educational contexts. When it comes to helping students select the field of study that best matches their interests and abilities, online self-assessments are already an important assessment tool in German-speaking countries (for examples see Hornke, Wosnitza & Bürger, 2013, or Kubinger, Frebort, Khorramdel & Weitensfelder, 2012, 2013). For the design and development of such online test batteries, some recommendations exist regarding the use of certain tests and instruments: Kubinger (2015) suggests using objective personality tests based on experimental-based behavior tasks in addition to personality questionnaires based on self-ratings. He also argues against overly brief test batteries, which might not meet psychometric quality standards. However, little research exists so far on how such online assessments should be put together with regard to the order of the tests they contain. This paper aims to close this gap.

Context effects regarding test order and test length

Although the presentation mode (online vs. traditional testing) might also have an effect, such that online versions may measure partly different constructs (e.g. Buchanan, 2001, 2002), the following study focuses solely on online self-assessments. Previous research on test order effects or effects of test length has usually focused on "traditional" testing situations, and its results are contradictory. While context effects resulting from changes in item order can be regarded as established (e.g. Franke, 1997; Knowles, 1988; Ortner, 2008), results on effects caused by test order are inconsistent. Khorramdel and Frebort (2011) showed in a sample of managers that test order might affect performance on experimental-based behavior tasks, but they found no effects on cognitive ability tests. The authors concluded that test order affects simple rather than complex tasks. However, these results could not be reproduced in a later study, in which no test order effects were found (Schünemann, 2013).

Test order effects might be closely related to test length effects, since users might experience fatigue as test time goes on. However, several studies have found no (e.g. Tulsky & Zhu, 2000) or only small (e.g. Zhu & Tulsky, 2000; Ryan, Glass, Hinds & Brown, 2010) fatigue effects on performance scores. It might seem surprising that a longer test does not necessarily lead to fatigue effects, but this is in line with the findings of Davis (1946), who identified three patterns of performance change over the course of a 70-minute task: stable performance (shown by about three-quarters of the sample), an increase in performance (approx. 17%) and a decrease (less than 8%). A study by Ackerman and Kanfer (2009) confirms that performance does not necessarily decline under long test length conditions: Testing three versions of the SAT, a test of scholastic aptitude widely used in the U.S. (with test lengths between 3.5 and 5.5 hours), they found that performance even increased in the longer versions and was dissociated from subjective fatigue, which did increase with test length. Similar results had been found earlier in a small sample (Liu, Allspach, Feigenbaum, Oh & Burton, 2004), where lengthening the SAT with an essay did not affect performance negatively. Jensen, Berry and Kummer (2013) even showed that a lengthier version of a biology exam resulted in better performance than a shorter version. However, these findings do not allow conclusions for the context of self-assessments. In high-stakes tests such as the SAT, the results have powerful implications for the testees' educational pathway, which does not necessarily apply to self-assessments.

Effort in low stakes testing

One possible explanation for the dissociation between subjective fatigue and performance found by Ackerman and Kanfer (2009) lies in compensation mechanisms: The authors suggested that people noticed their fatigue and therefore put in more effort, an explanation that makes sense in high-stakes testing situations. …
