Academic journal article Psychological Test and Assessment Modeling

A Multilevel Item Response Model for Item Position Effects and Individual Persistence

Article excerpt


The paper presents a multilevel item response model for item position effects. It includes individual differences in the position effect, which we refer to as the persistence of the test-takers. The model is applied to published data from the PISA 2006 science assessment. We analyzed responses to 103 science test items from N = 64,251 students from 10 countries selected to cover a wide range of national performance levels. All effects of interest were analyzed separately for each country. A significant negative effect of item position on performance was found in all countries, and the effect was more prominent in countries with a lower national performance level. The individual differences in persistence were relatively small in all countries, but more pronounced in countries with lower performance levels. Students' performance level is practically uncorrelated with persistence in high-performing countries, while it is negatively correlated in low-performing countries.

Key words: item response theory, item position effects


In standardized assessments, performance of test-takers on test items can be affected by the position of the items within the test. Within long tests, performance may decrease due to fatigue or declining motivation. Performance may also increase due to learning or practice effects during the test. Regardless of the direction, these item position effects violate assumptions made in most item response theory (IRT) measurement models, since the probability for correct responses usually is assumed to depend only on properties of items and persons, which are assumed to be independent from presentation conditions and item context. Therefore, the examination of item position effects and, if necessary, their inclusion in an appropriate measurement model is an advisable undertaking from a psychometric viewpoint.

An examination of item position effects is also interesting from an applied perspective. For example, if negative item position effects on performance are known, the maximum test length that can be administered to test-takers without overly impairing the assessed performance can be determined. More importantly, if test scores are described with reference to construct maps which are based on item difficulties (e.g., Wilson, 2005), effects of the item position should be separated from differences in item difficulties due to item content.

Effects of item position have been investigated repeatedly, with mixed results. For example, Whitely and Dawis (1976) found different difficulties for items presented at different positions within different test forms. Kingston and Dorans (1982, 1984) found positive position effects for some sections of the Graduate Record Examinations (GRE) aptitude test, which they attribute to a practice effect whose occurrence depends on the item type. Meyers, Miller, and Way (2009) found that changes in item positions between test forms are positively related to changes in item difficulty. Davis and Ferdous (2005) found fatigue effects for reading and math tests in grades 3 and 5. Hohensinn et al. (2008) found a small fatigue effect in a 4th-grade math assessment, which was, however, not replicated in a follow-up study (Hohensinn, Kubinger, Reif, Schleicher, & Khorramdel, 2011). Schweizer, Schreiner, and Gold (2009) and Schweizer, Troche, and Rammsayer (2010) found substantial individual differences in learning effects within the Advanced Progressive Matrices (APM).

Within this paper, we will present an IRT model that can account for different possible item position effects. It is based on the logistic Rasch model for dichotomous responses (with logit(y) = e^y / (1 + e^y)):

P(X_pi = 1) = logit(θ_p − β_i)   (1)

Here, P(X_pi = 1) is the probability of person p answering item i correctly, θ_p is the ability of person p, and β_i is the difficulty of item i. In the following sections, we will briefly outline general approaches to modeling item position effects from the item side as well as from an individual differences perspective. …
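As an illustration (not part of the original paper), Equation (1) can be sketched in a few lines of Python. The function name `rasch_prob` and the example values are our own; the formula itself is the dichotomous Rasch model stated above.

```python
import math

def rasch_prob(theta: float, beta: float) -> float:
    """Probability of a correct response under the dichotomous Rasch model.

    theta: ability of person p (theta_p)
    beta:  difficulty of item i (beta_i)
    Applies the logistic function logit(y) = e^y / (1 + e^y) from Eq. (1).
    """
    y = theta - beta
    return math.exp(y) / (1.0 + math.exp(y))

# A person whose ability equals the item difficulty answers correctly
# with probability 0.5; higher ability raises the probability.
print(round(rasch_prob(0.0, 0.0), 3))  # 0.5
print(round(rasch_prob(1.0, 0.0), 3))  # 0.731
```

Note that the model as written depends only on θ_p and β_i; this is precisely the assumption that item position effects violate, which motivates the extensions discussed in the following sections.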
