Academic journal article Psychological Test and Assessment Modeling

Nonignorable Data in IRT Models: Polytomous Responses and Response Propensity Models with Covariates

Article excerpt


Missing data are always a source of concern for statistical analyses. They raise the level of complexity of statistical inference. Many researchers, methodologists, and software developers resort to editing the data, although ad hoc edits may do more harm than good by producing results that are substantially biased, inefficient, and unreliable (Schafer & Graham, 2002). One way to address the bias in parameter estimates is to identify the variables that explain the cause of the missing data. Below, these explanatory variables will be called "mechanism" or "process" variables. By including a model for this missing data mechanism in the estimation, we can reduce or eliminate the bias in parameter estimates.

Theoretically, if all the process variables associated with a particular piece of missing data can be identified and modeled accurately as controls, the impact of the missing data can be statistically adjusted to the point where it is ignorable (Little & Rubin, 1987). In practice, it is difficult to identify these process variables for all cases of missing data. However, if a given data set contains missing observations, the mechanism causing this missingness can be classified by its type of randomness (Rubin, 1976) as missing completely at random (MCAR), missing at random (MAR), or not missing at random (NONMAR).
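The practical difference between the three mechanisms can be illustrated with a small simulation. The following is a minimal sketch, not the authors' method: the function name, the single logistic (Rasch-type) item, the binary covariate y, and all response probabilities are hypothetical choices made for illustration only. It compares the proportion of correct responses in the complete data with the proportion among the observed responses, both unadjusted and adjusted by stratifying on the fully observed covariate.

```python
import math
import random


def simulate_missingness(n_persons=20000, seed=1):
    """Simulate one item under MCAR, MAR and NONMAR missingness (illustrative values)."""
    rng = random.Random(seed)
    full = []  # (y, x) for every person, as if nothing were missing
    obs = {"MCAR": [], "MAR": [], "NONMAR": []}

    for _ in range(n_persons):
        ability = rng.gauss(0.0, 1.0)                      # latent trait
        y = 1 if ability + rng.gauss(0.0, 1.0) > 0 else 0  # observed covariate
        p_correct = 1.0 / (1.0 + math.exp(-ability))       # item difficulty 0
        x = 1 if rng.random() < p_correct else 0           # item response
        full.append((y, x))

        # MCAR: the response probability is a constant.
        if rng.random() < 0.7:
            obs["MCAR"].append((y, x))
        # MAR: the response probability depends only on the observed covariate y.
        if rng.random() < (0.9 if y == 1 else 0.5):
            obs["MAR"].append((y, x))
        # NONMAR: the response probability depends on the (possibly missing) x itself.
        if rng.random() < (0.9 if x == 1 else 0.5):
            obs["NONMAR"].append((y, x))

    def naive_mean(sample):
        return sum(x for _, x in sample) / len(sample)

    # y is observed for everyone, so its distribution is known from the full sample.
    share_y1 = sum(y for y, _ in full) / len(full)
    weights = {0: 1.0 - share_y1, 1: share_y1}

    def stratified_mean(sample):
        total = 0.0
        for level, weight in weights.items():
            xs = [x for y, x in sample if y == level]
            total += weight * sum(xs) / len(xs)
        return total

    return {
        "full_mean": naive_mean(full),
        "naive": {name: naive_mean(s) for name, s in obs.items()},
        "adjusted": {name: stratified_mean(s) for name, s in obs.items()},
    }


if __name__ == "__main__":
    result = simulate_missingness()
    print("complete data:", round(result["full_mean"], 3))
    for name in ("MCAR", "MAR", "NONMAR"):
        print(name, round(result["naive"][name], 3), round(result["adjusted"][name], 3))
```

Under these assumptions, the unadjusted estimate is close to the complete-data value under MCAR, conditioning on y removes the bias under MAR, and under NONMAR the estimate remains biased even after conditioning on y, because the missingness depends on the unobserved response itself.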

In this article, we focus on responses of persons to items and on item nonresponse. Suppose θ and ζ are the parameters of the observed data and the missing data process, respectively, and D is the missing data indicator with elements dik = 1 if a realization xik was observed and dik = 0 if xik was missing for person i and item k. Following Rubin's definition, missing data are MAR if the probability of D given the observed data xobs, the missing data xmis, and the observed covariates y does not depend on the missing data xmis, that is, if

p(D|xobs,xmis,ζ,y) = p(D|xobs,ζ,y). (1)

Furthermore, the parameters θ and ζ are distinct if there are no functional dependencies between them, that is, no restrictions on the joint parameter space (frequentist version), or if the prior distributions of ζ and θ are independent (Bayesian case). (It should be noted that this is a somewhat rough definition; for technical details, refer to Rubin (1976), Heitjan (1994, 1997), Heitjan and Rubin (1991), and Jaeger (2005).) If MAR and distinctness hold, the missing data are said to be ignorable; otherwise, the missing data are nonignorable. If ignorability holds, we do not have to take the distribution of D and ζ into account, and the consistency of the estimates is not threatened by the occurrence of the missing data.
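The reason why D and ζ can be disregarded under ignorability can be sketched with the standard factorization argument. The derivation below uses the notation of the text; writing the data model as p(xobs, xmis | θ, y) is an assumption made here for illustration, and for discrete item responses the integral is to be read as a sum over the possible values of xmis.

```latex
\begin{align*}
p(x_{obs}, D \mid \theta, \zeta, y)
  &= \int p(x_{obs}, x_{mis} \mid \theta, y)\,
          p(D \mid x_{obs}, x_{mis}, \zeta, y)\, dx_{mis} \\
  &= p(D \mid x_{obs}, \zeta, y)
     \int p(x_{obs}, x_{mis} \mid \theta, y)\, dx_{mis}
     && \text{(by MAR)} \\
  &= p(D \mid x_{obs}, \zeta, y)\; p(x_{obs} \mid \theta, y).
\end{align*}
```

The first factor does not involve θ. If θ and ζ are also distinct, it carries no information about θ, so likelihood-based or Bayesian inference on θ can rest on p(xobs | θ, y) alone.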

In the framework of IRT, missing data can be split into four types (Lord, 1974). The first consists of missing observations that result from a priori fixed incomplete test administration and calibration designs. In this case, the missing data are a priori fixed and ignorability trivially holds. That is, p(D|xobs,xmis,ζ,y)=p(D|xobs,ζ,y)=1. The second type consists of classes of response-contingent designs such as two-stage and multistage testing designs and computerized adaptive testing (Lord, 1980). These designs produce ignorable missing data, because the design variables D are completely determined by the observed responses (see, for instance, Mislevy & Wu, 1996). The third type is ignorable missing data that result from unscalable responses, such as items missing from booklets and responses such as "do not know" or "not applicable". Missing pages can reasonably be viewed as missing at random; "do not know" or "not applicable" responses are already suspect and might fall into the next category of missing data. The reason is that it is not automatically clear whether the given response ("do not know" or "not applicable") is accurate or an instance of avoidance behavior. The fourth and last type of missing data results from a nonignorable missing data mechanism. Such data will, for instance, occur when low-ability respondents fail to give responses to specific items as a result of discomfort or embarrassment. …
