Academic journal article The Spanish Journal of Psychology

Varying the Valuating Function and the Presentable Bank in Computerized Adaptive Testing

Article excerpt

In computerized adaptive testing, the most commonly used valuating function is the Fisher information function. When the goal is to maximize item bank security, the valuating function that seems most convenient is the matching criterion, which valuates the distance between the estimated trait level and the point where an item's information function reaches its maximum. Recently, it has been proposed that the valuating function need not be kept constant across all the items in the test. In this study we expand the idea of combining the matching criterion with the Fisher information function. We also manipulate the number of strata into which the bank is divided. We find that manipulating the number of items administered with each function makes it possible to move from the pole of high accuracy and low security to the opposite pole. Selecting several items with the matching criterion makes it possible to greatly improve item bank security with only small losses in accuracy. In general, it seems more appropriate not to stratify the bank.

Keywords: computerized adaptive testing, item selection rule, item bank security, overlap rate.

En los tests adaptativos informatizados, la función de valoración más comúnmente empleada es la función de información de Fisher. Cuando el objetivo es mantener al máximo la seguridad del banco de ítems, la función de valoración que parece más adecuada es el criterio de proximidad, con el que se valora la distancia entre el nivel de rasgo estimado y el punto donde es máxima la información proporcionada por un ítem. Recientemente, se ha propuesto no mantener la misma regla de valoración constante a lo largo de todo el test. En este estudio, expandimos la idea de combinar el criterio de proximidad con la función de información de Fisher. También manipulamos el número de estratos en los que se divide el banco. Encontramos que la manipulación del número de ítems administrados con cada función hace posible moverse desde el extremo de alta precisión y baja seguridad hasta el extremo opuesto. La selección de varios ítems con el criterio de proximidad hace posible mejorar en gran medida la seguridad del banco con pérdidas escasas en precisión. En general, parece más adecuado no estratificar el banco.

Palabras clave: tests adaptativos informatizados, regla de selección de ítems, seguridad del banco de ítems, tasa de solapamiento.

(ProQuest: ... denotes formulae omitted.)

The lower costs and higher calculation speed of computers have popularized computerized adaptive testing (CAT) as a technique for assessing educational or psychological content (van der Linden & Glas, 2010). Compared with a paper-and-pencil test, a CAT allows faster and/or more accurate estimation of the examinees' trait level.

The item selection process in a CAT seeks to satisfy at least two objectives. The first is measurement accuracy. In simulation studies, this objective is commonly measured with the root mean squared error (RMSE):

$$\mathrm{RMSE} = \sqrt{\frac{1}{r}\sum_{g=1}^{r}\left(\hat{\theta}_g - \theta_g\right)^2} \qquad (1)$$

where r is the number of examinees, θg is the true trait level of the g-th examinee, and θ̂g is the estimated trait level for that examinee.
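As an illustration, the RMSE described above can be computed in a simulation study roughly as follows. This is a minimal sketch that assumes the standard definition of RMSE (the square root of the mean squared difference between estimated and true trait levels); the function name `rmse` and the example values are hypothetical and are not taken from the article.

```python
import math


def rmse(true_thetas, estimated_thetas):
    """Root mean squared error between true and estimated trait levels.

    Assumes the standard RMSE definition; both sequences must hold one
    value per simulated examinee, in the same order.
    """
    if len(true_thetas) != len(estimated_thetas):
        raise ValueError("both sequences must have one value per examinee")
    if not true_thetas:
        raise ValueError("at least one examinee is required")
    r = len(true_thetas)  # number of examinees
    squared_errors = [
        (estimate - true) ** 2
        for true, estimate in zip(true_thetas, estimated_thetas)
    ]
    return math.sqrt(sum(squared_errors) / r)


# Hypothetical data: four simulated examinees.
true_thetas = [-1.0, 0.0, 1.0, 2.0]
estimated_thetas = [-0.5, 0.0, 1.5, 2.0]
print(rmse(true_thetas, estimated_thetas))
```

Lower values indicate more accurate trait estimation, so an item selection rule that yields a smaller RMSE under the same simulated conditions satisfies the first objective better.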

The second objective is to maximize item bank security. A CAT allows greater flexibility in test scheduling: two examinees can be evaluated at different times with a totally or partially identical item bank. If the first examinee tells the second about the items he received, the second could answer correctly not because of his trait level but because the bank's content has leaked, which would lead to over-estimation of his trait level (H. H. Chang, 2004). The greater the proportion of items presented to both examinees, the greater this risk. Overlap rate, defined as the mean proportion of items shared by two examinees (H. H. Chang & Zhang, 2002; Chen, Ankenman & Spray, 2003), is one of the most commonly used variables for evaluating item bank security. …
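The overlap rate as defined above (the mean proportion of items shared by two examinees) could be computed from simulated test records along these lines. This is a sketch under stated assumptions: every examinee receives a fixed-length test, the mean is taken over all distinct pairs of examinees, and the function name `overlap_rate` and the example item identifiers are hypothetical rather than the article's own procedure.

```python
from itertools import combinations


def overlap_rate(administered_items, test_length):
    """Mean proportion of items shared by two examinees.

    administered_items: one collection of item identifiers per examinee.
    test_length: number of items each examinee receives (assumed fixed).
    """
    tests = [set(items) for items in administered_items]
    if len(tests) < 2:
        raise ValueError("at least two examinees are required")
    pairs = list(combinations(tests, 2))  # every distinct pair of examinees
    shared = sum(len(first & second) for first, second in pairs)
    return shared / (len(pairs) * test_length)


# Hypothetical data: three examinees, four items each.
records = [
    [1, 2, 3, 4],
    [1, 2, 5, 6],
    [1, 7, 8, 9],
]
print(overlap_rate(records, test_length=4))
```

In this toy example the three pairs share 2, 1, and 1 items respectively, so the overlap rate is 4 / (3 × 4) = 1/3. A value closer to 0 indicates better item bank security.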
