# Item Response Theory

A psychometric test measures a respondent's character traits or abilities. Psychometrics, the study of the theory and techniques of such measurement, is concerned with the construction, validity and evaluation of measurement instruments. Psychometricians have developed various theories for designing, analyzing and scoring tests, questionnaires and other measurement instruments.

Classical test theory (CTT), the foundation of measurement theory since the early 20th century, works from the empirical results of tests and measurements. According to CTT, a measurement instrument quantifies behavior or ability as a numerical score. If a student receives a mark of 90 on a test, it is assumed that the individual has 90 percent proficiency in the tested material.

Classical test theory is also known as true-score theory: it assumes that an observed score is the sum of a true score and random error, so a person may receive a low test score partly through simple bad luck.
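
As a brief illustration, the true-score idea is conventionally written as the decomposition below. The symbols $X$ (observed score), $T$ (true score) and $E$ (random error) are the usual textbook notation and are assumed here; the text above does not define them.

```latex
X = T + E
```

For example, under this decomposition a respondent whose true score is 80 and whose error on a given sitting is $-6$ would be observed to score 74.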

CTT makes no predictions at the item level: it does not predict whether an individual will answer a given item correctly. This is a limitation, because developing modern tests requires predicting the probability of each examinee's response to each item.

Item response theory (IRT) is another paradigm for designing, analyzing and scoring measurement instruments. IRT focuses on the probability that a respondent answers a test item correctly or incorrectly. Unlike CTT, which focuses on observed scores, IRT stresses a latent trait or ability that is inferred from a respondent's answers to a measurement instrument. IRT places the difficulty of test items and the ability of the person on the same scale, making it possible to compare the difficulty of an item with the proficiency of a person.

In IRT, this probability is given by a mathematical formula called the item response function (IRF). The IRF gives the probability that a person at a given level of proficiency will answer a specific item correctly. When necessary, the formula takes into account the probability that a respondent will guess the correct answer. For instance, a respondent guessing at random has a 25 percent chance of choosing the correct answer to a multiple-choice question with four options.
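
A minimal sketch of one such item response function follows. It assumes the three-parameter logistic (3PL) form, a common choice that the text above does not name; the function name `irf_3pl` and the parameter names `a` (discrimination), `b` (difficulty) and `c` (guessing floor) are illustrative assumptions.

```python
import math


def irf_3pl(theta, a=1.0, b=0.0, c=0.0):
    """Probability of a correct response under an assumed 3PL logistic model.

    theta: respondent proficiency (the latent trait)
    a:     item discrimination (how sharply the curve rises)
    b:     item difficulty, on the same scale as theta
    c:     guessing floor, e.g. 0.25 for a four-option multiple-choice item
    """
    return c + (1.0 - c) / (1.0 + math.exp(-a * (theta - b)))


# A four-option item of average difficulty, evaluated at three proficiency levels.
for theta in (-2.0, 0.0, 2.0):
    print(theta, round(irf_3pl(theta, a=1.0, b=0.0, c=0.25), 3))
```

With `c=0.25` the curve never drops below 0.25, which matches the four-option guessing example, and a respondent whose proficiency equals the item difficulty lands halfway between that floor and 1.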

Applications of IRT include:

- Item bias analysis. Using the formulas of IRT, it is possible to test each item on a measurement instrument and ask whether it functions equally across groups such as males and females. Educators use item bias analysis to evaluate the validity of their assessments. Psychometricians use it to maintain a bank of test items and questions and to compare the difficulty of different versions of the same exam.

- Equating. IRT provides the justification for using scores on one test to project how a respondent would perform on another test.

- Tailored testing. Using item response theory, a measurer can estimate a true score that is based not on the number of correct answers but on the difficulty of the questions answered. One form of tailored testing is computerized adaptive testing, which adapts to the examinee's level of ability.
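
The tailored-testing idea above can be sketched as follows. This is an illustration under stated assumptions, not a production scoring method: it assumes a one-parameter (Rasch-type) logistic model, estimates ability by a simple grid-search maximum likelihood, and picks the next item by closest difficulty. The names `p_correct`, `estimate_ability` and `next_item` are hypothetical.

```python
import math


def p_correct(theta, b):
    """Assumed Rasch-type probability that proficiency theta answers difficulty b correctly."""
    return 1.0 / (1.0 + math.exp(-(theta - b)))


def estimate_ability(responses, lo=-4.0, hi=4.0, grid_step=0.01):
    """Grid-search maximum-likelihood estimate of theta.

    responses: list of (item_difficulty, answered_correctly) pairs.
    """
    best_theta, best_loglik = lo, -math.inf
    steps = int(round((hi - lo) / grid_step))
    for i in range(steps + 1):
        theta = lo + i * grid_step
        loglik = 0.0
        for b, correct in responses:
            p = p_correct(theta, b)
            loglik += math.log(p if correct else 1.0 - p)
        if loglik > best_loglik:
            best_theta, best_loglik = theta, loglik
    return best_theta


def next_item(theta_hat, item_bank, used):
    """Pick the unused item whose difficulty is closest to the current estimate."""
    candidates = [b for b in item_bank if b not in used]
    return min(candidates, key=lambda b: abs(b - theta_hat))


# Two respondents each answer two of three items correctly,
# but one saw easy items and the other saw hard items.
easy_set = [(-2.0, True), (-1.5, True), (-1.0, False)]
hard_set = [(1.0, True), (1.5, True), (2.0, False)]
theta_easy = estimate_ability(easy_set)
theta_hard = estimate_ability(hard_set)
print(theta_easy, theta_hard)
```

Both respondents have the same number of correct answers, yet the one who answered harder items receives the higher ability estimate. In an adaptive test, `next_item` would be called with the updated estimate after each response.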

Pioneers of the item response theory include:

- Frederic M. Lord (1912–2000), a psychometrician who worked for the Educational Testing Service. His seminal research on item response theory was incorporated into his two important works, Statistical Theories of Mental Test Scores and Applications of Item Response Theory to Practical Testing Problems. He has been termed the "Father of Modern Testing."

- Georg Rasch, a Danish psychometrician. He developed his own theories related to item response theory, which are known as the Rasch models.

- Paul Felix Lazarsfeld, an Austrian sociologist who made great strides in statistical survey analysis.