It's no news to any personnel manager that the rigors of applicant testing are coming under closer judicial scrutiny. By asking the test maker key questions and knowing what answers to look for, recruiters can effectively evaluate tests used for employee selection.
1) HOW WAS THE TEST CONSTRUCTED?
The construction of a legally defensible test requires time and expertise. Testing experts have come to recognize certain stages that must occur in the development of tests that work. A test maker should be able to provide evidence that each of these stages has been carried through to completion.
JOB ANALYSIS. The job in question must be analyzed to develop a description of the tasks and skills important to competent performance.
Actual workers, as well as managers and supervisors, should be consulted in writing the job description. Otherwise, the test may be based on an unrealistic job description.
ITEM DEVELOPMENT. Experts in the test's subject matter should work with the testing expert to construct questions that reflect the results of the job analysis. Generally, situational or critical-incident items are better than those that require rote memorization. Such items call on the examinee to demonstrate that he or she has the aptitude, knowledge or work experience to solve problems and to think critically.
PRELIMINARY TESTING. The test needs to be tried and critiqued before it is actually used. People working in the subject area should be asked to take the test, and their opinions should be solicited to identify ambiguities, trivialities and differences of opinion on 'correct' answers and, especially, to judge job relevance. Their answers to each test item and the distribution of scores form the basis for the next stage of test development.
STATISTICAL ITEM ANALYSIS. In a statistical item analysis, the distribution of scores is divided to form an upper and a lower group for comparison purposes.
There must be empirical evidence to show that each item distinguishes between the upper and lower groups. A significantly larger portion of the upper group should choose the alternative keyed as correct. The lower group should be misled by the wrong alternatives, or "distractors," to a greater degree than the upper group. This implies that each distractor must seem plausible.
In an ideal test, the upper group will select only the correct, keyed alternatives and the lower group will divide itself equally among all alternatives.
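As a rough illustration of the upper-versus-lower comparison, the sketch below computes, for a single item, the share of each group choosing the keyed alternative and tallies how each group spread itself across the alternatives. The function names, the sample data and the default 50/50 split are assumptions made for the example, not part of any particular test maker's procedure; some analysts compare only the top and bottom slices of the score distribution instead.

```python
from collections import Counter

def split_groups(total_scores, fraction=0.5):
    """Return (upper, lower) lists of examinee indices ranked by total score.

    fraction is the share of examinees placed in each group; the 0.5
    default is an assumption for this sketch.
    """
    n = len(total_scores)
    k = max(1, int(n * fraction))
    ranked = sorted(range(n), key=lambda i: total_scores[i], reverse=True)
    return ranked[:k], ranked[-k:]

def item_analysis(responses, key, total_scores, fraction=0.5):
    """Compare upper and lower groups on one item.

    responses: the alternative each examinee chose for this item.
    key: the alternative keyed as correct.
    """
    upper, lower = split_groups(total_scores, fraction)
    upper_counts = Counter(responses[i] for i in upper)
    lower_counts = Counter(responses[i] for i in lower)
    p_upper = upper_counts[key] / len(upper)
    p_lower = lower_counts[key] / len(lower)
    return {
        # Positive values mean the upper group chose the key more often.
        "discrimination": p_upper - p_lower,
        "upper_counts": dict(upper_counts),
        "lower_counts": dict(lower_counts),
    }

# Hypothetical data: ten examinees, one four-alternative item keyed "B".
total_scores = [9, 8, 8, 7, 6, 5, 4, 3, 2, 1]
responses = ["B", "B", "B", "B", "A", "B", "A", "C", "D", "C"]
result = item_analysis(responses, "B", total_scores)
print(result)
```

In this made-up item, most of the upper group picks the keyed answer while the lower group scatters across the distractors, which is the pattern the analysis is looking for. A distractor that nobody in the lower group chooses is a candidate for rewriting.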
Contrary to common belief, difficult items don't differentiate efficiently between good and poor performance. The difficulty level should be in the 25-75% correct range. One or two easy questions may be used for warm-up and a few very difficult questions may be used to distinguish among the top candidates.
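The difficulty check can be sketched the same way. The example below computes the proportion of examinees answering each item correctly and flags items that fall outside the 25-75% band. The function names, labels and sample responses are hypothetical, and a flagged item is not necessarily a bad one, since a few easy warm-up items and a few very hard ones may be kept on purpose.

```python
def difficulty(responses, key):
    """Proportion of examinees who chose the keyed alternative."""
    return sum(1 for r in responses if r == key) / len(responses)

def flag_items(items, low=0.25, high=0.75):
    """Label each item by where its proportion correct falls.

    items: dict mapping an item name to (responses, key).
    low, high: bounds of the target band (25-75% correct by default).
    """
    report = {}
    for name, (responses, key) in items.items():
        p = difficulty(responses, key)
        if p < low:
            label = "hard"
        elif p > high:
            label = "easy"
        else:
            label = "in range"
        report[name] = (p, label)
    return report

# Hypothetical pilot data: ten examinees, three items.
items = {
    "Q1": (list("BBBBABACDC"), "B"),
    "Q2": (list("AAAAAAAAAB"), "A"),
    "Q3": (list("CDDBBADDCB"), "A"),
}
report = flag_items(items)
for name, (p, label) in report.items():
    print(f"{name}: {p:.0%} correct, {label}")
```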
Items are polished until every question meets all necessary criteria: job relevance and importance, clarity and appropriate item statistics. …