The Verbal Section of the Present Scholastic Aptitude Test
The two preceding chapters have contained the complete results of all items in two tests which will not be used again in the scholastic aptitude test -- synonyms because of their too close association with antonyms, and the logical inference material because of certain structural defects. The present chapter summarizes the results of the three verbal tests now being used -- antonyms, double definitions, and paragraph reading. As the costs of selecting test items by the bi-serial r technique are high and the yield of valid items from this procedure is low, it is not now advisable to publish the items which will be used in the future. The necessity of hoarding items which work well constitutes a serious limitation to this chapter, but the reader may gain some insight into these tests by the study of certain items previously published as well as other items which have proved unsatisfactory.
The practice booklet reproduced in Chapter II gives the instructions and ten sample items of the antonym test. Data showing the stability of items of this test were included in the 1928 Report1 and support the conclusion previously emphasized that the single test items are inherently reliable and afford the original data for study. The 1929 Report2 contained a study of items of the antonym type tested by correlation with an academic criterion and showed that the distribution of answers was similar to that obtained when the total test score was used as the criterion.
The antonym tests used in Forms A, B, and C included fifty items given under ten, eight, and ten-minute time limits, respectively. In Forms D, E, and F one hundred items were included in this test and the time limit was extended to twenty-five minutes. Reliability estimates obtained by correlating the odd-numbered and even-numbered items of this test and applying the Spearman-Brown formula, 2r/(1 + r), have rather consistently been around .94, but it is felt that this estimate is too high. The correlations between test scores of individuals who have taken two different forms of the test a year apart, however, have been around .90, which indicates that this test material has a high intrinsic reliability. This reliability is due in no small measure to the care used in determining the validity of each item before inclusion in a test, and to the arrangement of items in ten levels of difficulty as shown in Table 143.
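The odd-even reliability estimate described above can be illustrated with a short Python sketch. It is not the procedure actually used for the report, only a minimal modern restatement under stated assumptions: each row of `responses` is one candidate's item scores (1 for right, 0 for wrong), the half-test scores are simple sums over the odd-numbered and even-numbered items, and the function names `pearson_r`, `spearman_brown`, and `split_half_reliability` are hypothetical labels introduced here for illustration.

```python
from math import sqrt


def pearson_r(x, y):
    """Product-moment correlation between two equal-length score lists."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two lists of equal length, at least 2 scores each")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined when a half-test has no variance")
    return sxy / sqrt(sxx * syy)


def spearman_brown(r_half):
    """Step a half-test correlation up to full test length: 2r / (1 + r)."""
    if r_half <= -1.0:
        raise ValueError("r must be greater than -1")
    return 2.0 * r_half / (1.0 + r_half)


def split_half_reliability(responses):
    """Odd-even reliability estimate for a table of item scores.

    responses: one row per candidate, one column per item (assumed layout).
    """
    odd_scores = [sum(row[0::2]) for row in responses]   # items 1, 3, 5, ...
    even_scores = [sum(row[1::2]) for row in responses]  # items 2, 4, 6, ...
    return spearman_brown(pearson_r(odd_scores, even_scores))
```

Because the odd and even halves are taken at the same sitting and under the same time limit, an estimate of this kind shares whatever affects both halves alike, which is consistent with the text's caution that the figure of .94 may be too high relative to the correlation between different forms taken a year apart.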
Tests of the general type in which the individual tested is required to find a pair of opposites have appeared in many different collections of tests and have uniformly given high correlations with various criteria. Perhaps no test has accumulated more honorary degrees from correlation plots. The University of London has even given this test a g which