There are diverse schools of thought on how pupils should be measured or evaluated to show academic achievement in the school setting. Teacher written tests have been used for at least 150 years. In the past, there appears to have been a lack of knowledge of basic techniques for measuring and evaluating learner achievement and progress. Presently, university level textbooks contain guidelines for teachers to use in writing test items. Thus, classroom teachers may receive much assistance in writing diverse kinds of test items.
Teacher Written Tests
High quality teacher written tests might well possess the best validity of all tests, since teachers are right in the classroom and can write test items pertaining to what was taught, whereas test writers for state mandated tests are rather far removed from the local classroom. The state writers of test items do not know the children for whom the test was written and thus cannot provide for individual differences among learners who will be taking the test. Validity is a very salient concept in testing pupils in that learners need to have opportunities to achieve what is in the state mandated objectives. State mandated testing can become more valid if the state standards or objectives directly relate to the taught curriculum. The classroom teacher may then align the written test items with the stated objectives in a unit of study.
Test items written by the teacher to cover what was taught might consist of true-false or multiple choice items. If the test items written by the teacher are of high quality, meaning they are very clearly written and no guesswork is involved in their interpretation, high reliability should then also be in the offing. This means that pupils should receive a similar score the second time the same test is taken. With quality reliability, then, the test measures consistently when pupils take the same test twice (test/retest reliability), unless learner fatigue sets in.
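Test/retest reliability is commonly summarized as the correlation between the scores from the two sittings. The following is a minimal sketch, assuming Pearson's correlation coefficient as the statistic (the text does not name one) and using hypothetical scores for six pupils; the function and variable names are illustrative only.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of scores."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two score lists of equal length (at least 2 pupils)")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sum of cross-products and sums of squared deviations from the means.
    cross = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    if ss_x == 0 or ss_y == 0:
        raise ValueError("correlation is undefined when all scores are identical")
    return cross / sqrt(ss_x * ss_y)


# Hypothetical data: the same six pupils take the same test twice.
first_sitting = [12, 15, 9, 18, 14, 11]
second_sitting = [13, 14, 10, 19, 13, 12]

retest_r = pearson_r(first_sitting, second_sitting)
print(f"test/retest correlation: {retest_r:.2f}")
```

A value near 1.0 suggests that pupils kept roughly the same standing on both occasions; a value near 0 suggests the test is not measuring consistently.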
There is also split half reliability, in which the test is taken only once and pupils' correct responses on the odd numbered items are compared with their correct responses on the even numbered items. The resulting statistic is called the split half correlation.
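The odd/even comparison can be sketched as follows. This is an illustration under stated assumptions, not a prescribed procedure: the response data are hypothetical (1 = correct, 0 = incorrect), Pearson's correlation is assumed as the comparison statistic, and the Spearman-Brown adjustment, which the text does not mention, is added because a half-length test tends to understate the reliability of the full test.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of scores."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cross = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return cross / sqrt(ss_x * ss_y)


def split_half(responses):
    """Return (half-test correlation, Spearman-Brown adjusted estimate).

    `responses` holds one row per pupil and one column per item.
    """
    # Items 1, 3, 5, ... sit at list indices 0, 2, 4, ...
    odd_totals = [sum(row[0::2]) for row in responses]
    # Items 2, 4, 6, ... sit at list indices 1, 3, 5, ...
    even_totals = [sum(row[1::2]) for row in responses]
    r_half = pearson_r(odd_totals, even_totals)
    # Spearman-Brown: estimated reliability of the full-length test.
    r_full = 2 * r_half / (1 + r_half)
    return r_half, r_full


# Hypothetical responses of five pupils to an eight-item test.
responses = [
    [1, 1, 1, 1, 1, 1, 1, 0],
    [1, 1, 1, 0, 1, 1, 0, 0],
    [1, 0, 1, 1, 0, 0, 1, 0],
    [1, 1, 0, 0, 0, 0, 0, 0],
    [0, 0, 0, 1, 0, 0, 0, 0],
]

r_half, r_full = split_half(responses)
print(f"split half correlation: {r_half:.2f}, adjusted: {r_full:.2f}")
```

With so few pupils and items the figures are unstable; the sketch is meant only to show how the odd and even halves are formed and compared.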
A reliability figure can also come about when a comparison is made between alternative forms of a test taken by the same set of pupils. The alternative forms reliability, if high, indicates the two tests are of comparable difficulty and cover similar subject matter from the same unit taught.
No matter on what level, local or state, the test items are written, they should possess quality validity and reliability. Again, validity pertains to subject matter taught and measured in terms of learner achievement. Validity then emphasizes that what is tested was taught and that pupils had opportunities to learn it. What is taught then needs to relate to the identified objectives of instruction. Hopefully, very important objectives were emphasized in teaching pupils. Reliability stresses obtaining consistency of pupil results from testing.
Inconsistent test results mean that a pupil was, for example, on the 25th percentile the first time a test was taken and on the eightieth percentile the second time the same test was taken. Such results do not tell us anything dependable about a pupil's performance or achievement. If a pupil is on the fiftieth percentile both times the same test was taken, then that pupil scored at the median: out of every 100 pupils who took the same test, about fifty scored above and about fifty below. The median is the middle most score, whereas the mean is the average score for all pupils taking the test. The mode is the most frequent score for those who took the test.
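The mean, median, mode, and a pupil's percentile standing can be computed directly from a list of scores. The sketch below uses Python's standard `statistics` module and hypothetical scores for seven pupils. The `percentile_rank` helper is an assumption of this example: it follows one common convention (percentage of scores below, plus half of those tied), and other conventions give slightly different figures.

```python
from statistics import mean, median, multimode


def percentile_rank(score, all_scores):
    """Percentage of scores below `score`, counting tied scores as half."""
    below = sum(1 for s in all_scores if s < score)
    tied = sum(1 for s in all_scores if s == score)
    return 100 * (below + 0.5 * tied) / len(all_scores)


# Hypothetical test scores for seven pupils.
scores = [55, 60, 60, 70, 75, 80, 90]

average = mean(scores)          # sum of scores divided by number of pupils
middle = median(scores)         # middle most score once scores are ordered
most_common = multimode(scores) # list of the most frequent score(s)

print("mean:", average)
print("median:", middle)
print("mode(s):", most_common)
print("percentile rank of 70:", percentile_rank(70, scores))
```

In this example the pupil who scored 70 sits at the median, with three pupils below and three above, so the helper places that pupil at the fiftieth percentile.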
The early years of the twentieth century were conspicuous in the application of science to all phases of business and industry. It was applied not merely to the invention of new products and processes but to the details of organization and management designed to promote economy and efficiency. …