Tests and testing have a long and venerable history. Since the spelling test of Rice (1897), the fatigue test of Ebbinghaus (1897) and the intelligence scale of Binet (1905), the growth of tests has proceeded at an extraordinary pace in terms of volume, variety, scope and sophistication. The field of testing is so extensive that the comments that follow must be introductory: limitations of space permit no more than a brief outline of a small number of key issues, and readers seeking a deeper understanding will need to refer to specialist texts and sources on the subject. In tests, researchers have at their disposal a powerful method of data collection: an impressive array of instruments for gathering data of a numerical rather than verbal kind. In considering testing for gathering research data, several issues need to be borne in mind:
• Are we dealing with parametric or non-parametric tests?
• Are they achievement tests or aptitude tests?
• Are they norm-referenced or criterion-referenced?
• Are they available commercially for researchers to use or will researchers have to develop home-produced tests?
• Do the test scores derive from a pretest and post-test in the experimental method?
• Are they group or individual tests?
Let us unpack some of these issues.
Parametric and non-parametric tests
Parametric tests are designed to represent the wider population, e.g. of a country or age group. They make assumptions about the wider population and the characteristics of that wider population, i.e. the parameters of abilities are known. They assume (Morrison, 1993):
• that there is a normal curve of distribution of scores in the population (the bell-shaped symmetry of the Gaussian curve of distribution seen, for example, in standardized scores of IQ or the measurement of people’s height or the distribution of achievement on reading tests in the population as a whole);
• that there are continuous and equal intervals between the test scores (so that, for example, a score of 80 per cent could be said to be double that of 40 per cent; this differs from the ordinal scaling of rating scales discussed earlier in connection with questionnaire design where equal intervals between each score could not be assumed).
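The first of these assumptions can be examined informally before any parametric statistic is applied. The following is a minimal sketch in Python rather than a formal test of normality: it assumes that the scores are held in a simple list, and the helper names `share_within_one_sd` and `skewness`, together with the illustrative data (evenly spaced points on a normal curve with a mean of 100 and a standard deviation of 15), are hypothetical and not drawn from any published test.

```python
from statistics import NormalDist, mean, median, pstdev


def share_within_one_sd(scores):
    """Fraction of scores lying within one standard deviation of the mean."""
    m = mean(scores)
    sd = pstdev(scores)
    return sum(1 for s in scores if abs(s - m) <= sd) / len(scores)


def skewness(scores):
    """Population skewness: close to 0 when the distribution is symmetrical."""
    m = mean(scores)
    sd = pstdev(scores)
    return sum(((s - m) / sd) ** 3 for s in scores) / len(scores)


# Hypothetical data: 1,000 evenly spaced quantiles of a normal curve
# with mean 100 and standard deviation 15 (assumed values, for illustration).
n = 1000
norm = NormalDist(mu=100, sigma=15)
scores = [norm.inv_cdf((i + 0.5) / n) for i in range(n)]

print(round(mean(scores), 1), round(median(scores), 1))  # mean and median coincide
print(round(share_within_one_sd(scores), 2))             # roughly two-thirds of scores
print(round(skewness(scores), 2))                        # near zero for a symmetrical curve
```

For scores that follow the normal curve, the mean and median coincide, roughly 68 per cent of scores fall within one standard deviation of the mean, and the skewness is close to zero. Marked departures from this pattern in real data suggest that the parametric assumption deserves closer scrutiny.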
Parametric tests will usually be published as standardized tests which are commercially available and which have been piloted on a large and representative sample of the whole population. They usually arrive complete with the backup data on sampling, reliability and validity statistics which have been computed in the devising of the tests. Working with these tests enables the researcher to use statistics applicable to interval and ratio levels of data.
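By way of illustration, the published norms of a standardized test allow a raw score to be converted into a standard (z) score and a percentile rank, both of which rely on the interval-level and normal-distribution assumptions set out above. The sketch below is illustrative only: the norm mean of 100 and standard deviation of 15 are assumed values, and the function names `z_score` and `percentile_rank` are hypothetical.

```python
from statistics import NormalDist

# Hypothetical published norms for a standardized test (assumed values).
NORM_MEAN = 100.0
NORM_SD = 15.0


def z_score(raw, norm_mean=NORM_MEAN, norm_sd=NORM_SD):
    """Distance of a raw score from the norm mean, in standard deviation units."""
    return (raw - norm_mean) / norm_sd


def percentile_rank(raw, norm_mean=NORM_MEAN, norm_sd=NORM_SD):
    """Percentage of the norm population expected to score at or below `raw`,
    assuming that scores are normally distributed."""
    return 100 * NormalDist(norm_mean, norm_sd).cdf(raw)


print(z_score(115))                    # 1.0
print(round(percentile_rank(115), 1))  # 84.1
```

A raw score of 115 on such a test lies one standard deviation above the mean and is therefore higher than about 84 per cent of the norm population. Conversions of this kind would not be defensible for ordinal data, where equal intervals between scores cannot be assumed.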
On the other hand, non-parametric tests make few or no assumptions about the distribution of