The assessment of academic staff teaching performance is an area of considerable concern and debate. The questions revolve around what should be assessed and by whom. In this study, the ratings of academic staff and tertiary students in a new institution were compared on 21 criteria of lecturing. Analysis of variance demonstrated that the academics placed significantly greater importance than students on a range of performance criteria (e.g. non-sexist language, independent learning, challenging the world view), with the students placing greater importance on one criterion: pace of the presentation. Separate factor analyses of the ratings by staff and students demonstrated differences in the schematic models of these two groups. The two groups agree that the criteria are important, but portray different pictures of the ways in which the criteria combine to produce an understanding of what makes a good lecture. The study demonstrates that staff and students differ significantly in their interpretations of what is to be measured in the assessment of a good lecture. These findings raise questions regarding the use of student and staff ratings in performance appraisal.

Since the late 1980s, there has been a movement within Australian higher education towards a formal system of academic staff appraisal. The then Education Minister, John Dawkins (1988), set out in his policy statement on higher education a requirement for tertiary institutions to initiate systematic procedures for summative staff appraisal to facilitate the rewarding of excellence, assist decision making about tenure and promotion, and ensure accountability of academic staff.

The policy of the Department of Employment, Education and Training (DEET) also alluded to a formative dimension of appraisal. According to DEET's principles of staff appraisal (Lonsdale, Dennis, Openshaw, & Mullins, 1988), one purpose of these procedures is to provide the basis for staff development. Academics can thus be assessed, and action can be taken to remedy problems and to support improvements in teaching.

Teaching appraisal in higher education has used information from: (a) students, (b) colleagues, (c) expert/trained raters, and (d) self-reports (Marsh, 1986; Moses, 1986; Thompson, Deer, Fitzgerald, Kensell, Low, & Porter, 1990). By far the most widely used method has been student appraisal (Cruse, 1987). All the approaches have, however, been shown to suffer from measurement flaws, and a variety of studies have found serious inconsistencies between the judgements of different types of raters.

A number of methodological problems associated with self-appraisal centre on the accuracy of ratings (Meyer, 1980; Thornton, 1980). Self-ratings have been found to suffer from inflation in comparison with others' judgements, and from a tendency for the raters to exhibit socially desirable response patterns (Howard, Conway, & Maxwell, 1985; Moses, 1986) or a self-serving bias (Campbell & Lee, 1988). These errors are considered to have the potential to adversely affect the value of self-ratings. However, self-appraisal is considered to be less problematic when used by individuals to predict their future performance for developmental purposes (Campbell & Lee, 1988; Thompson et al., 1990) than when used to assess their past performance.

Colleague and expert appraisals have been proposed as means of overcoming some of the limitations of self-appraisal. However, colleague and expert appraisals also pose a number of specific problems. For practical reasons, raters are not likely to be as familiar with an appraisee's teaching as the students or the appraisee. Consequently, sampling bias is considered to be a potential problem (Cohen & McKeachie, 1981; Doyle, 1975; Scriven, 1987).

The validity of appraisals based on limited observations of teaching performance can be questioned. …

