The Effects of Hands-On Experience on Students' Preferences for Assessment Methods
Struyven, Katrien, Dochy, Filip, Janssens, Steven, Journal of Teacher Education
Although change, invention, creation, design, and progress (in short, innovation) are concepts that represent the key issues of educational policy, they are often a source of anxiety and concern for many teacher educators and teachers themselves (Hall, 1979; Van den Berg & Ros, 1999). They are reluctant to readdress, redesign, and reorganize the teaching practices with which they feel comfortable. Consequently, proposals for educational innovation are critically appraised, and a negative disposition toward the unknown is apparent (Hargreaves, 2000).
There is often professional unease about unknown subjects, situations, or innovations. It might be an atavistic fear of change, anxiety about chaos, or professional concern for conserving established values and standards (Cuban, 1998; Kelchtermans, 2005; Waugh & Punch, 1987). It is only when familiarity grows with these subjects or situations that fears are allayed and the new, changed situation is accepted (Van den Bergh, Vandenberghe, & Sleegers, 1999; Van de Ven & Rogers, 1988). Fullan (2001) describes the process as follows:
Real change, whether desired or not, represents a serious personal and collective experience characterized by ambivalence and uncertainty, and if the change works out it can result in a sense of mastery, accomplishment and professional growth. The anxieties of uncertainty and the joys of mastery are central to the subjective meaning of educational change, and to success or failure thereof. (p. 32)
Positive experiences with the change, embodied in the sense of mastery, accomplishment, and professional growth, define the success of the change and its continuation in practice.
Moreover, if students are thought of as "participants in a process of change and organizational life" rather than as "potential beneficiaries of change" (Fullan, 2001, p. 151), it comes naturally to involve students when studying change and seeking to understand educational innovation. In fact, during the process of change, students may experience similar feelings of uncertainty and ambivalence at the start and joys of mastery, accomplishment, and academic growth once the change has proven to work. In this respect, student teachers in teacher training programs are interesting subjects. On one hand, they are students in the process of change when experiencing new teaching methods or assessment modes; on the other hand, they will soon serve as teachers implementing change in practice.
In addition, empirical evidence has repeatedly shown that teachers tend to teach as they were taught (Johnson & Seagull, 1968), often failing to use the innovative techniques that have been advocated in training (Grippin, 1989). Rather than delivering information about engaging and innovative teaching practices through traditional approaches, teacher educators better serve this purpose by modeling the use of these methods (Loughran & Berry, 2005).
Combining the arguments above, we assume that the modeling and use of new assessment techniques in teacher training may initially generate fearful dispositions among student teachers toward the changes. These fears may be allayed as familiarity grows and the changes prove to work out, with feelings of mastery, accomplishment, and professional/academic growth as consequences that determine the (possible) adoption of the change in student teachers' current and future teaching practices. Hence, the effects of student teachers' hands-on (read: actual) experience with new modes of assessment in teacher training on their preferences for evaluation methods in general, and the experienced method in particular, are investigated. This study not only examines the dynamics of students' preferences with respect to four assessment methods that follow a standardized course on child development but also aligns instruction and assessment, as a traditional learning environment is compared to a student-activating type of instruction that is followed by one of four assessment types. Former research on students' assessment preferences (Birenbaum, 1997; Darling-Hammond & Snyder, 2000; Sambell, McDowell, & Brown, 1997; Zoller & Ben-Chaim, 1997) adopted case-study designs, qualitative methods of data gathering, or survey inquiries at one moment in time. In contrast, the present study is unique in its use of a quasi-experimental design with questionnaires administered on three different occasions, that is, prior to, during, and after student teachers' experience with a particular assessment method.
In recent years, the repertoire of assessment methods in use in higher education has expanded considerably. New modes of assessment have enriched the conventional evaluation setting, formerly characterized by both the multiple-choice examination and the traditional evaluation by essay (Sambell et al., 1997). More recently, portfolios, self- and peer assessment, simulations, and other innovative methods were introduced in higher educational contexts. The notion of alternative assessment is in this respect often used to denote forms of assessment that favor the integration of assessment, teaching, and learning; the involvement of students as active and informed participants; assessment tasks that are authentic, meaningful, and engaging; assessments that mirror realistic contexts; and assessments that focus on both the process and products of learning and move away from single test scores toward a descriptive assessment based on a range of abilities and outcomes (Sambell et al., 1997). The present study compares four assessment methods: one conventional multiple-choice test and three alternative assessment methods, namely, case-based evaluation, peer assessment, and portfolio assessment, and investigates (the dynamics in) student teachers' assessment preferences.
A thorough review of the educational literature on students' assessment preferences revealed three types of studies. First, there are studies that require students to express their preferences concerning two contrasting evaluation methods. These studies often relate student characteristics to preferences for particular assessment types. For example, Zeidner (1987) concludes that both female and male high school students prefer the multiple-choice format to the essay type of examination. Similarly, the students in Traub and MacRury's (1990) study reported more positive attitudes toward multiple-choice tests on the grounds that these examinations seem easier to prepare for, easier to take, and may produce higher relative scores. Nevertheless, these results do not apply to the entire group of students. Birenbaum and Feldman (1998) discovered that university students (in social sciences and arts) with good learning skills, who have high confidence in their academic ability, tend to prefer the essay type of assessment to multiple-choice examinations. Conversely, students with poor learning skills, who tend to have low confidence in their academic ability, prefer multiple-choice testing to the constructed-response type of assessment. Results also show that low test anxiety was related to positive attitudes toward the essay format. Students with high test anxiety have more unfavorable attitudes toward open-ended format assessments and a preference for the choice-response type. In contrast to Zeidner (1987), this study also indicated gender differences, with men having more favorable attitudes toward the choice-response format than women (Birenbaum & Feldman, 1998; Struyven, Dochy, & Janssens, 2005).
The second category of studies also investigates students' assessment preferences for several evaluation methods. When students were asked to indicate their appreciation of a whole range of assessment methods, Birenbaum's (1997) results on university students' assessment preferences demonstrated that the highest preference was for teacher-guided test preparation. Next in order came nonconventional assessment types and student participation, followed by higher order thinking tasks and integrated assessment. The least favored category was oral exams. Similarly, Zoller and Ben-Chaim (1997) found that oral examinations of all types were rated as least preferred by college students in both the United States and Israel. The higher stress level associated with oral exams, and the consequent poor performance, are the main reasons for students' dislike of this type of evaluation. College science students' preferred assessment method is the nonconventional, written exam in which time is unlimited and any materials are allowed. Moreover, American students like the traditional written exam significantly more than do their Israeli counterparts (Zoller & Ben-Chaim, 1997). Characteristic of both categories of studies is that students' assessment preferences are not related to a particular teaching context in students' higher education programs. Moreover, preferences are measured at one moment in time and are treated as student characteristics that are fairly stable over time.
However, there is evidence to suggest that students' preferences change as a result of hands-on experience with a particular teaching environment, which is the focus of the present study. Many researchers have asked students about their assessment preferences after they experienced alternative methods of assessment. Using a case-study methodology, Sambell et al. (1997) sought to uncover university students' interpretations, perceptions, and behaviors when experiencing different forms of alternative assessment. Broadly speaking, they found that students often reacted negatively when they discussed what they regarded as "normal" or traditional assessment. Exams had little to do with the more challenging task of trying to make sense of and understand their subject. In contrast, when students considered new forms of assessment, their views of the educational worth of assessment changed, often quite dramatically. Alternative assessment was perceived to enable, rather than pollute, the quality of learning achieved (Sambell et al., 1997). However, students repeatedly expressed that a heavy workload tends to alter their efforts in studying (Sambell et al., 1997). Similarly, Slater (1996) found that students in their first year of university physics education had come to like portfolio assessment. Students thought that they would remember what they were learning much better and longer, compared with material learned for other assessment formats, because they had internalized the material while working with it, thought about the principles, and applied concepts creatively and extensively for the duration of the course. Students enjoyed the time they spent creating portfolios and believed it helped them learn. Segers and Dochy (2001) found similar results in students' perceptions of self- and peer assessment in a problem-based learning environment at university (with law and educational science students).
Students reported that these assessment procedures (or should they be called pedagogical tools?) stimulate deep-level learning and critical thinking. In addition, Macdonald (2002) found evidence that students' preferences were affected by course experiences at the Open University. In contrast to the feedback received from the undergraduate students, all of the postgraduate students exposed to a project as an end-of-course assessment expressed satisfaction with the project as a final examinable component (Macdonald, 2002). Although it is clear from these results that students' assessment preferences are affected by their experience with a particular assessment method, the lack of pretest measures makes it hard to pinpoint the effect of the experience with evaluation on students' changes in assessment preferences.
The purpose of this study is to provide evidence about the effect of student teachers' hands-on experience with a particular mode of assessment (i.e., a multiple-choice test, case-based exam, peer assessment, or portfolio) on their assessment preferences, using a three-wave longitudinal design. The research project attempts to test the following hypotheses, reported in this article: (a) unknown assessment methods are regarded negatively; (b) as familiarity with an assessment method grows, assessment preferences will change positively; and (c) students' preferences are congruent with students' perceptions of the appropriateness of the assessment methods.
The investigation had a quasi-experimental design because of the authentic educational context in which the study was conducted. Learning materials on the content of child development (a course book, a booklet of assignments, and four evaluation formats) were developed for a set of 10 lessons. These were delivered in five research conditions to students in the 1st year of the elementary teacher training programs of eight participating institutions in the Flemish part of Belgium (N = 664). Students were primarily women (83%) aged 18 to 20 years. Lecturers responsible for the instruction of the course on child development in diverse teacher education institutions in Flanders were invited to participate in the experiment. Of these, 20 highly motivated, qualified (≥ 5 years of teaching experience) lecturers participated in the study. Conversations with (the teams of) lecturers in each institution, prior to the experiment, revealed that their interest in active teaching, preliminary (positive) hands-on experiences in class, and the prospect of gaining insight into the effects of student-activating teaching methods and assessment on student learning and perceptions were the foremost intrinsic motives mentioned. The clear-cut student-activating teaching/learning materials to be received were the primary extrinsic motive. During this conversation, each team of teachers within an institution expressed its preferences among the treatments in the study, and first choices of assessment-by-instruction could be assigned to the teacher training lecturers. Student teachers following the course on child development were randomly distributed among the classes and lecturers. Apart from the lecture-taught students, who followed instruction in large groups (approximately 70 students in each group), class sizes in the activating conditions were limited to a maximum of 40.
However, due to absenteeism and the phenomenon of dropout in the 1st year of higher education, class sizes were usually smaller (see Table 1).
In all, there are five experimental groups in this investigation: one lecture-based setting and four student-activating learning environments, each characterized by one of four assessment modes. Student-activating teaching methods, as defined in this study, challenge students to construct knowledge by means of authentic assignments that (a) literally require their active participation in the learning/teaching process, (b) incorporate the available information by means of discovery learning, and (c) involve the assistance of a scaffolding teacher who is continuously available to help and coach the student teachers. These three characteristics are essential features of the teaching methods used to activate students in this study. The experimental conditions are as follows:
Le: lecture-based learning environment + multiple-choice examination (N = 131)
Mc: student-activating learning environment + multiple-choice examination (N = 119)
Ca: student-activating learning environment + case-based assessment (N = 126)
Pe: student-activating learning environment + peer/co-assessment (N = 174)
Po: student-activating learning environment + portfolio assessment (N = 114)
The first group of preservice teachers (Le) was instructed in the content of the child development course within a traditional lecture-based learning environment characterized by formal lectures and was assessed by means of a multiple-choice examination. The other four groups (Mc, Ca, Pe, Po) learned in the same student-activating learning environment, characterized by the same learning content and teaching methods. However, the assessment mode that accompanied this learning setting distinguished the conditions from one another. Each group was assigned a different assessment mode, namely, (a) a multiple-choice test (Mc), (b) a case-based assessment (Ca), (c) a peer/co-assessment (Pe), and (d) a portfolio assessment (Po). Because teacher education has a fairly uniform structure in Flanders and a similar intake of 1st-year students (e.g., gender, socioeconomic status, ethnicity), each institution for teacher education was assigned one of these learning environments. Consequently, all participating student teachers within a particular school were treated the same way by means of standardized learning materials. This procedure precludes bias in the results due to students in different classes comparing one treatment to another. The lack of differences in pretest measures confirms that the groups may be considered comparable (see also Struyven, Dochy, Janssens, & Gielen, 2006). The course on child development is compulsory within the 1st year of the program for preservice elementary school teachers. Table 1 provides a numeric representation of the matching of research conditions, schools, lecturers, classes, and student teachers involved in the present study.
Learning Materials for the Course on Child Development
The learning materials consist of three parts: a course book, a booklet of student-activating assignments on the contents studied in the course book, and the four assessment methods that were used to evaluate student performance in the course.
The course book. The content of the course book (Struyven, Sierens, Dochy, & Janssens, 2003) concerns the development of the child, from conception through the school-age period to adolescence. Each developmental stage in the book was treated in the same way, through a thematic sequence of the domains that characterize the growth of each child (e.g., linguistic, physical, motor, and cognitive development).
The student-activating assignments. Apart from the lecture-based learning environment, all students in the experiment were instructed in the content by means of student-activating assignments (e.g., problem-based learning tasks, case studies, teamwork), in which students were challenged to become active learners who construct their own knowledge. The assignments required students to select, interpret, acquire, and manage the available information, aimed at its application in real-life cases or solutions to authentic, relevant problems. The teachers' role within the student-activating learning environment was restricted to supervising and coaching the students' learning processes while they tackled these tasks. The assignments were collaborative in nature (groups of 6 to 8 students) and required shared responsibilities from students. Ten lessons, each of approximately 1.5 hours, grouped the assignments and were the same in the four student-activating groups. Detailed instructions accompanied the assignments to direct both students and teachers. In addition to the preconstructed, standardized learning/teaching materials for students and teachers, randomly selected observations in the classes of participating teachers ensured the intended, standardized implementation of the treatments.
The assessment method. Bearing in mind the importance of alignment between instruction and assessment (Biggs, 1996; Segers, Dochy, & Cascallar, 2004), the lecture-based setting was followed only by the multiple-choice test, whereas the case-based evaluation and the peer and portfolio assessments required working with the student-activating assignments. With the exception of one group (Le), students were largely unfamiliar with the four assessment methods, and the majority of students had no prior hands-on experience of them. All methods address the content (and assignments) of the course on child development and aim to measure similar domains of knowledge, from acquisition and insight to application and problem-solving skills, with the emphasis on the latter. Depending on the purely summative function (Le, Mc, and Ca) or the combined formative function of assessment (Po, Pe¹), the purposes of the assessment methods are, respectively, to measure individual differences and to assist learning in combination with the assessment of individual achievement (Pellegrino, 2002).
Each assessment method is discussed in detail and an overview of the characteristics of the tools and procedures is presented in Table 2. The upper half of the table illustrates the characteristics of the end-of-course examination or assessment conversation, whereas the lower part emphasizes the assessment procedure as it is embedded in the course.
The multiple-choice test followed the set of classes and included 20 questions, each with four answer choices. Only one choice offers the correct solution to a question. To discourage random guessing, a correction for guessing was applied; that is, wrong answers were penalized by a deduction of 1/