EVALUATING EDUCATIONAL INTERVENTIONS: An Educators' Toolkit for Designing Effective Research Studies

Character education programs address moral and ethical values such as respect, responsibility, and trustworthiness (Character Education Partnership, 2010). Programs may be targeted to impact individuals, small groups, whole classrooms, or whole schools (What Works Clearinghouse, 2007) and may be implemented as a practice, supplement, or curriculum. Depending on the nature of the program, school administrators tasked with reporting outcomes must assess how effective a character education program is at reducing negative behaviors, enhancing social-emotional skills, and/or improving academic achievement. The magnitude of the behavioral or academic outcome is typically reported in the literature as an "effect size" (Schochet, 2005) so that results across studies of various programs can be compared.

Effect size values (the magnitude of the outcome relative to the variability within the system) in the social sciences typically fall between Cohen's (1988) benchmarks of 0.2 for a small effect and 0.8 for a large effect. These values provide a way to compare the effectiveness of character education programs when 20+ individuals, small groups, classrooms, or schools are involved in each study. However, studies that analyze classroom- or school-level outcomes can be very costly when they involve 20+ classrooms or schools, so school districts may not have the resources to engage in rigorous research studies. How can districts develop effective studies when funding is limited? One way to overcome financial barriers to research is to reduce the number of classrooms or schools required in a study. How can this be done? Districts can limit the required number of classrooms or schools by controlling the amount of variability that enters the study, so that adequate statistical power is maintained.
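
The definition above (magnitude of the outcome divided by variability) can be illustrated with a short calculation. The sketch below is only an illustration, not the toolkit's own procedure: it assumes the effect size is computed as Cohen's d with a pooled standard deviation, and the function name and all scores are hypothetical. It shows that the same two-point gain yields a much larger effect size when the scores vary less.

```python
from math import sqrt
from statistics import mean, stdev


def cohens_d(treatment, control):
    """Standardized mean difference using the pooled standard deviation."""
    n_t, n_c = len(treatment), len(control)
    var_t, var_c = stdev(treatment) ** 2, stdev(control) ** 2
    pooled_sd = sqrt(((n_t - 1) * var_t + (n_c - 1) * var_c) / (n_t + n_c - 2))
    return (mean(treatment) - mean(control)) / pooled_sd


# Hypothetical outcome scores: both comparisons have the same 2-point gain.
high_var_control = [10, 12, 14, 16, 18]
high_var_treatment = [12, 14, 16, 18, 20]

low_var_control = [13, 13.5, 14, 14.5, 15]
low_var_treatment = [15, 15.5, 16, 16.5, 17]

d_high_var = cohens_d(high_var_treatment, high_var_control)
d_low_var = cohens_d(low_var_treatment, low_var_control)

print(f"high variability: d = {d_high_var:.2f}")
print(f"low variability:  d = {d_low_var:.2f}")
```

Because the gain is identical in both comparisons, the difference between the two values of d comes entirely from the variability in the data.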

Statistical power in educational research is the probability that a study will detect a program effect of a given size if that effect truly exists. It depends on the effect size as well as on the number of individuals or groups that participate in the study. The Optimal Design software (Spybrook, Raudenbush, Liu, & Congdon, 2009) that is available on the W.T. Grant Foundation website (2011) was developed to help researchers understand the relationship between the effect size, statistical power, and the number of classrooms or schools used in the study. The software is capable of assisting with study designs typical of the social sciences (i.e., high number of participants, high variability, and low effect size values) as well as those that are more common in the physical sciences (low number of participants, low variability, and high effect size values). Either design is valid, but the design that is more common in the physical sciences lets districts limit costs by reducing the number of participants required to measure meaningful behavioral and academic changes.
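
The relationship among effect size, power, and the number of schools can be sketched numerically. The code below is a rough approximation under stated assumptions and does not reproduce the Optimal Design software: it assumes a balanced two-arm design in which whole schools are assigned to the program or to a comparison condition, it summarizes between-school variability with an intraclass correlation (icc), and it uses a normal approximation that tends to overstate power when the number of schools is small. The function names, parameter names, and input values are hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def approx_power(effect_size, n_schools, students_per_school, icc, alpha=0.05):
    """Approximate two-sided power for a balanced school-randomized study.

    Assumption: outcomes are standardized, so the standard error of the
    estimated effect is sqrt(4 * (icc + (1 - icc) / n) / J), where J is the
    total number of schools and n is the number of students per school.
    """
    se = sqrt(4 * (icc + (1 - icc) / students_per_school) / n_schools)
    ncp = effect_size / se  # expected size of the test statistic
    std_normal = NormalDist()
    z_crit = std_normal.inv_cdf(1 - alpha / 2)
    return std_normal.cdf(ncp - z_crit) + std_normal.cdf(-ncp - z_crit)


def min_schools(effect_size, students_per_school, icc,
                target_power=0.80, max_schools=1000):
    """Smallest even number of schools reaching the target power, else None."""
    for n_schools in range(4, max_schools + 1, 2):
        power = approx_power(effect_size, n_schools, students_per_school, icc)
        if power >= target_power:
            return n_schools
    return None


# Hypothetical scenarios with 25 students per school.
# "Social science" style: small effect, high between-school variability.
schools_small_effect = min_schools(effect_size=0.2, students_per_school=25, icc=0.20)
# "Physical science" style: large effect, low between-school variability.
schools_large_effect = min_schools(effect_size=0.8, students_per_school=25, icc=0.05)

print("schools needed (small effect, high variability):", schools_small_effect)
print("schools needed (large effect, low variability):", schools_large_effect)
```

Under these assumptions, the second scenario reaches the target power with far fewer schools than the first, which is the cost argument made above. A district planning a real study would want to confirm any such estimate with dedicated power-analysis software.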

The following discussion demonstrates that administrators and researchers can know in advance whether or not a study designed to evaluate a character education program will likely allow statistically meaningful outcomes to be measured if they occur. Meaningful outcomes can be obtained if the effect of the program is sufficiently large given the number of schools and the variability in the data used in the study. The information provided below can help school researchers answer the following questions: What is a general (and quick) method to approximate the expected effect size using easily accessible school data? How many schools do researchers need to use in a research study so that they can have confidence in the results? If a district has a limited budget, how does it determine which schools are the best candidates for an effective study?


The effect size conveys two distinct pieces of information: the magnitude of a program's effectiveness and the variability present in the system. …

