This study extended validity evidence for multidimensional measures of coaching competency derived from the Coaching Competency Scale (CCS; Myers, Feltz, Maier, Wolfe, & Reckase, 2006) by examining use of the original rating scale structure and testing how measures related to satisfaction with the head coach within teams and between teams. Motivation, game strategy, technique, and character building comprised the dimensions of coaching competency. Data were collected from athletes (N = 585) nested within intercollegiate men's (g = 8) and women's (g = 13) soccer and women's ice hockey (g = 11) teams (G = 32). Validity concerns were observed for the original rating scale structure and the predicted positive relationship between motivation competency and satisfaction with the coach between teams. Validity evidence was offered for a condensed post hoc rating scale and the predicted relationship between motivation competency and satisfaction with the coach within teams.
Key words: multidimensional Rasch model, multilevel modeling, rating scale effectiveness, satisfaction
Horn's (2002) model of coaching effectiveness is founded on at least three assumptions (see Figure 1). First, both contextual factors and athletes' personal characteristics indirectly influence a coach's behavior through the coach's expectancies, beliefs, and goals. Second, a coach's behavior directly affects athletes' perceptions and evaluations of that behavior. Third, athletes' perceptions and evaluations of a coach's behavior mediate the influence that behavior has on athletes' self-perceptions and attitudes, which, in turn, directly affect athletes' motivation and performance. Because athletes' perceptions and evaluations of a coach's behavior are believed to play a critical role in coaching effectiveness, accurately assessing athletes' evaluations of key coaching competencies is important to continued improvement in coaching and to further development of coaching effectiveness models.
[FIGURE 1 OMITTED]
There are many instruments designed to measure a coach's behavior. (1) The Coaching Behavior Assessment System (CBAS; Smith, Smoll, & Hunt, 1977), the Leadership Scale for Sports (LSS; Chelladurai & Saleh, 1978, 1980), and a Decision Style Questionnaire (DQS; Chelladurai & Arnott, 1985) are three of the most prominent. As reviewed by Horn (2002), these instruments also have been used to assess athletes' perceptions of their coach's behavior (e.g., how often does your coach use positive reinforcement with athletes) and/or decision styles (e.g., what decision style would your coach employ to select a team captain). However, none of these instruments measures athletes' evaluations of their coach's behavior (e.g., how competent is your coach in teaching the skills of soccer). While each instrument has contributed to understanding coaching behavior over the last few decades, Smoll and Smith (1989) noted that, "... the ultimate effects that coaching behavior exerts are mediated by the meaning that players attribute to them" (p. 1527). Because these instruments do not capture athletes' evaluations, they do not measure these mediated effects.
Unlike the CBAS, the LSS, and the DQS, the Coaching Evaluation Questionnaire (CEQ; Rushall & Wiznuk, 1985) and the Coaching Behavior Questionnaire (CBQ; Kenow & Williams, 1992) were designed to assess athletes' evaluative reactions to specific aspects of their coach's behavior. The CEQ allows athletes to evaluate a coach on his or her personal qualities, personal and professional relationships, organizational skills, and performance as a teacher and a coach. Although the items are suggested to measure separate constructs, scores are to be totaled across items or formed at the item level (Rushall & Wiznuk). Psychometric evidence for the CEQ appears to be limited to item-level test-retest reliability. The CEQ rarely appears in the literature.
The CBQ allows athletes to evaluate their coach's typical behavior, specifically his or her negative activation and supportiveness/emotional composure during competition against a top opponent. Evidence has been provided for the proposed two-factor structure, the internal reliability, and the external aspect of instrument validity (Williams et al., 2003). The CBQ provides a valuable tool to measure athletes' evaluative reactions to aspects of their coach's behavior, but it measures a fairly specific subset of coaching behaviors in a rather targeted scenario (i.e., competition against a top opponent).
Three competency domains stipulated in the National Standards for Athletic Coaches (National Association for Sport and Physical Education, 1995) not fully covered by the CBQ are (a) growth, development and learning of athletes, (b) psychological aspects of coaching, and (c) skills, tactics, and strategies. Within athletes' growth, development, and learning, an expected competency is that a coach provides instruction to develop specific motor skills. Within the psychological aspects of coaching domain, expected competencies include that a coach demonstrate effective motivational skills and conduct practices and competitions to enhance social/emotional growth and promote good sportsmanship in athletes. Within the skills, tactics, and strategies domain, an expected competency is that a coach applies appropriate competitive strategies. The Coaching Competency Scale (CCS; Myers, Feltz, Maier, Wolfe, & Reckase, 2006) was designed to measure athletes' evaluations of their head coach in these areas.
The CCS was developed to measure athletes' evaluations of their head coach's ability to affect athletes' learning and performance. The CCS is a variation of the Coaching Efficacy Scale (CES; Feltz, Chase, Moritz, & Sullivan, 1999), which was developed to measure coaches' efficacy beliefs. The specific factors measured by the CCS include motivation competence (MC), game strategy competence (GSC), technique competence (TC), and character building competence (CBC; see Figure 2). MC was specified to influence responses to seven items and defined as athletes' evaluations of their head coach's ability to affect athletes' psychological mood and skills. GSC was specified to influence responses to eight items and defined as athletes' evaluations of their head coach's ability to lead during competition. TC was specified to influence responses to six items and defined as athletes' evaluations of their head coach's instructional and diagnostic abilities. CBC was specified to influence responses to four items and defined as athletes' evaluations of their head coach's ability to influence athletes' personal development and positive attitude toward sport. Myers et al. (2006) provided evidence for the internal model illustrated in Figure 2 via confirmatory factor analysis on the pooled within-cluster covariance matrix. (2) However, validation of measures from an instrument for a particular interpretation is a multifaceted process that cannot be fully addressed in a single study (American Educational Research Association, American Psychological Association, & National Council on Measurement in Education [AERA, APA, & NCME], 1999).
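For readers unfamiliar with the pooled within-cluster covariance matrix mentioned above, the following is a minimal sketch. It assumes the conventional definition, in which each athlete's responses are expressed as deviations from his or her own team's mean and the summed cross-products are divided by N - G. The function name, data layout, and example values are hypothetical, and this is not presented as the computation performed by Myers et al. (2006).

```python
def pooled_within_covariance(clusters):
    """Pooled within-cluster covariance matrix.

    clusters: list of teams; each team is a list of response rows
    (one row per athlete, one column per item). Assumed layout.
    """
    n_total = sum(len(rows) for rows in clusters)   # N athletes
    n_clusters = len(clusters)                      # G teams
    p = len(clusters[0][0])                         # number of items
    sums = [[0.0] * p for _ in range(p)]
    for rows in clusters:
        # Team mean for each item.
        means = [sum(row[j] for row in rows) / len(rows) for j in range(p)]
        for row in rows:
            # Deviation of this athlete from his or her team mean.
            dev = [row[j] - means[j] for j in range(p)]
            for a in range(p):
                for b in range(p):
                    sums[a][b] += dev[a] * dev[b]
    denom = n_total - n_clusters                    # N - G
    return [[s / denom for s in row] for row in sums]


# Made-up data: two teams, two items (illustration only).
teams = [
    [[1.0, 2.0], [3.0, 4.0]],
    [[5.0, 5.0], [7.0, 9.0], [9.0, 7.0]],
]
S_pw = pooled_within_covariance(teams)
print(S_pw)
```

Because team means are removed before cross-products are formed, differences between teams do not contribute to this matrix; only variation among athletes within the same team does.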
[FIGURE 2 OMITTED]
Aspects of the Initial Validity Framework for the CCS Yet to Be Examined
"Validity is an integrated evaluative judgment of the degree to which empirical evidence and theoretical rationales support the adequacy and appropriateness of inferences and actions based on test scores or other modes of assessment" (Messick, 1989, p. 13). Important initial steps that have yet to be addressed in validating the CCS include examining how effectively the construct is operationalized by the instrument's rating scale structure and testing how measures of coaching competency relate to other theoretically relevant variables (i.e., an external model).
Rating Scale. A substantive aspect of how effectively coaching competence is operationalized within the CCS is the degree to which athletes used the rating scale structure in the way the authors intended (i.e., systematically). As in the CES, the CCS used a 10-point Likert-type scale with categories ranging from 0 (not at all competent) to 9 (extremely competent). Previous research with the CES (Myers, Wolfe, & Feltz, 2005) and long-standing recommendations for measuring attitudes (Likert, 1932) suggested that this rating scale structure likely contained too many categories and that collapsing data to create fewer category distinctions should be considered to achieve an optimal rating scale structure for the CCS.
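To make the idea of collapsing categories concrete, the sketch below recodes responses on the original 0 to 9 scale into fewer categories and tallies category use before and after. The particular grouping (`COLLAPSE_MAP`), the function names, and the responses are hypothetical and chosen only for illustration; they are not the post hoc structure examined in this study.

```python
from collections import Counter

# Hypothetical mapping from the original 0-9 categories to a condensed
# 0-3 scale. This grouping is an illustrative assumption only.
COLLAPSE_MAP = {0: 0, 1: 0, 2: 0, 3: 1, 4: 1, 5: 1, 6: 2, 7: 2, 8: 3, 9: 3}


def collapse_ratings(ratings, mapping=COLLAPSE_MAP):
    """Recode original ratings into the condensed categories."""
    return [mapping[r] for r in ratings]


def category_counts(ratings, n_categories):
    """Count how often each category is used, including unused ones."""
    counts = Counter(ratings)
    return [counts.get(k, 0) for k in range(n_categories)]


# Made-up responses to a single item (illustration only).
original = [9, 7, 8, 6, 5, 9, 3, 7, 8, 2]
condensed = collapse_ratings(original)

print(category_counts(original, 10))   # use of the ten original categories
print(category_counts(condensed, 4))   # use of the four condensed categories
```

In this toy example, several of the original categories are never or rarely used, whereas every condensed category receives at least one response. Sparse or unused categories of this kind are one reason a condensed structure may be considered.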
Determining an optimal rating scale structure is an important first step in providing validity evidence for measures derived from an instrument, because implementing such a structure increases the accuracy and precision of the resulting measures (Linacre, 2002). Data are typically fitted to an item response theory (IRT) measurement model, in this case a Rasch (1992) model, because select diagnostic statistics have proven useful in determining an optimal post hoc rating scale structure (Zhu, Updyke, & Lewandowski, 1997). While criteria for determining this structure will be detailed in the Method section, a brief conceptual summary is also worthwhile, because applying this technique in sport and exercise psychology is relatively novel.
Rasch models are a family of one-parameter IRT measurement models. IRT is an alternative to true score test theory and is well suited to analyze rating scale data (Wright & Masters, 1982). In this study, the chosen Rasch model depicted person-by-item interactions by defining the probability that athlete n's response to item i would be observed in category k, conditioned on estimates of both the athlete's competency measure (θn) and the …