A Critical Look at Some Analyses of Major League Baseball Salaries


Before the 1988 Annual Statistical Meetings, the Statistical Graphics Section of the American Statistical Association made available salary data for 439 major league baseball players, along with various career and 1986 performance statistics and team attendance figures, challenging members of the ASA to analyze the data and present their analyses at a poster session. This exposition was announced at the 1987 Annual Meeting and in the September-October 1987 issue of Amstat News. No detailed instructions were distributed, other than a challenge to answer the question: "Are players paid according to their performance?"

One hundred twenty-seven groups asked for the data, and 15 of these presented analyses. Subsequently, Colin Mallows suggested to the authors (who had not participated in the exposition) that we synthesize lessons from the experience. The 15 presenters were contacted and asked to complete a summary questionnaire and to provide their papers (as prepared for the 1988 Proceedings of the Section on Statistical Graphics).

In this article we review the methods used in those 15 analyses to find a statistical model that responds to the question of whether players are paid for performance. We aim to learn which methods and approaches seem most successful in revealing the structure of these data. Those parallel analyses offer a unique opportunity to compare and contrast a variety of approaches to data analysis. We hope this comparison can provide guidance to others who have data to analyze.

The exposition was not a competition. The groups took many different approaches, and some of these were experimental, reflecting the goal of the exposition to encourage participants to try a range of new methods. Indeed, some of the least "successful" analyses have taught us the most about choosing methods of analysis.

We stress that the use of baseball as a source of data is a choice of convenience. Others have noted that professional baseball offers an unusually rich source of data that are quite complete over a long time period. The ASA challenge took advantage of this wealth of data to present one set of data within a larger framework to many teams of data analysts. The present review examines the methods used by the statisticians; our goal is not to reach a deeper understanding of baseball salaries. We do not propose that any of these analyses would be particularly appropriate for understanding or arbitrating baseball salaries. Indeed, none of the participants professed any sophisticated knowledge of baseball or of previously published analyses of baseball players' salaries. In this, the participants more closely resembled statistical consultants, who often are not expert in the discipline from which the data arise, but who nevertheless are called on to advise and assist in an analysis.

The question of whether salary reflects performance provides a focus for our review. We prefer models that account well for the relationship between performance and salary, and we seek models that are both parsimonious and interpretable. In this discussion we consider the models that were most parsimonious, most interpretable, and best fitting to be the most successful, because these are often good criteria for statistical analyses. Other analyses of baseball players' salaries may have other goals and might therefore lead to other models. We seek to identify the methods that led to the most parsimonious and best-fitting models and to understand why these methods worked better than others.

All the analyses were performed with commercially available software. This could not have happened even five years before, and it highlights an important aspect of this review: all the methods used in these analyses are readily available.


Salary data for 439 major league players (263 hitters and 176 pitchers) came from the April 20, 1987 issue of Sports Illustrated; various career and 1986 performance statistics came from the 1987 Baseball Encyclopedia Update; 1986 team attendance figures were obtained from the Elias Sports Bureau. …