# Comparing the Performance of Baseball Players: A Multiple-Output Approach

1. INTRODUCTION

Academics and sports fans alike are often interested in a statistical comparison of baseball players. This is hard to do, because baseball is fundamentally a multiple-output sport. For instance, some players are power hitters and excel in hitting home runs, whereas others hit more singles. To directly compare a power hitter like Mark McGwire with someone who hits for average like Tony Gwynn is difficult. In the sporting press, there are many different ways of aggregating performance in different offensive categories into a single number. Examples of such output aggregators include batting average, on-base percentage (OBP), and slugging percentage. Such standard ways of combining different offensive categories into one number entail two related problems. First, they are all imperfect measures of offensive performance. For instance, batting averages do not correct for different player circumstances (e.g., Coors Field is notorious for being a hitter's park, and, accordingly, it is easier for Colorado Rockies players to hit well). Second, the weights in all of the output aggregators are somewhat ad hoc. For instance, slugging percentage weights home runs precisely four times as heavily as singles, batting average weights singles and home runs equally, and OBP weights walks and hits equally. All such weighting choices can be criticized. The purpose of this article is to use statistical methods to address these issues. All of the models used herein address the first issue by correcting for team, year, and league. Addressing the second issue is more complicated; several ways of estimating or calculating output aggregators for offensive performance of baseball players are considered. Individual players are then compared to the benchmark established by a given output aggregator.
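To make the weighting issue concrete, the following minimal Python sketch computes the three conventional aggregators for two hypothetical batting lines. The `BattingLine` class, the function names, and the numbers are illustrative assumptions, not data or code from this article, and the OBP formula is simplified to hits and walks only (hit-by-pitch and sacrifice flies are ignored).

```python
from dataclasses import dataclass


@dataclass
class BattingLine:
    """Hypothetical season totals for one batter (illustrative only)."""
    at_bats: int
    singles: int
    doubles: int
    triples: int
    home_runs: int
    walks: int

    @property
    def hits(self) -> int:
        return self.singles + self.doubles + self.triples + self.home_runs


def batting_average(b: BattingLine) -> float:
    # Every hit counts the same: a single and a home run both get weight 1.
    return b.hits / b.at_bats


def slugging_percentage(b: BattingLine) -> float:
    # Hits are weighted by bases: 1, 2, 3, 4, so a home run counts four singles.
    total_bases = b.singles + 2 * b.doubles + 3 * b.triples + 4 * b.home_runs
    return total_bases / b.at_bats


def on_base_percentage(b: BattingLine) -> float:
    # Simplified assumption: walks and hits get equal weight;
    # hit-by-pitch and sacrifice flies are left out.
    return (b.hits + b.walks) / (b.at_bats + b.walks)


# Made-up stat lines: a power hitter and a hitter for average.
power = BattingLine(at_bats=500, singles=60, doubles=20, triples=0,
                    home_runs=50, walks=100)
contact = BattingLine(at_bats=500, singles=140, doubles=25, triples=5,
                      home_runs=5, walks=40)

for name, line in [("power", power), ("contact", contact)]:
    print(name,
          round(batting_average(line), 3),
          round(slugging_percentage(line), 3),
          round(on_base_percentage(line), 3))
```

With these made-up inputs, the hitter for average ranks higher on batting average and OBP, whereas the power hitter ranks higher on slugging percentage. The ranking therefore depends entirely on which fixed set of weights is chosen.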

Many of the statistical methods used in this article are adapted from the economics literature, and hence some economic terminology is used. That is, baseball players (like firms in an economic context) are viewed as producing outputs given firm characteristics (e.g., a batter produces the "output" hits, which depends on his situation, including what team he plays for, etc.). In an economics context, firms typically operate with different output mixes, just as in baseball batters have different mixes of hits (e.g., singles, doubles, home runs). By looking at the best firms in different regions of output space, the economist can trace out a production possibilities curve, which measures the maximum feasible combinations of outputs that can be produced. In baseball, the best batters can be considered in different regions of output space and trace out a comparable curve. For instance, Mark McGwire might heavily influence the production possibilities curve in the power dimensions, whereas Tony Gwynn might have influence in the batting average dimensions. Once this production possibilities curve is estimated, individuals can be compared to it. Because the curve reflects best practice, an average player will lie inside the curve. As discussed later, a number between 0 and 1 can be calculated that reflects how far the player is from the nearest point on the curve. Following the economics literature, this number is called efficiency. So, for instance, a certain player may have an efficiency of .8. This number has a simple, intuitive interpretation: The player under consideration is only 80% as productive as the best players with a comparable output mix. In other words, the proposed methodology enables comparison of any player with similar players and provides an easily interpretable, single-number summary of a player's performance.
Furthermore, the methodology incorporates all of the outputs that a batter produces and corrects for the player's situation (e.g., playing in a particularly good or bad hitter's park).
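The idea of measuring a player against the best players with a comparable output mix can be illustrated with a deliberately simple, deterministic stand-in. The sketch below uses a free disposal hull (FDH) style radial measure from the nonparametric efficiency literature; it is not the estimation method developed in this article, and it does not correct for team, year, or league. The function name, the player labels, and the output rates are hypothetical.

```python
from typing import Dict, Sequence


def radial_efficiency(outputs: Dict[str, Sequence[float]], player: str) -> float:
    """FDH-style output efficiency in (0, 1].

    For each peer, find the largest factor by which the player's whole
    output vector could be scaled up while still being matched by that
    peer in every output. Efficiency is 1 over the best such factor.
    """
    target = outputs[player]
    best_expansion = 1.0  # a player always matches his own outputs
    for peer_outputs in outputs.values():
        # Outputs the target does not produce impose no constraint.
        ratios = [p / t for p, t in zip(peer_outputs, target) if t > 0]
        if ratios:
            best_expansion = max(best_expansion, min(ratios))
    return 1.0 / best_expansion


# Hypothetical per-at-bat rates: (singles, home runs).
players = {
    "power_star": (0.10, 0.10),
    "average_star": (0.28, 0.01),
    "power_regular": (0.08, 0.08),
    "average_regular": (0.14, 0.005),
}

for name in players:
    print(name, round(radial_efficiency(players, name), 3))
```

In this toy data the two "star" players define the frontier and receive an efficiency of 1. The regular power hitter has the same output mix as the power star at 80% of the level, so his efficiency is about .8, which matches the interpretation given above. A radial measure compares along the player's own output mix rather than to the geometrically nearest point, which is one of several simplifications here.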

The output aggregator in this first approach provides a sensible efficiency measure: performance as a percentage of the best comparable player (see Sec.

