Six bills are before Congress to impose drug testing and penalties on Major League Baseball (MLB) and other professional sports (Kiele 2005). A congressional hearing was held this year at which Jose Canseco, Mark McGwire, and Rafael Palmeiro testified about steroid use in MLB. Many appear to believe that steroid use has caused an increase in home runs. Arguing that Palmeiro's suspension for testing positive for steroids has pushed the issue into the policy arena, Senators Stevens and McCain, together with Jim Bunning, the Hall of Fame pitcher who is now a senator, are cosponsoring drug-testing bills. The intent of Congress can be inferred from the title of the hearing: "Restoring Faith in America's Pastime."
Before we can reach any conclusions about the contribution of steroids to performance in professional baseball, we first must know something about home run hitting. What was home run hitting like before there were steroids? What is it like now that there is some evidence of steroid use? In a nutshell, the answer is that there are no differences; I will demonstrate that conclusion in this paper.
What then accounts for the rise in MLB home runs? There are more games now; both games and home runs have doubled. Home runs per player have not changed in over 40 yr; the statistics are the same and year-to-year changes are just part of the natural variation. What about the spate of new records? Intermittency is part of home run hitting. When Babe Ruth set records in the 1920s, they came in clusters too. The same thing happened when Roger Bannister broke the 4-min mile; others followed in a burst.
Hitting home runs is an extraordinary feat. Hitting many of them is like winning many PGA championships, several tennis Grand Slams, and multiple World Chess championships. It is of a piece with other high accomplishment as measured by a scientist's citations in the scientific literature, recognition and productivity in the arts, and box office revenues in the movies. I show that the statistical law of home run hitting is the same as the laws of human accomplishment developed by Lotka (Nicholls 1926), Pareto (Pareto 1897), Price (Price 1963), and Murray (Murray 2003). My generalization of these laws is a stable probability distribution with a finite mean and an infinite variance. This makes it a "wild" statistical distribution, far different from the normal (Gaussian) distribution that people are tempted to use in their reasoning about home runs and most other things. Things are not so orderly in home runs; they are rather more like the movies (De Vany 2003) or earthquakes (Samorodnitsky and Taqqu 1994) than dry cleaning.
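To see what an infinite variance means in practice, the following sketch draws samples from a Pareto distribution with tail index 1.5, which has a finite mean (equal to 3) but no finite variance. The Pareto law is only a convenient heavy-tailed stand-in, not the stable distribution estimated in this paper, and the names `running_sd`, `alpha`, and `sizes` are illustrative assumptions. In runs of this kind the sample standard deviation typically keeps drifting as the sample grows instead of settling toward a limit.

```python
import random
import statistics


def running_sd(alpha, sizes, seed=0):
    """Sample SD of the first n Pareto(alpha) draws, for each n in sizes."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    draws = [rng.paretovariate(alpha) for _ in range(max(sizes))]
    return {n: statistics.pstdev(draws[:n]) for n in sizes}


# Tail index 1.5 (assumed for illustration): mean = alpha / (alpha - 1) = 3,
# but the variance is infinite because alpha <= 2.
alpha = 1.5
sizes = [100, 1_000, 10_000, 100_000]

for n, sd in running_sd(alpha, sizes).items():
    print(f"n = {n:>7}: sample SD = {sd:.2f}")
```

Repeating the run with a tail index above 2, where the variance is finite, gives a useful contrast: there the sample SD should stabilize as the sample size increases.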
In 1961 there were 2,730 MLB home runs hit in 1,430 games with 1.909 home runs per league game, 0.041 home runs per player game, and a maximum of 61 home runs by a single player. (1) Forty years later, in 2001, 5,458 home runs were hit in 2,429 games with 2.247 home runs per game, 0.0413 home runs per player game, and a maximum of 73 home runs by a single player. Home runs per at bat are the same in both years, 0.075 for players with 200 or more at bats. (2) Home runs per hit are 0.110 in 1961 and 0.125 in 2001, both well within a standard deviation of the 40-yr average. Babe Ruth's record was exceeded in both years.
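The per-game rates quoted above follow directly from the season totals. A minimal check, using only the home run and game counts given in the text (the dictionary layout and the name `hr_per_game` are illustrative assumptions):

```python
# Season totals as quoted in the text.
seasons = {
    1961: {"home_runs": 2730, "games": 1430},
    2001: {"home_runs": 5458, "games": 2429},
}


def hr_per_game(season):
    """League-wide home runs divided by games played."""
    return season["home_runs"] / season["games"]


rates = {year: round(hr_per_game(s), 3) for year, s in seasons.items()}

for year, rate in rates.items():
    print(f"{year}: {rate:.3f} home runs per game")
# 1961: 1.909 home runs per game
# 2001: 2.247 home runs per game
```

The per-player-game and per-at-bat figures cannot be recomputed this way, because they depend on player-level counts that are not reported in this passage.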
The annual variation in home runs is driven by the great performances of a few players, as in the Maris/Mantle/Gentile year of 1961 (61, 54, and 46 home runs, respectively), the McGwire/Sosa years of 1998 (70 and 66, with 56 from Griffey) and 1999 (65 and 63, with 45 from Vaughn), or the Bonds/Sosa/Thome year of 2001 (73, 64, and 49). Bonds, McGwire, and Sosa are truly exceptional. Their hitting is 10 standard deviations above the mean (a caution: home run hitting does not follow a normal distribution, so the SD here is a statistic of the sample, not a property of the distribution itself). Relative to hitters with 200 or more at bats, their performances are about 7 standard deviations above the mean. …