The enterprise of sports has risen in popularity all over the world and along with it the study of sporting events from various angles. With participation increasing from all quarters of life, cutting across race, gender and other distinguishing human characteristics, the importance of the economics of various sports has also increased dramatically. The revenues associated with various popular sports such as baseball, soccer, football and basketball, to name a few, run into billions of dollars. The scientific study of sports ranges from sports medicine, to various human characteristics of physiology, morphology, and biochemistry of the body suitable for various sports. Special issues and sections of journals have been dedicated to the study of sporting events.
Recently, Stephen Jay Gould (1986), in many of his popular essays, has examined the progress of the game of baseball, particularly with regard to hitting, from an evolutionary biologist's perspective. A theory in evolutionary biology states that stable systems display decreasing variability. Thus improvement of a system (for the species as a whole) is typically displayed not in linear growth in the average of some desirable attribute but rather in the declining deviation (from the average) of that attribute. This idea that more is not necessarily better is somewhat counterintuitive when encountered for the first time. However, if we view not the individual (and its excellence) but rather the group (and its ability to specialize for a firm grip on a niche for existence), then the declining variation is the proper measure of improvement. Gould's intriguing work on team sports such as baseball displays the broader theories of evolutionary biology. Does data from professional basketball yield similar conclusions? We explore this question. The present work involves an examination of data for the National Basketball Association (NBA) over its entire history. Gould's work is reviewed and a brief history of the NBA is also given. The statistical analysis looks at the distribution of several scoring statistics, trends in those statistics, and finally how the great players have fared over time and against each other. The analysis here is kept simple because the focus is primarily on understanding the evolutionary history of the game and not on statistical methods per se. The final section gives our summary and conclusions.
A Brief Summary of Gould's Work on Baseball
In baseball, hitting for an average of .400 (number of hits / number of at bats) is considered a great feat, one that has not been duplicated since 1941. Some summary measures of historical data for professional baseball for the twentieth century are given in Table 1. (This table has been extended by us to include the 1990s.) The table shows that only two players have reached the 0.400 mark and only two others, George Brett (0.390 in 1985) and Tony Gwynn (0.394 in 1994, though based on only half the season because of a strike by major league players), came close to that mark. We know the standards of baseball, and for that matter all modern sports, have undergone dramatic improvement in this century. How, therefore, are we to account for the apparent decline in hitting, or, to state it in a slightly different form, the disappearance of the 0.400 hitter?
Gould's findings can be summarized as follows. The batting average for all players in a given season is computed, and this average remains relatively constant from year to year. The batting average of all players over all years (about 90 years) is about 0.267, with some minor variation. The standard deviations of batting averages, on the other hand, have decreased steadily and surely in an almost law-like fashion over baseball's history, starting from 0.0371 and declining to 0.0317 (see Table 1). In this light, we calculate the standardized scores of the four players, given below:
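\[
z_{\text{Cobb}} = \frac{0.420 - 0.266}{0.0371} = 4.2, \qquad z_{\text{Williams}} = \frac{0.406 - 0.267}{0.0326} = 4.3,
\]
\[
z_{\text{Brett}} = \frac{0.390 - 0.261}{0.0317} = 4.1, \qquad z_{\text{Gwynn}} = \frac{0.394 - 0.270}{0.0316} = 3.92
\]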
Table 1
Summary Statistics of Professional Baseball Players

Decade   M       SD       Highest Average
1910     0.266   0.0371   0.420 (Ty Cobb, 1911)
1940     0.267   0.0326   0.406 (Ted Williams, 1941)
1980     0.261   0.0317   0.390 (George Brett, 1985)
1990     0.270   0.0316   0.394 (Tony Gwynn, 1994)
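The standardized scores can be rechecked with a few lines of code. The following is a minimal sketch, assuming the means and standard deviations quoted in the text; the names `TABLE_1`, `z_score` and `Z_SCORES` are illustrative and not part of the original analysis.

```python
# Rows are (player's average, league mean, league SD) as quoted in the text.
# The 1910 SD is taken as 0.0371, the value used in the z-score calculation.
TABLE_1 = {
    "Cobb (1911)": (0.420, 0.266, 0.0371),
    "Williams (1941)": (0.406, 0.267, 0.0326),
    "Brett (1985)": (0.390, 0.261, 0.0317),
    "Gwynn (1994)": (0.394, 0.270, 0.0316),
}


def z_score(value, mean, sd):
    """Distance of a value from the mean, measured in standard deviations."""
    return (value - mean) / sd


Z_SCORES = {name: z_score(*row) for name, row in TABLE_1.items()}

for name, z in Z_SCORES.items():
    print(f"{name}: z = {z:.2f}")
```

All four scores fall within a narrow band of roughly 3.9 to 4.3, which is the sense in which the best hitters of different eras are comparable once the shrinking spread is taken into account.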
The .400 hitter has indeed disappeared, though the best of the past and the best of the present are similar. This disappearance of the 0.400 hitter is not a decline in the quality of baseball. Rather, it is an improvement! What has improved is the average ability of all players, as expressed by the declining standard deviations. One must appreciate that because the overall standard of play has improved, it is now difficult to excel in hitting (at the 0.400 level). If we looked only at the averages over time, we would fail to appreciate the deeper significance of the story lying underneath.
What can we say about scoring and other statistics for an evolving system like the National Basketball Association (NBA)? This paper takes a broad view of the entire history of the NBA and analyzes the relevant data from both a statistical and a systems perspective.
A Brief History of the NBA
The history of the National Basketball Association can be readily broken down into three periods: 1946-1950, the 1950s through the early 1960s, and the time from the mid-1960s to the present.
During the early years, 1946-1950, teams moved frequently and at times represented more than one city. Many of the initial teams did not continue to the present time. In fact, 1970 was the first year in which there were more surviving teams than teams that had failed. Many of the players on the disbanded teams never returned to the NBA. This phenomenon resulted in more rookies and fewer experienced players participating in NBA games. Many of these players, all of whom were white, had little experience playing by the NBA's often modified rules.
From the 1950s through the early 1960s, a number of major changes took place in the NBA that served to lay the foundation for the modern system of teams and rules. Teams settled permanently in major cities and a number of new teams arrived on the scene. Failures of existing teams declined dramatically as the era progressed, with the last NBA failure occurring in 1955. Minor changes in the rules of the game have been instituted throughout the history of the game, primarily to increase fan interest (Jares, 1971; Hill & Baron, 1988; Pluto, 1992).
The modern era of the NBA began in the mid-1960s. After this time, rule changes were fewer and farther between, and those rules that were modified had more limited impact on the game.
What statistics are we going to look at? The most important item to be studied is the number of points scored per minute by players (points per minute, PPM). This is obtained by taking the number of points each player scored in a given year and dividing it by the number of minutes the player played during that year. Because teams only began keeping track of minutes played in 1951, it is impossible to determine PPM before this time. When appropriate, points per game (PPG) were also examined in conjunction with other statistics. All of the data come from Microsoft CD ROM (1994) and Sachare (1991).
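The PPM definition above can be sketched in code as follows. This is an illustration only: the function name `points_per_minute` and the season records are hypothetical and are not taken from the data sources used in this study.

```python
def points_per_minute(points, minutes):
    """Season points divided by season minutes played.

    Returns None when minutes are missing or zero, as is the case for
    seasons before 1951, when minutes played were not recorded.
    """
    if not minutes:
        return None
    return points / minutes


# Hypothetical season lines, not real player data.
seasons = [
    {"player": "A", "points": 1200, "minutes": 3000},
    {"player": "B", "points": 450, "minutes": 1500},
    {"player": "C", "points": 800, "minutes": None},  # minutes not recorded
]

ppm = {s["player"]: points_per_minute(s["points"], s["minutes"]) for s in seasons}
print(ppm)
```

Returning None rather than zero for a season without recorded minutes keeps such seasons out of any later averages instead of biasing them downward.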
Statistical Analysis of Data
Distribution of the Data
We begin by looking at the distribution of the data. In particular, we are interested in the normality of the data, since so many of our conclusions depend on it. We studied both the histogram and the Normal Score (NScore) plot of the average PPM for various years. In particular, we examine the years 1951 and 1993, representing, respectively, the early and present eras of the history of the NBA. The results for 1971, representing the middle era of the NBA, are also briefly reported. Summary statistics for these three years are given in Table 2.
Table 2
Points Per Minute of NBA Players for Three Selected Years: Early, Middle and Present

Year   Mean    Mdn     SD       IQR
1951   0.328   0.321   0.0863   0.1128
1971   0.428   0.422   0.1112   0.1563
1993   0.399   0.398   0.1112   0.1418
The histogram of the 1951 data is given in Figure 1A. The distribution for this year is skewed to the right. Figure 1B is an NScore plot for the same data. Its lack of normality is apparent because a straight-line fit is poor, especially at the tails of the distribution.
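For readers who wish to reproduce this kind of check, the following is a minimal sketch of how normal scores might be computed. It assumes that an NScore plot is a normal probability plot and that Blom's plotting positions, (i - 3/8)/(n + 1/4), are used; both are our assumptions, since the exact convention of the software is not stated here. The correlation summary and the sample values are illustrative only.

```python
from statistics import NormalDist, mean


def normal_scores(n):
    """Expected standard-normal quantiles for n ordered observations.

    Uses Blom's plotting positions (an assumption, see the lead-in).
    """
    nd = NormalDist()
    return [nd.inv_cdf((i - 0.375) / (n + 0.25)) for i in range(1, n + 1)]


def nscore_correlation(values):
    """Pearson correlation between the sorted data and their normal scores.

    Values near 1 mean the NScore plot is close to a straight line.
    """
    xs = sorted(values)
    qs = normal_scores(len(xs))
    mx, mq = mean(xs), mean(qs)
    sxy = sum((x - mx) * (q - mq) for x, q in zip(xs, qs))
    sxx = sum((x - mx) ** 2 for x in xs)
    sqq = sum((q - mq) ** 2 for q in qs)
    return sxy / (sxx * sqq) ** 0.5


# Hypothetical PPM samples, not NBA data.
symmetric = [0.25, 0.30, 0.35, 0.38, 0.40, 0.42, 0.45, 0.50, 0.55]
right_skewed = [0.25, 0.26, 0.27, 0.28, 0.30, 0.33, 0.40, 0.55, 0.90]

print(nscore_correlation(symmetric))
print(nscore_correlation(right_skewed))
```

A right-skewed sample bends away from the straight line in the upper tail, so its correlation is noticeably lower than that of the symmetric sample.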
The histogram for the 1993 PPM scores is much more normal (Figure 1C). The 1971 histogram displays considerably more symmetry than the 1951 data, though it is somewhat left skewed. By 1993 the data are almost perfectly normal. The NScore plots of the 1971 and 1993 data bear out these observations: the plot for 1971 shows considerable deviation from the straight line, while the plot for 1993 is virtually linear (Figure 1D). To conserve space, we omit the figures for 1971, but the general conclusion is that the 1971 data are more normal than the 1951 data but less so than the 1993 data.
A number of trends in player scoring are observable in the history of the NBA. Figure 2A depicts the box plots of points per minute during each of the years for which data are available. The summary statistics for the three years 1951, 1971 and 1993 are given in Table 2. (Statistics that we cite later may not always be found in Table 2 but can be obtained from the authors on request.) Clearly one can infer two distinct trends in this series of box plots. During the 1951-1965 period, the median points per minute steadily increased from 0.312 PPM to 0.453 PPM, an increase of 45%. This trend reversed itself with a 12% decline over the next 28 years.
The interquartile ranges, which can be seen from the widths of the box plots, show graphically the variation in players' skill during those years. A larger range occurs when players possess a wider range of ability. When players are more uniform in ability, the range is smaller. The IQR grew by 56% between 1951 and 1963 and then shrank by 19% between 1963 and the present.
The change in the median points per game is especially dramatic. In 1946 a player on average scored 4.08 points per game (PPG). This increased to 8.8 PPG in 1966 and then declined to 5.84 PPG in 1993. This rise of 116%, followed by a fall of 34%, paints a picture of dramatic increase followed by a period of steady decline. The interquartile range confirms this, with a rise of 202% followed by a drop of 49% over a similar period.
The average number of points scored by players in each game is also of interest. This statistic follows a pattern similar to that of the points per minute statistic discussed above. While the average player did steadily increase the number of points he scored during the early years of the game, the later years show a system in steady decline, as the average player is increasingly unable to score as many points as the year before. Figure 2B illustrates this with a continuing downward trend in mean points per game during the last thirty years.
We now report on the variability of scoring as measured through points per minute. A robust measure of variability is the interquartile range (IQR), which is plotted (along with a smooth of the same) in Figure 3A. We observe low variability in the early years and increased variability in the middle years, while an intermediate level is observed in the modern era of professional basketball. We next examine the variability of the scores through the more commonly used (though less robust) measure of standard deviation. The plot is given in Figure 3B. From the start of the series through 1961, the standard deviation among NBA players more than doubles. After 1961, this slope reverses itself and begins a continuing decline through the middle era (mid-1970s), followed by a flattening out that persists today. The low variability in the early years reflects the performance of a small group of (all white) players of similar ability, the increased variability in the middle era shows the entrance of a large number of talented black players, and finally the leveling off reflects the presence of a group of players who are more evenly matched.
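The percentage changes quoted above follow from the usual relative-change formula. The sketch below rechecks them using only figures stated in the text; the function name `percent_change` is illustrative.

```python
def percent_change(old, new):
    """Relative change from old to new, expressed in percent."""
    return (new - old) / old * 100.0


ppm_rise = percent_change(0.312, 0.453)  # median PPM, 1951 to 1965
ppg_rise = percent_change(4.08, 8.8)     # median PPG, 1946 to 1966
ppg_fall = percent_change(8.8, 5.84)     # median PPG, 1966 to 1993

print(f"PPM rise: {ppm_rise:.1f}%")
print(f"PPG rise: {ppg_rise:.1f}%")
print(f"PPG fall: {ppg_fall:.1f}%")
```

Note that the rise and the fall are measured against different bases, so a 116% rise followed by a 34% fall still leaves the 1993 median well above its 1946 level.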
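The sense in which the IQR is more robust than the standard deviation can be seen in a small sketch. The sample values below are hypothetical, and the quartile convention is the default of Python's `statistics.quantiles`, which may differ slightly from the one used for the figures in this paper.

```python
from statistics import quantiles, stdev


def iqr(values):
    """Interquartile range: third quartile minus first quartile."""
    q1, _, q3 = quantiles(values, n=4)
    return q3 - q1


# Hypothetical PPM values for seven players, not NBA data.
base = [0.30, 0.35, 0.40, 0.45, 0.50, 0.55, 0.60]
# The same group with the top scorer replaced by an extreme outlier.
with_outlier = [0.30, 0.35, 0.40, 0.45, 0.50, 0.55, 1.20]

print(iqr(base), stdev(base))
print(iqr(with_outlier), stdev(with_outlier))
```

One extreme scorer leaves the IQR unchanged but inflates the standard deviation, which is why the two measures are examined side by side.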
The Great Players
Thus far, this study has concentrated on the entire NBA population. Since the size of this population grew dramatically during the period under study, it is worthwhile to view a subset of this group to see if similar patterns are observable in their data. The group discussed below consists of the top 15 players for each year, as measured by the average number of points each scored per game for the given season. It turns out that a player who scored 0.75 PPM or above for most seasons qualifies for this list. Since the conclusions for this subset are very similar to those for the whole group, we avoid giving tables and figures and simply state the facts (the details are available from the authors).
The standard deviation of the top 15 players breaks into the three periods discussed earlier. The best players were fairly consistent during the early years of the game. However, this uniformity disappears in the 1950s and early 60s, when the standard deviation climbs above 8. With the onset of the modern era the deviation returns to its earlier levels around 3.
While the number of players accomplishing the feat of about 0.75 PPM has grown in recent years, Wilt Chamberlain remains the only player to average more than 1 point per minute for an entire season (1961). The greatest of the great stand out by this measure in a spectacular fashion. Wilt Chamberlain, Jerry West, Kareem Abdul-Jabbar, George Gervin and finally Michael Jordan each came to dominate the game in their own time.
Finally, we look at the data for the scoring champions (highest points per game) for the entire history of the NBA. Figure 4A is a plot of the maximum points per game. The figure reveals an increasing trend in the early years, instability in the middle and then finally a slow reversal.
How have "the greats" progressed (or regressed) over time, and how have they fared against each other? This is similar to the question that Gould raised for baseball, which was summarized at the beginning of the paper. The Z-statistic for the scoring leader of each year is plotted in Figure 4B. The Z-statistic for a scoring leader in a given year was computed by dividing the difference between the leader's points per game and the average points per game by the standard deviation of points per game for all players (similar to the way it was done for hitting in baseball). The plot reveals very similar information, in that the Z-scores improved in the early years of the NBA, steadied in the middle years and have even fallen in recent years. One startling difference from baseball, though, is that while baseball's greatest hitters had Z-scores in the high 3's and low 4's, basketball's greatest scorers have had Z-scores in the 10's. The only exception, of course, is the great Wilt Chamberlain, who had a Z-score of 29.7 in 1961 (in fact, Chamberlain's Z-scores were consistently in the 20's).
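In symbols, the Z-statistic described above can be written as follows. The notation is introduced here for illustration only: $x_{i,t}$ is the points per game of player $i$ in season $t$, $\bar{x}_t$ is the average points per game of all players in that season, and $s_t$ is the corresponding standard deviation.

\[
Z_t = \frac{\max_i x_{i,t} - \bar{x}_t}{s_t}
\]

This is the same standardization applied to batting averages earlier, with the scoring leader taking the place of the leading hitter.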
The Early Years
Taken together, the analyses presented above support a number of conclusions. Clearly, the early years were a time when player ability increased across the board. Scoring, in the form of points per minute and points per game, rose steadily during this period, while the amount of variation between players, in the form of standard deviation and IQR, remained level.
This is not a surprising finding as professional basketball was a fairly new sport, with relatively inexperienced players and coaches. The learning curve was much flatter at that time. As participants gained experience, their abilities improved quickly. Increased publicity also caused talented players to be attracted to the sport. As players either improved or dropped out of the league, the overall quality of play improved.
The 1950s and Early 60s
Because of the newness of the sport when the NBA was formed, the rule book was the subject of constant tinkering. The most prominent of the rule changes was the institution of the 24-second rule. The NBA regulators clearly intended this rule, and others, to reduce fouling as a means of taking possession of the ball. However, requiring teams to attempt a shot within 24 seconds of taking possession of the ball also increased scoring. This can be seen in the chart of points per minute, which climbs dramatically in 1954, the year of the rule's inception.
The 1950s also saw the integration of the NBA, beginning with the Boston Celtics' signing of Chuck Cooper. By allowing blacks to play, the NBA opened up the pool of potential players to a large number of gifted athletes who had previously been excluded. As current, below-average white players were replaced by new and superior black athletes, the scoring ability of the average player, and of teams, rose further.
This period of change is characterized by rising scores and a dramatic increase in player-to-player variation. This is especially evident in the standard deviation of the top 15 players, which more than doubled during this period.
The Modern Era
As basketball entered the modern era, its worldwide level of popularity increased dramatically, enlarging the number of basketball players from which the NBA could draw. The internationalization of the game expanded recruiting to an increasing number of foreign countries, allowing the best players in the world to play for the NBA. Coaching staffs, both in the NBA and the various college leagues which fed players into the NBA, had the benefit of an increasing number of years of experience. When these facts are taken together, one would expect a steady improvement in the number of points scored per game, but this was not the case.
Beginning in the mid-1960s, basketball expanded steadily into new markets, with the number of teams almost tripling between 1964 and the present. While too dramatic an expansion would certainly have caused player quality to diminish, as demand outstripped supply, this expansion proceeded at fairly regular intervals. Because of the size of the league, the new players needed represented an increasingly small percentage of the number of current players. This was bolstered by the decreasing variability in the quality of the players, as measured by their declining standard deviation and diminishing IQR.
Declining Means and Diminishing Variability
Why, then, is a player consistently scoring fewer points on average than in preceding years, even while the difference between the quality of players declines? This is very similar to the situation Gould observed for the disappearance of the .400 hitter. Apparent decline is really a sign of improvement, as improved competition has lowered scores! So the conclusions made about baseball hold fairly well for professional basketball also. True, the lack of an absolute standard such as hitting 0.400 may make the case less dramatic, but all the signs of an evolving system are evident in basketball as well.
Gould, S.J. (1986, August). Entropic homogeneity isn't why no one hits .400 any more. Discover, pp. 60-66.
Hill, R. & Baron, R. (1988). The Amazing Basketball Book, The First 100 Years. New York: Devyn Press.
Jares, J. (1971). Basketball, The American Game. New York: Follett Publishing Company.
Microsoft (1994). Microsoft's Complete NBA Basketball [CD ROM].
Pluto, T. (1992). Tall Tales. New York: Simon and Schuster.
Sachare, A. (Ed.). (1994). The Official NBA Player Directory. New York: Villard Books.…