# Choice Models for Predicting Divisional Winners in Major League Baseball

By Barry, Daniel; Hartigan, J. A. | Journal of the American Statistical Association, September 1993

1. INTRODUCTION

Major league baseball is played by 26 teams divided into two leagues, the American League and the National League. The American League (AL) has two divisions, the AL East and the AL West, each consisting of 7 teams. The National League (NL) has two divisions, the NL East and the NL West, each consisting of 6 teams. Each of the 26 teams plays 162 games in a season against teams in its own league. At the end of the regular season, the two division winners in each league (the teams with the best record in their divisions) play one another in a best-of-7-game series to decide the league champion. The two league champions then meet in a best-of-7-game series called the World Series.

The data for a particular season is available from the weekly magazine The Sporting News, which publishes a schedule of all games in early March and publishes the results of all games from the previous week during the season. Summary data for the 1991 National League season appears in Table 1. The table presents the results of home and away games for each pair of teams in four 6-week periods.

We will estimate, for each team, the probability of winning its division, given the results of all games played up to a certain date and the list of remaining games for each team. We wish to allow for differing strengths for each team, for differing home advantages, and for changing strengths over time. It will be seen from Table 1, for example, that Atlanta had a poor record before the All-Star break (39 wins in 79 games) and a good record after the break (55 wins in 83 games).

We use a choice model for predicting the outcome of each game. The parameters of the model depend on the teams involved, on which team is playing at home, and on time. Markov chain sampling is used to simulate the outcomes of future games and so predict the eventual division winners. We argue that it is necessary to allow for changing team strengths with time, because some teams appear to change noticeably over a season. We present a variety of predictions of outcomes for the 1991 National League season. One prediction, at a certain point in the season when Atlanta was 2 games behind the Dodgers in number of wins, was that Atlanta had a better chance of winning the division than did the Dodgers. This occurred because Atlanta appeared stronger in the second half of the season and that strength was projected to the remaining games. In general, allowing for changing strengths encourages more conservative probability estimates; that is, the team with the best record has a lower probability of winning under the changing strength model than under a fixed strength model, because it is likely to have lower future strength than its record indicates.
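The idea of simulating the remaining schedule to obtain division-winning probabilities can be illustrated with a minimal sketch. This is not the authors' model: it assumes fixed team strengths, a logistic (Bradley-Terry style) choice probability with a single common home advantage, and plain Monte Carlo simulation rather than Markov chain sampling over time-varying parameters. The function names, the parameter `home_adv`, the random tie-break, and all numbers in the usage example are hypothetical.

```python
import math
import random
from collections import Counter


def win_prob(home_strength, away_strength, home_adv):
    """Logistic choice probability that the home team wins one game."""
    return 1.0 / (1.0 + math.exp(-(home_strength - away_strength + home_adv)))


def simulate_division(wins, strengths, remaining, home_adv=0.1,
                      n_sims=10000, seed=0):
    """Estimate each team's probability of finishing with the most wins.

    wins      -- dict: team -> wins so far
    strengths -- dict: team -> strength on the logit scale (assumed fixed)
    remaining -- list of (home_team, away_team) games still to be played
    """
    rng = random.Random(seed)
    titles = Counter()
    for _ in range(n_sims):
        totals = dict(wins)
        # Play out every remaining game once.
        for home, away in remaining:
            p = win_prob(strengths[home], strengths[away], home_adv)
            winner = home if rng.random() < p else away
            totals[winner] += 1
        # The division goes to the team with the most wins;
        # ties are broken at random (an assumption of this sketch).
        best = max(totals.values())
        leaders = [team for team, w in totals.items() if w == best]
        titles[rng.choice(leaders)] += 1
    return {team: titles[team] / n_sims for team in wins}


if __name__ == "__main__":
    # Hypothetical two-team race: B leads by 2 wins, but A is rated stronger.
    wins = {"A": 80, "B": 82}
    strengths = {"A": 0.4, "B": 0.0}
    remaining = [("A", "B")] * 3 + [("B", "A")] * 3
    print(simulate_division(wins, strengths, remaining))
```

In a sketch like this, a trailing team can come out as the favorite when its assumed strength is high enough, which is the kind of effect described above for Atlanta and the Dodgers. Allowing strengths to change over time, as the article proposes, would require replacing the fixed `strengths` dictionary with draws from a model of the strengths' evolution.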

Of course, other variables influence the probability of winning. In baseball, the starting pitcher has a substantial effect on the final outcome of each game. We judged that this effect would average out in a relatively short period. Starting pitchers work in a regular rotation, so that the total number of wins in a couple of weeks could be predicted from an estimate of the average ability of the team's starting pitchers. Collecting and analyzing the data for the rather large number of starting pitchers would be a formidable task. We suspect that including this variable would change the estimates for particular games quite a bit, and so the predictions as the season draws to a close (with 4 or 5 games remaining for each team) would be substantially affected. Toward the end of the season, you need to know who is pitching.

Point spreads (the differences in scores between the contending teams) were examined for the National Football League by Harville (1980). Harville predicted future point spreads using past point spreads observed over several seasons. The expected point spread in any game is the difference between a parameter for the home team and one for the away team; these parameters are constant within each year and vary from year to year according to an autoregressive process with lag 1. …
