Illustrating the regression effect by a sucker bet

The regression effect is a standard part of freshman statistics, and the Wikipedia article gives a textbook-style account. Neither that article nor most textbooks mention that one readily available kind of data can be used to check a case of this general theory.

Take a sport where teams play in leagues and have a "final standing" each year, typically the proportion of games won, in which case the average over all teams must be 0.5. The regression effect predicts that, for a team with above average performance this year, say a final standing of 0.6, its final standing next year is likely to be less than this year's 0.6.

(Analogously, for a team with below average performance this year, say a final standing of 0.4, its final standing next year is likely to be more than this year's 0.4.)
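The effect can be reproduced in a toy simulation. The sketch below is not based on the data in this article: it assumes each team has a fixed "true strength" (drawn, hypothetically, from a normal distribution around 0.5), that every game is an independent coin flip with that strength as the win probability, and that nothing changes between seasons. The function names and default parameters are illustrative choices.

```python
import random

def simulate_season(strengths, games, rng):
    """Return each team's proportion of games won in one simulated season.

    Simplifying assumption: each game is an independent coin flip won with
    probability equal to the team's fixed strength (no real schedule, so the
    league average is only approximately 0.5).
    """
    return [sum(rng.random() < p for _ in range(games)) / games for p in strengths]

def top3_decline_rate(n_leagues=500, n_teams=30, games=82, seed=0):
    """Fraction of this-year top-3 teams whose standing is lower next year."""
    rng = random.Random(seed)
    declines = total = 0
    for _ in range(n_leagues):
        # Hypothetical true strengths, held fixed across both seasons.
        strengths = [min(max(rng.gauss(0.5, 0.1), 0.05), 0.95) for _ in range(n_teams)]
        this_year = simulate_season(strengths, games, rng)
        next_year = simulate_season(strengths, games, rng)
        top3 = sorted(range(n_teams), key=lambda i: this_year[i], reverse=True)[:3]
        for i in top3:
            total += 1
            declines += next_year[i] < this_year[i]
    return declines / total

if __name__ == "__main__":
    print(top3_decline_rate())
```

Even though no team's strength changes in this model, the top 3 teams should decline more often than not, because a top finish usually reflects some good luck on top of real strength, and the luck does not carry over.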

This effect will be most noticeable for the best and worst teams of the year. The table shows data over 25 years. If we had made this prediction each year for each of the top 3 teams, or for each of the bottom 3 teams, the final column shows what proportion of these 75 predictions would have been correct.

"Games" is the number of games per season; "Teams" is the number of teams in the league.

Sport                     Games  Teams  Predictions for  Proportion correct
U.S. Professional
Hockey                       82     30  Top 3            72%
Hockey                       82     30  Bottom 3         79%
Football                     16     32  Top 3            83%
Football                     16     32  Bottom 3         83%
Basketball                   82     30  Top 3            77%
Basketball                   82     30  Bottom 3         85%
Baseball                    162     30  Top 3            66%
Baseball                    162     30  Bottom 3         85%
European Soccer
U.K. Premier                 38     20  Top 3            60%
Italy. Serie A               38     20  Top 3            68%
Spain. Primera               38     20  Top 3            60%
Germany. Bundesliga          34     18  Top 3            71%
Portugal. Europa             30     16  Top 3            53%
France. Ligue 1              38     20  Top 3            71%
Netherlands. Eredivisie      34     18  Top 3            64%
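A tally of this kind could be computed along the following lines. This is a minimal sketch, not the procedure actually used to build the table: the function name `proportion_correct`, the input format (one dictionary per season mapping team name to final standing), and the handling of ties and of teams missing from the following season are all assumptions, and the standings in `toy` are made up.

```python
def proportion_correct(seasons, k=3, group="top"):
    """Score the regression prediction across consecutive seasons.

    `seasons` is a list of dicts in chronological order, each mapping
    team name -> final standing (proportion of games won).
    For the top k teams we predict a lower standing next year; for the
    bottom k teams, a higher one. An unchanged standing counts as a miss,
    and teams absent from the next season are skipped.
    """
    correct = total = 0
    for this_year, next_year in zip(seasons, seasons[1:]):
        ranked = sorted(this_year, key=this_year.get, reverse=(group == "top"))
        for team in ranked[:k]:
            if team not in next_year:
                continue
            total += 1
            if group == "top":
                correct += next_year[team] < this_year[team]
            else:
                correct += next_year[team] > this_year[team]
    return correct / total if total else float("nan")

# Made-up standings for a four-team league, for illustration only.
toy = [
    {"A": 0.70, "B": 0.55, "C": 0.45, "D": 0.30},
    {"A": 0.60, "B": 0.58, "C": 0.42, "D": 0.40},
    {"A": 0.62, "B": 0.50, "C": 0.48, "D": 0.45},
]
print(proportion_correct(toy, k=1, group="top"))     # 0.5
print(proportion_correct(toy, k=1, group="bottom"))  # 1.0
```

With `k=3` and 25 consecutive pairs of seasons, each call scores 75 predictions, matching the count described above.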

Why does this imply a sucker bet? A self-proclaimed expert in the sport who has not paid attention to freshman statistics might have some specific information, such as the recruitment of a top player in the off-season, that leads them to believe a top team's performance is likely to improve in the upcoming season. This is of course possible, but any such improvement is more likely than not to be outweighed by the regression effect. Anyone who had bet on improvement for the top 3 teams would have lost more often than won in every league in the table.
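To put a rough number on the bet, suppose (as an assumption; the article does not specify odds) that the wager is at even money: you stake one unit that a top-3 team's standing will fall, and win or lose one unit. Treating a historical proportion correct from the table as the win probability, the expected profit per unit staked is p - (1 - p) = 2p - 1. The helper name below is hypothetical, and the chance of an exactly unchanged standing is ignored.

```python
def expected_profit_per_unit(p_win):
    """Expected profit per unit staked on an even-money bet won with probability p_win."""
    return p_win * 1 + (1 - p_win) * (-1)

# Proportions taken from the "Top 3" rows of the table.
for league, p in [("Hockey, top 3", 0.72), ("Portugal, top 3", 0.53)]:
    print(f"{league}: {expected_profit_per_unit(p):+.2f}")
# Hockey, top 3: +0.44
# Portugal, top 3: +0.06
```

The edge is positive for every row of the table, though it is small in the leagues where the proportion correct is close to one half.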


Data collected by Tung Phan, Fall 2009.