With forecasting systems mainly based on past performances, understanding regression to the mean is crucial for sports bettors. The 2015/16 EPL season has had no shortage of surprises so far, raising the question: are extreme outcomes sustainable? Here’s what statistics have to say about it.
When dealing with random, or mostly random, systems, variables that are more extreme on an initial measurement show a tendency to be less extreme on a second measurement. This phenomenon is called regression toward the mean.
Leicester’s performance during the first half of the 2015/16 Premier League season, for example, might earn it a higher team rating than Chelsea, who have performed far worse over the same period relative to expectation. But if much of what contributed to their respective ratings arose from chance, regression to the mean implies that those ratings might not be sustainable going forward.
Measuring team performance
One way to measure the performance of a team is to see how it has performed relative to market expectation. For example, if the odds of a team winning are 2.00, this implies that the market believes it has a 50% chance of victory (discounting the influence of the bookmaker’s margin). If it wins, it has overperformed relative to market expectation; if it fails to win, it has underperformed.
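The conversion from odds to implied probability can be sketched in a few lines. This is a minimal illustration that, like the example above, ignores the bookmaker's margin; the function name `implied_probability` is our own and is not part of any Pinnacle tool.

```python
def implied_probability(decimal_odds: float) -> float:
    """Win probability implied by decimal odds, ignoring the bookmaker's margin."""
    if decimal_odds <= 1.0:
        raise ValueError("decimal odds must be greater than 1")
    return 1.0 / decimal_odds


# Odds of 2.00 imply an even chance of victory.
print(implied_probability(2.00))  # 0.5
```

In practice the margin means the implied probabilities of all outcomes in a market sum to slightly more than 1, so the figure returned here slightly overstates the market's true estimate.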
Such an approach is qualitatively similar to the Brier Score method, which measures how far outcomes deviate from the probabilities the odds imply. The main difference is that our approach measures the direction, as well as the magnitude, of the deviation from expectancy.
Let’s see how Leicester and Chelsea have performed relative to Pinnacle’s expectation over the first 20 games of the 2015/16 Premier League season. For every game a team wins, it receives a risk-adjusted score equal to [1 – 1/odds], whilst for every game it fails to win, it receives a score of [-1/odds].
As the season progresses, these scores are summed cumulatively. The tables below reveal that Leicester has performed far better than Pinnacle’s betting market expected, whilst Chelsea has performed far worse.
First 20 games for Chelsea
|Opposition||Date||Pinnacle Odds||Result||Profit/Loss||Cumulative Profit/Loss|
|Man City||16/08/15||3.87||No win||-0.26||-0.98|
|Crystal Palace||29/08/15||1.37||No win||-0.73||-1.31|
|West Ham||24/10/15||2.01||No win||-0.50||-2.62|
|Man United||28/12/15||2.95||No win||-0.34||-5.53|
First 20 games for Leicester
|Opposition||Date||Pinnacle Sports Odds||Result||Profit/Loss||Cumulative Profit/Loss|
|Man United||28/11/15||3.26||No win||-0.31||2.93|
|Man City||29/12/15||4.25||No win||-0.24||4.49|
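The scoring rule behind the Profit/Loss columns can be sketched as follows. The function names are hypothetical, and the sample uses only two of Chelsea's games from the table above, so the running total will not match the cumulative column, which also includes games not listed here.

```python
from itertools import accumulate


def risk_adjusted_score(decimal_odds: float, won: bool) -> float:
    """Score 1 - 1/odds for a win and -1/odds for any other result."""
    p = 1.0 / decimal_odds
    return (1.0 - p) if won else -p


def cumulative_scores(games):
    """Running total of scores for a sequence of (decimal_odds, won) pairs."""
    return list(accumulate(risk_adjusted_score(o, w) for o, w in games))


# Chelsea v. Man City (3.87) and v. Crystal Palace (1.37), both "No win".
games = [(3.87, False), (1.37, False)]
scores = [round(risk_adjusted_score(o, w), 2) for o, w in games]
print(scores)  # [-0.26, -0.73]
```

If the odds are a fair reflection of the win probability p, the expected score for a game is p(1 - p) + (1 - p)(-p) = 0, which is why a cumulative total far from zero represents a departure from market expectation.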
How much is performance explained by luck?
A question now arises: should we expect Leicester’s overperformance and Chelsea’s underperformance relative to market expectations to continue? If these trends were largely a consequence of causal factors like player ability and managerial style, then we might expect little regression back towards market expectation; at least not until the market had fully re-evaluated the teams’ new skill levels. If, on the other hand, they were largely a consequence of luck, regression towards the mean should be more rapid and complete.
To determine how much influence regression to the mean, and by implication luck, has on the outcome of soccer matches, we break our data into two halves - the first and second halves of a season - and compare the two. If regression to the mean is small, we would expect extreme performance in the first half to more readily correlate with similarly extreme performance in the second half.
That is to say, performance would show persistence. Alternatively, if regression to the mean is significant, extreme performance in the first half should show little correlation with extreme performance in the second half.
The chart below illustrates this correlation for English football teams from the Premier and Football Leagues over the 2012/13 to 2014/15 seasons. Each of the 276 data points depicts a first half-second half performance pair for each team during a single season. The dark line represents the average trend of the data points.
Correlation of 1st v. 2nd half season performance
As you can see, there is virtually no correlation and an almost perfect regression to the mean. The value of R² in a correlation plot like this measures the proportion of the variability in one variable that is accounted for by the variability in the other.
A figure of 1 implies perfect correlation whilst a figure of 0 implies no correlation at all. Here we can see that the variability in first half season performances explains virtually none of the variability in the second half season performances, implying there is no causal link between the two, and that deviation away from market expectation is essentially a matter of luck.
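For readers who want to run the same kind of check on their own data, here is a minimal sketch of how R² could be computed from paired first-half and second-half scores. It assumes a simple linear (Pearson) correlation, which the article does not spell out, and the sample numbers are invented purely to exercise the function.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length, at least 2 items")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return sxy / math.sqrt(sxx * syy)


def r_squared(xs, ys):
    """Share of the variance in ys accounted for by a straight-line fit on xs."""
    return pearson_r(xs, ys) ** 2


# Invented scores: the second half is unrelated to the first half.
first_half = [-1.0, 0.0, 1.0]
second_half = [1.0, 0.0, 1.0]
print(r_squared(first_half, second_half))
```

An R² near 1 would indicate that first-half performance carries over almost entirely into the second half; a value near 0, as in the chart, indicates that it does not.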
The 20 most underperforming teams had an average performance of -4.05 during the first half of a season. They regressed to an average of -0.01 for their second half performance, with only one team performing worse. Conversely, the 20 most overperforming teams in any first half season had an average performance of +3.71, which regressed to +0.13 in the second half. Again, only one team showed a more extreme second half performance compared to their first.
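To see what this pattern looks like when luck is the only factor, the following sketch simulates hypothetical team-seasons in which every game is priced at fair odds. The 276 team-seasons mirror the sample size above, but the 19-game halves, the range of win probabilities and the seed are assumptions made for illustration; none of this is the article's data.

```python
import random


def simulate_half(rng, n_games=19):
    """Sum of risk-adjusted scores over n_games priced at fair odds."""
    total = 0.0
    for _ in range(n_games):
        p = rng.uniform(0.2, 0.7)  # assumed true win probability (hypothetical range)
        won = rng.random() < p     # the result is decided by chance alone
        total += (1.0 - p) if won else -p
    return total


def simulate_seasons(n_teams=276, seed=1):
    """Return a (first_half, second_half) score pair per simulated team-season."""
    rng = random.Random(seed)
    return [(simulate_half(rng), simulate_half(rng)) for _ in range(n_teams)]


def mean(values):
    values = list(values)
    return sum(values) / len(values)


seasons = simulate_seasons()
top20 = sorted(seasons, key=lambda s: s[0], reverse=True)[:20]
bottom20 = sorted(seasons, key=lambda s: s[0])[:20]

for label, group in (("top 20", top20), ("bottom 20", bottom20)):
    first = mean(f for f, _ in group)
    second = mean(s for _, s in group)
    print("%s: first half %+.2f, second half %+.2f" % (label, first, second))
```

Because the two halves are independent draws in this simulation, the teams selected for extreme first halves should average close to zero in the second half, which is the same qualitative pattern as in the real figures.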
Perhaps a full season is too long a period over which to rate team performance. Much can happen in 38 games (or 46 in the case of the Football League), and expecting any meaningful persistence in performance between the first and second halves of a season may be wishful thinking.
Furthermore, it can be suggested that if the skill level of a team had genuinely changed, by the second half of the season the betting market would begin to reflect that in the odds, ensuring future deviation from expectation would be less extreme than before anyway, regardless of random processes.
Arguably, we could reduce that influence by considering a much shorter time frame, for example 12 games. Indeed, many quantitative forecasting systems operate over such shorter periods to provide indicators of performance for future games.
Unfortunately, the correlation between performance over the previous 6 games and the following 6 games is just as weak. From a total of 1,596 possible correlation pairs, R² was again 0.00 to two decimal places. Even over a period of just 12 games it would appear that how a football team performs relative to market expectation is almost entirely a matter of luck.
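One way such pairs could be built from a team's game-by-game scores is sketched below. The article does not state how its 1,596 pairs were constructed, so the sliding window that advances one game at a time is an assumption, as is the function name `window_pairs`.

```python
def window_pairs(scores, window=6):
    """Pair the total of the previous `window` games with the total of the next `window`."""
    pairs = []
    # i marks the first game of the "following" block.
    for i in range(window, len(scores) - window + 1):
        previous_total = sum(scores[i - window:i])
        following_total = sum(scores[i:i + window])
        pairs.append((previous_total, following_total))
    return pairs


# Twelve placeholder scores yield exactly one (previous 6, following 6) pair.
print(window_pairs(list(range(12))))  # [(15, 51)]
```

The two columns of pairs can then be correlated in the same way as the half-season scores. Note that overlapping windows reuse the same games, so pairs built this way are not independent of one another.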
Wisdom of the betting market
If we think about what a betting market like the one offered by Pinnacle actually represents, it will become clear why luck is such a major factor in the outcomes of football bets. We are not arguing that the outcome of a game is simply a matter of chance. Clearly, teams like Arsenal, Manchester City and Chelsea (most of the time) are better than teams like Norwich, Sunderland and Bournemouth, and their superior skill makes it more likely that they will win.
Rather, we are saying that the betting market, through the adjustment of betting odds, takes this differential in team ability into account. Teams perceived as having a greater probability of winning will typically attract a greater betting interest. Hence their odds will be shorter. In effect, the odds act as a kind of handicapping of the skill differential between the teams. Through this handicapping process, the outcomes become more a matter of luck.
When a crowd of bettors expresses opinions through their wagers about the likely outcome of a game, they often arrive at a fairly accurate assessment of the probability of that outcome, through a process akin to market clearing (where demand from backers is balanced by supply from layers). Unsurprisingly, this phenomenon is known as the Wisdom of the Crowd.
Pinnacle is known for some of the wisest and most efficient betting markets of all, which is one reason why it is able to offer the best prices with the smallest margins. A betting market, however, is not completely efficient all of the time. This is particularly so in the period just after it has opened, when relatively few bettors have expressed their opinions and mistakes in pricing might be found.
It is during this period that the sharpest bettors tend to make their wagers at odds that are frequently found to be longer than the closing lines. Nevertheless, outcomes for even the sharpest of bettors can still be dominated by luck in the short term. An awareness of the phenomenon of regression to the mean should help punters better understand the influences of luck and skill.