Cumulative forecast probabilities for the worked example below (Manchester United vs. Manchester City):

| | Home | Draw | Away |
|---|---|---|---|
| Probability | ph = 0.211 | pd = 0.245 | pa = 0.544 |
| Cumulative | ph = 0.211 | ph + pd = 0.456 | ph + pd + pa = 1 |
Pinnacle is renowned for its efficient betting markets, especially in popular betting sports like soccer. There has been a fair amount of analysis on how or why these markets are efficient, but are they getting even more efficient than before? Read on to find out.
Over the years writing for Pinnacle’s Betting Resources, readers will have become familiar with my talk of market efficiency. Put simply, efficiency is really just another word for accuracy. When we say a market of betting prices is efficient, what we mean is that those prices accurately represent the true probabilities of particular soccer match results.
For some this can be quite a hard concept to grasp. The problem is this: odds reflect probabilities, but results are binary; they are either winners (100%) or losers (0%). In this sense, all odds are inaccurate, since they never match the outcomes exactly. It takes some imagination to see that an observed outcome represents just one possible history of events, and which history we will actually see is hidden by the nature of uncertainty.
If we could re-run soccer matches an infinite number of times, we would see a distribution of outcomes, sometimes a home win, sometimes a draw and sometimes an away win. How well the implied probabilities reflect this distribution provides a measure of how accurate or efficient the underlying betting odds are.
How do we measure the accuracy of soccer markets?
Unfortunately, actual results are all we have to work with when trying to measure the accuracy of Pinnacle’s soccer betting markets. We’ll never know what the true underlying probabilities actually are. To measure how well the odds reflect actual outcomes we can use what is called a scoring rule, which measures the accuracy of probabilistic predictions.
Pinnacle has already published a Betting Resources article on one well-known scoring rule: the Brier score. Another is the Ignorance score, which is related to Shannon entropy, the information inherent in a random variable's possible outcomes. In this article I'm going to introduce another: the Ranked Probability Score.
Understanding Ranked Probability Score
The ranked probability score (RPS) is a scoring rule for probabilistic forecasts that takes account of distance, or order. When scoring the accuracy of soccer odds against results, this means that a draw is treated as closer to a home win than an away win is (and vice versa). Its use for soccer first appeared in the academic literature in 2012. The equation for calculating RPS is as follows:

RPS = (1 / (r − 1)) × Σ_{i=1}^{r−1} ( Σ_{j=1}^{i} (p_j − e_j) )²

where r is the number of potential outcomes (for a soccer match betting market this is 3), and pj and ej are the probability forecast and observed outcome at position j. Qualitatively, the RPS represents the sum of the squares of the differences between the cumulative distributions of forecasts and observations, and lies between 0 (for a perfect prediction) and 1 (for a wholly imperfect one). A worked example will help make things a little clearer.
Let’s start with some fair odds (with the bookmaker’s margin removed) for the game played between Manchester United and Manchester City on the 8th March 2020. Based on the average market, the probabilities for a home win, draw and away win (ph, pd and pa) were 0.211, 0.245 and 0.544 respectively. In the event, Manchester United won the game, so eh = 1 whilst ed and ea = 0.
First, calculate the cumulative forecast probabilities: ph = 0.211, ph + pd = 0.456, and ph + pd + pa = 1.

Now the same for the outcomes: since Manchester United won, the cumulative outcomes are eh = 1, eh + ed = 1, and eh + ed + ea = 1.

Finally, calculate the square of the difference between the two cumulative distributions at each position, sum, and divide by 2 (since r − 1 = 2 when r is 3): RPS = ((0.211 − 1)² + (0.456 − 1)² + (1 − 1)²) / 2 = 0.459.
In the event the least probable result occurred, so the RPS is quite high. Had Manchester City won the match instead, the RPS would have been just 0.126.
One benefit of using a scoring rule that considers distance is that it can generate lower scores for draws where teams are evenly matched. Burnley drew with Tottenham on 7th March. Both teams were equally rated (35.7%). The RPS was 0.127 despite the draw actually being the least probable of the three results (28.7%). Had either team won, the RPS for the match would have been 0.270.
Despite the draw being the least probable, this intuitively seems to make sense, at least from the perspective of a scoring rule and how well the model probabilities (in this case the odds) reflect real world events, although this reasoning has been disputed.
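The RPS calculations above are straightforward to reproduce. Below is an illustrative Python sketch (the function and variable names are my own; the probabilities are those quoted for the two matches, with Burnley's away probability inferred so the three sum to 1):

```python
def rps(forecasts, observed):
    """Ranked probability score for ordered outcome probabilities
    (home, draw, away) against a one-hot observed result."""
    cum_f = cum_o = total = 0.0
    for p, e in zip(forecasts, observed):
        cum_f += p          # cumulative forecast probability
        cum_o += e          # cumulative observed outcome
        total += (cum_f - cum_o) ** 2
    return total / (len(forecasts) - 1)

# Manchester United vs. Manchester City, 8th March 2020 (fair odds)
print(round(rps([0.211, 0.245, 0.544], [1, 0, 0]), 3))  # 0.459 (the home win)
print(round(rps([0.211, 0.245, 0.544], [0, 0, 1]), 3))  # 0.126 (had City won)

# Burnley vs. Tottenham, 7th March 2020
print(round(rps([0.357, 0.287, 0.356], [0, 1, 0]), 3))  # 0.127 (the draw)
print(round(rps([0.357, 0.287, 0.356], [1, 0, 0]), 3))  # 0.270 (had Burnley won)
```

Note how the draw, sitting between the two win outcomes, can never produce the worst possible score; that is the "distance" property at work.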
Using ranked probability scores to estimate market efficiency
In theory we can make an estimate of the efficiency of betting prices in a market by calculating an average RPS over a sample of matches. The lower the score, the more efficient the market and the more accurate the odds model. I’ve done this for a large sample (n = 162,282) of global soccer matches played between 2007 and 2017. The average RPS for Pinnacle’s closing match odds was 0.2046.
Without any reference point, it’s difficult to know what that figure means and what it says about the accuracy of the betting odds. At an individual match-level we already know that all the odds are ‘wrong’ in a deterministic sense. But how wrong? A perfect score would be 0, but obviously no odds model is going to be able to achieve that.
The simplest odds model we could use is guessing. Using Excel’s random number generator, I randomised the probabilities for home, draw and away and calculated the RPSs based on the same set of actual match results. The average RPS in this Monte Carlo simulation was 0.293. Evidently, as a predictive model, Pinnacle’s closing odds are statistically far, far better than random guessing (to the tune of 451 standard deviations!).
Anyone who follows soccer, however, knows that home wins are far more likely than draws and away wins, at least on average. A little bit of checking historical databases will reveal that about 45% of matches finish with a home win, whilst about 27% and 28% are draws and away wins respectively. What if we used those figures for every game in this sample? Now the RPS drops to 0.225, better than random but still much less accurate than Pinnacle’s closing odds.
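These two baselines can be sketched with a small Monte Carlo simulation. The exact random-guessing scheme used above isn't specified, so the version below (forecasts drawn uniformly from the probability simplex, with results generated from the historical base rates rather than real matches) is an assumption for illustration; its averages land near, but not exactly on, the figures quoted:

```python
import random

def rps(forecasts, observed):
    # ranked probability score, as in the worked examples
    cum_f = cum_o = total = 0.0
    for p, e in zip(forecasts, observed):
        cum_f += p
        cum_o += e
        total += (cum_f - cum_o) ** 2
    return total / (len(forecasts) - 1)

random.seed(1)
BASE = [0.45, 0.27, 0.28]   # approximate historical home/draw/away rates

def simulate_result(probs):
    # one match result as a one-hot list, drawn from 'probs'
    r, cum = random.random(), 0.0
    for i, p in enumerate(probs):
        cum += p
        if r < cum:
            return [int(j == i) for j in range(3)]
    return [0, 0, 1]

def random_forecast():
    # a forecast drawn uniformly from the probability simplex
    a, b = sorted((random.random(), random.random()))
    return [a, b - a, 1 - b]

results = [simulate_result(BASE) for _ in range(50_000)]
avg_guess = sum(rps(random_forecast(), r) for r in results) / len(results)
avg_base = sum(rps(BASE, r) for r in results) / len(results)
print(f"guessing: {avg_guess:.3f}, base rates: {avg_base:.3f}")
```

The base-rate model comfortably beats pure guessing, mirroring the gap between 0.293 and 0.225 reported for the real data.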
Opening odds vs. Closing odds
How do Pinnacle’s opening odds compare to their closing odds? Intuitively, most people understand that as a betting market matures, with more action and more opinions reflected by more money, the odds should sharpen.
The average RPS for the sample of matches was 0.2059. That’s higher than for the closing prices although the difference is small. Is such a small difference evidence of increasing price efficiency between market opening and closure?
One way we might check is to see how lucky or unlucky these RPS figures are. Remember, match outcomes are to a very significant extent dominated by chance; this is known as aleatory or statistical uncertainty. We won’t get the same results each time. The actual results provide just one history out of 3 to the power of 162,282 possible histories (I’ve estimated that number to have about 77,000 digits!).
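That digit count is easy to check: the number of decimal digits of 3 to the power of 162,282 is the floor of 162,282 × log10(3), plus one.

```python
import math

# decimal digits of 3 ** 162282, without computing the number itself
digits = math.floor(162282 * math.log10(3)) + 1
print(digits)  # 77429
```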
Instead of using the actual results let’s randomise them, defining their probabilities by Pinnacle’s opening and closing odds themselves to see the expected range of RPSs via a Monte Carlo simulation.
For the closing odds the expected (average) RPS was 0.2045 with a standard deviation of 0.0003, meaning about two thirds of RPS values with the closing odds model would lie between 0.2042 and 0.2048. This included the actual results RPS. About 99.7% would lie within three standard deviations, i.e. between 0.2036 and 0.2054. Similarly, the opening odds had a mean of 0.2056 and again a standard deviation of 0.0003.
Since the difference between the actual RPS results for opening and closing odds is 0.0013 (over four standard deviations), this would suggest a statistically significant difference between the two odds models, implying the closing odds are indeed more efficient (or accurate) than the opening ones. Similarly, a one-tailed t-test on the actual match RPSs for opening and closing odds has a p-value of about 0.001 (roughly equivalent to three and a bit standard deviations).
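The standard-deviation gap follows directly from the figures quoted in the text (the per-match t-test itself would need the full results data, so this is only the headline arithmetic):

```python
# Actual-results RPS figures quoted in the text
rps_closing = 0.2046
rps_opening = 0.2059
sd = 0.0003            # Monte Carlo standard deviation of the average RPS

z = (rps_opening - rps_closing) / sd
print(round(z, 2))  # 4.33
```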
Have Pinnacle’s soccer markets become more efficient?
Let’s turn to the question posed in the title of the resource: have Pinnacle’s soccer match betting odds become more efficient over time? I’ve broken down the RPSs by calendar year and plotted the trend in the chart below.
Despite the considerable year-to-year variance, to be expected given not only the randomness of actual results but also the different matches on account of year to year promotion and relegation, team personnel and form, there appears to be a trend towards greater efficiency.
What is interesting is that closing prices appear to have become more efficient at a faster rate than opening prices. The average opening price RPS is a reflection of the accuracy of Pinnacle’s odds-setting model. By contrast, the average closing odds RPS is a reflection of the models of all Pinnacle’s customers in addition to its own. That closing prices appear to have become relatively more efficient between 2007 and 2017 would be indicative of a greater number of customers competing both with Pinnacle and passively amongst themselves in an ‘arms race’ towards ever greater predictive accuracy.
Which soccer leagues have become more efficient?
It’s often suggested that it’s much easier to beat the bookmaker in a smaller, less efficient market than one that is heavily traded by large numbers of customers. It’s a perfectly reasonable suggestion, especially since Pinnacle applies different staking limits to different leagues specifically for the purposes of managing risks.
Smaller leagues have less information, more uncertainty and more variance; hence Pinnacle will not let customers take advantage of the greater potential for error by limiting the size of stake they will permit. A minor European lower division might have stake limits as low as a few hundred dollars. By contrast, a match in the Premier League or Champion League can see limits as high as $45,000.
The chart below breaks down the data from the first chart into ‘Big’ versus ‘Small’ soccer leagues/competitions. My choice of what was ‘Big’ and ‘Small’ was rather arbitrary and subjective, but hopefully based on common sense. Thus, ‘Big’ included the top divisions in England, Scotland, Spain, Italy, Germany and France, plus the Champions League, Europa League, European Cup and World Cup, about 15% of the total sample.
Two points can be made. Firstly, the ‘bigger’ markets have a lower average RPS than the ‘smaller’ ones. Indeed, the difference is statistically enormous: for both opening and closing prices, the probability of it arising by chance is about 1 in 50 billion. Secondly, the ‘bigger’ markets have trended towards lower RPS, and by implication greater efficiency, faster. Indeed, the ‘smaller’ ones have barely changed at all.
Why would bigger markets have seen a faster trend towards efficiency? A possible explanation is that customers’ interest in the big markets has grown faster relative to the smaller markets. That might be understandable given that the increasing output of online and TV advertising of sports betting has focused on the biggest competitions.
However, is the statistically lower average RPS of bigger markets really caused by greater efficiency at all? Certainly, it’s one explanation. But another is that the ‘bigger’ competitions tend to have hotter favourites and higher-priced underdogs; that is to say they have a greater variance across the three match outcome probabilities.
Trying to find out which it is highlights a fundamental problem with using RPS, indeed any scoring rule, to reveal what it has to say about the accuracy of probabilistic predictions.
What about epistemic uncertainty?
Suppose a model for a match predicts 45%, 27% and 28% for home, draw and away. Assuming the model to be correct, the expected RPS will be 0.225. Aleatory uncertainty due to random influences in the match means that the actual score could be 0.190 (home win), 0.140 (draw) or 0.360 (away win), but if such matches were played an infinite number of times, the average RPS would be 0.225.
Now suppose the model predicts 70%, 20% and 10%. The variance across the three result probabilities is greater, as is the variance across the three possible RPSs (home win = 0.05, draw = 0.25, away win = 0.65) but the expected RPS is now lower: 0.150.
Provided both models are correct, the RPS will be lower where there is greater certainty for one or more of the particular outcomes. This provides an obvious reason why the average RPS for bigger markets is lower than for smaller markets. In ‘bigger’ markets, about 5% of odds in my sample held an implied win probability of greater than 70%. For the ‘smaller’ ones it was only 2%. Similarly, over 20% of odds in ‘bigger’ markets had win probabilities of less than 20%, compared to only 13% in the ‘smaller’ markets.
When you think of the dominant teams in ‘big’ competitions like Real Madrid, Barcelona, Juventus, Manchester City, Chelsea, Celtic, PSG and Bayern Munich during the sample period, this difference seems understandable. ‘Big’ competitions have more hot favourites, and by extension, more rank outsiders. Since odds show an asymmetry relative to their implied probabilities, the ‘big’ competitions have longer average odds.
But suppose now the second model in my thought experiment is completely wrong. Suppose instead the true probabilities are 60%, 25% and 15%. Now the expected RPS would rise to 0.190 because there are more away wins than the model believed there should be. However, the expected RPS is still lower than for matches the first model is predicting, implying a more accurate set of predictions, but we know this to be false. They only appear to be more accurate because of the greater variance in the three possible outcome probabilities for this sample of matches.
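This thought experiment can be made concrete by computing the RPS a forecast expects to earn when results are generated by some assumed ‘true’ probabilities. A sketch (function names are my own):

```python
def rps(forecasts, observed):
    # ranked probability score, as in the worked examples
    cum_f = cum_o = total = 0.0
    for p, e in zip(forecasts, observed):
        cum_f += p
        cum_o += e
        total += (cum_f - cum_o) ** 2
    return total / (len(forecasts) - 1)

def expected_rps(model, true_probs):
    """Long-run average RPS of 'model' if results are generated
    by 'true_probs' over an infinite number of matches."""
    n = len(model)
    return sum(true_probs[i] * rps(model, [int(j == i) for j in range(n)])
               for i in range(n))

# A correct 45/27/28 model:
print(expected_rps([0.45, 0.27, 0.28], [0.45, 0.27, 0.28]))  # ≈ 0.225
# A correct 70/20/10 model scores lower:
print(expected_rps([0.70, 0.20, 0.10], [0.70, 0.20, 0.10]))  # ≈ 0.150
# A wrong 70/20/10 model under true 60/25/15 probabilities still scores
# lower than the correct 45/27/28 model:
print(expected_rps([0.70, 0.20, 0.10], [0.60, 0.25, 0.15]))  # ≈ 0.190
```

The last line is the crux: a lower average RPS here reflects the variance of the outcome probabilities, not the accuracy of the model.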
Systematic uncertainty (or error) in a model is known as epistemic uncertainty. The difficulty is this: when presented with samples of RPSs, how do we know how much systematic uncertainty is present? It is impossible to judge merely from the RPS figures themselves. The odds for the bigger soccer markets might appear to be more accurate (and efficient) by virtue of their lower average RPS, but we’ve seen that this could easily be an illusion. A lower average RPS does not necessarily imply a more accurate prediction model.
What have we learnt?
The Ranked Probability Score can be used to measure the accuracy of probabilistic predictions in a soccer betting market.
Using the RPS, it appears that closing odds are more efficient than opening ones and have become relatively more so over time. Furthermore, whilst bigger markets might be more efficient than smaller ones, this could alternatively be explained by the presence of a greater proportion of hot favourites.
However, underlying epistemic uncertainty in the models used to predict outcome probabilities imposes limitations on how we can translate these scores into an assessment of the accuracy of the underlying models.
Hence, we should be cautious when drawing conclusions about the efficiency of a soccer betting market from them. Fundamentally, the problem is that we will never know the true probabilities for soccer matches. I suppose if we did, we would be trillionaires.