One of the limitations with a Poisson model is the lack of predictive power when it comes to the probability of scoreless draws. This article explains how to adjust a Poisson model in order to deal with scoreless draws. Read on to find out more.
The staple model used to predict scores in soccer is the Poisson Model (or variants of it). The most straightforward approach is to set an expected goal parameter for each team and then predict scores accordingly.
To summarise the Poisson model, the home team’s expected goal parameter is the league average home scoring rate multiplied by an attacking factor based on the home team and a defensive factor based on the away team. The former adjusts for the home team’s scoring capabilities, while the latter adjusts for the visiting team’s defence rating (a stronger defence means fewer chances of goals). The away team’s expected goal scoring rate is evaluated in a similar fashion, but using the away team’s attacking factor and the home team’s defensive factor.
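As a rough illustration of the calculation just described, here is a minimal Python sketch. The function name and every number in it are hypothetical, chosen only to show the arithmetic, and the sketch assumes that a defensive factor below 1 denotes a stronger-than-average defence.

```python
def expected_goals(league_avg_rate, attack_factor, opponent_defence_factor):
    """League average scoring rate x own attacking factor x opponent's defensive factor."""
    return league_avg_rate * attack_factor * opponent_defence_factor

# Hypothetical inputs, for illustration only.
league_home_rate = 1.5   # average goals per match scored by home teams
league_away_rate = 1.2   # average goals per match scored by away teams
home_attack, home_defence = 1.2, 0.9
away_attack, away_defence = 1.1, 0.95

home_lambda = expected_goals(league_home_rate, home_attack, away_defence)
away_lambda = expected_goals(league_away_rate, away_attack, home_defence)
print(f"home: {home_lambda:.2f}, away: {away_lambda:.2f}")
```

In practice the factors would be estimated from past results rather than typed in by hand.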
The limitations of a Poisson model
Like any other model, there are some limitations when trying to predict the score in a soccer match with a Poisson model - namely that the outcomes are sensitive to changes in the parameters used.
The Poisson model also assumes that once expected goal parameters are set, the number of goals scored by each team is independent. Although this is somewhat controlled by using the specific defence and attack ratings, can we really expect the probability of the away team scoring five goals to be the same regardless of whether the home team scored five or nothing at all?
The most significant limitation is the assumption that the variance of goals scored per team is equal to the expected number of goals, a feature of the Poisson distribution. There are clever ways of dealing with this, such as over-dispersed (or under-dispersed) Poisson models and the bivariate Poisson model but discussing these is beyond the scope of this article.
One of the combined effects of these limitations is the lack of predictive power in assessing the 0-0 draw which can be higher or lower than the outcome from a Poisson model. My hunch is the Poisson model tends to understate the possibility of a 0-0 draw for teams with high expected goal parameters.
The actual chance of getting a 0-0 draw may be much higher for high goal-scoring teams, as they might lower the tempo if the match remains goalless after significant time has passed. Conversely, low-scoring teams may keep a higher tempo until the first goal is scored. The standard Poisson model would not capture this and would hence over-predict the chance of a 0-0 draw for such teams. That said, this is just a hunch not based on any test; if anyone is willing to test it and contact me, I’d be happy to hear from you.
How to deflate or inflate the probability of a draw
One approach for adjusting probabilities of 0-0 draws is to inflate or deflate the probability of such a draw and adjust the other predictions accordingly. This can be broken down into a five-step process, illustrated here with a simple example:
Step 1: Calculate the per-team expected goal parameters
This is probably the step that takes most of your time unless you have automated it. Benjamin Cronin explains it excellently in his Poisson distribution article. For the sake of brevity, we are assuming that the final expected goal parameters are 1.7 and 1.2 for the home and away team respectively (these are just arbitrary figures).
Step 2: Calculate the probability for the number of goals scored per team
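This can be calculated using the Poisson formula, and a worked example is also provided in the article mentioned above. The probability that a team with expected goal parameter λ scores exactly k goals is P(k) = (λ^k × e^(-λ)) / k!. In this case, for example, the probability of scoring no goals is e^(-1.7) = 18.3% for the home team and e^(-1.2) = 30.1% for the away team.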
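For readers who prefer code, the per-team distribution can be sketched in a few lines of Python using only the standard library. The helper name poisson_pmf and the cut-off of four goals per team are my own assumptions for this illustration.

```python
import math

def poisson_pmf(k, lam):
    """Probability of exactly k goals when the expected number of goals is lam."""
    return lam ** k * math.exp(-lam) / math.factorial(k)

home_lambda, away_lambda = 1.7, 1.2   # expected goal parameters from Step 1
max_goals = 4                         # assumed cut-off; higher counts are lumped into "other scores"

home_probs = [poisson_pmf(k, home_lambda) for k in range(max_goals + 1)]
away_probs = [poisson_pmf(k, away_lambda) for k in range(max_goals + 1)]

for k in range(max_goals + 1):
    print(f"{k} goals: home {home_probs[k]:.1%}, away {away_probs[k]:.1%}")
```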
Step 3: Calculate probability distribution for scorelines
We can now multiply the two teams’ probabilities to obtain the likelihood of each scoreline. For example, a 0-0 scoreline has an 18.3% x 30.1% = 5.5% likelihood. The results would be as shown below. Do note that these do not add up to 100% due to the possibility of other scores (for example 5-1); the probability of those other scores is 3.7%.
[Table: Calculating probability distribution for scorelines]
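The scoreline table can be reproduced with a short Python sketch. It assumes, as the model does, that the two teams' goal counts are independent, and it assumes the table covers 0 to 4 goals per team; the function names are hypothetical.

```python
import math

def poisson_pmf(k, lam):
    """Probability of exactly k goals when the expected number of goals is lam."""
    return lam ** k * math.exp(-lam) / math.factorial(k)

def scoreline_grid(home_lambda, away_lambda, max_goals=4):
    """Map (home goals, away goals) to probability, assuming independent Poisson counts."""
    return {
        (h, a): poisson_pmf(h, home_lambda) * poisson_pmf(a, away_lambda)
        for h in range(max_goals + 1)
        for a in range(max_goals + 1)
    }

grid = scoreline_grid(1.7, 1.2)
other_scores = 1 - sum(grid.values())   # any scoreline with 5 or more goals for either team

print(f"0-0: {grid[(0, 0)]:.1%}")
print(f"other scores: {other_scores:.1%}")
```

With a cut-off of four goals per team, the leftover probability comes out at the 3.7% quoted above.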
Step 4: Calculating inflation/deflation parameter for 0-0 draw
This is where some subjectivity may seep in. For example, let us assume that past statistics seem to imply that a 0-0 should have a probability of 10%. Hence we would need to increase the 5.5% to 10%.
The inflation parameter can be calculated as:
(presumed probability of a 0-0)/(predicted probability)=(presumed prob)/(prob(0,0))
Representing this using the symbol α, we get α = 0.10/0.055 = 1.82.
This effectively means that we are increasing the probability of a goalless draw by 82%. As this increased from 5.5% to 10%, the combined probability of all other outcomes must decrease by the same 4.5 percentage points so that the total of all outcomes is 100%.
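As a quick check on the arithmetic, the inflation parameter can be computed as below. The function name alpha_factor is hypothetical, and the 4% figure in the second call is an invented value used only to show the deflation case.

```python
import math

def alpha_factor(predicted_00, presumed_00):
    """Ratio of the presumed 0-0 probability to the one predicted by the Poisson model."""
    return presumed_00 / predicted_00

# Under independence, P(0-0) = e^(-1.7) x e^(-1.2) = e^(-2.9), roughly 5.5%.
predicted_00 = math.exp(-(1.7 + 1.2))

alpha_inflate = alpha_factor(predicted_00, 0.10)   # above 1: the 0-0 probability is inflated
alpha_deflate = alpha_factor(predicted_00, 0.04)   # below 1: the 0-0 probability is deflated

print(f"inflate to 10%: alpha = {alpha_inflate:.2f}")
print(f"deflate to 4%:  alpha = {alpha_deflate:.2f}")
```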
Step 4 (continued): Calculating inflation/deflation parameter for the other scores
Using the symbol β for this factor, we can use the equation:
β=(1-α[prob(0,0)])/(1-[prob(0,0)])=(1-presumed prob)/(1-predicted prob)
In this case, we get β=(1-0.1)/(1-0.055)=0.95
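A small Python sketch can confirm that this choice of β brings the total back to 100%. The function name beta_factor is hypothetical, and the inputs are the rounded figures from the example.

```python
def beta_factor(predicted_00, presumed_00):
    """Common scaling factor applied to every outcome other than 0-0."""
    return (1 - presumed_00) / (1 - predicted_00)

predicted_00 = 0.055   # 0-0 probability from the Poisson model
presumed_00 = 0.10     # 0-0 probability we want to impose

alpha = presumed_00 / predicted_00
beta = beta_factor(predicted_00, presumed_00)

# The 0-0 share is scaled by alpha, everything else by beta; together they should sum to 1.
total = alpha * predicted_00 + beta * (1 - predicted_00)
print(f"beta = {beta:.2f}, total probability = {total:.3f}")
```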
Step 5: Repopulate the inflated scoreline table
We can now finally recalculate the probabilities of the different scores by multiplying the 0-0 probability by α and the rest by β. We would obtain the following results, with the probability of other scores being 3.5%.
[Table: Repopulating the inflated scoreline]
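Putting the steps together, the whole adjustment can be sketched as one Python function. This is only an illustration under the same assumptions as the worked example (independent Poisson counts, a table of 0 to 4 goals per team); the name adjust_goalless_draw is hypothetical.

```python
import math

def poisson_pmf(k, lam):
    """Probability of exactly k goals when the expected number of goals is lam."""
    return lam ** k * math.exp(-lam) / math.factorial(k)

def adjust_goalless_draw(home_lambda, away_lambda, presumed_00, max_goals=4):
    """Return the adjusted scoreline table and the adjusted 'other scores' probability."""
    grid = {
        (h, a): poisson_pmf(h, home_lambda) * poisson_pmf(a, away_lambda)
        for h in range(max_goals + 1)
        for a in range(max_goals + 1)
    }
    predicted_00 = grid[(0, 0)]
    alpha = presumed_00 / predicted_00              # applied to 0-0 only
    beta = (1 - presumed_00) / (1 - predicted_00)   # applied to every other outcome
    adjusted = {
        score: p * (alpha if score == (0, 0) else beta)
        for score, p in grid.items()
    }
    other_scores = (1 - sum(grid.values())) * beta  # scorelines outside the table
    return adjusted, other_scores

adjusted, other_scores = adjust_goalless_draw(1.7, 1.2, presumed_00=0.10)
print(f"0-0: {adjusted[(0, 0)]:.1%}")
print(f"1-1: {adjusted[(1, 1)]:.1%}")
print(f"other scores: {other_scores:.1%}")
```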
What have we learnt about adjusting a Poisson model?
Throughout this article we discussed an adjustment to the traditional Poisson model that alters the probability of a scoreless draw. This model can be extended to deal with adjusting any scoreline, as long as the probabilities of all other outcomes are also adjusted so that the total adds up to 100%.
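As a sketch of that extension, the same α and β logic can be applied to an arbitrary scoreline. The function name reweight_scoreline is hypothetical, the 13% target is an invented figure, and the input must already sum to 100% (including an "other" bucket). The 5.5% and 11.2% inputs are the Poisson probabilities of 0-0 and 1-1 for the 1.7 and 1.2 example.

```python
def reweight_scoreline(probs, target, presumed):
    """Set probs[target] to `presumed` and scale all other outcomes by a common factor."""
    predicted = probs[target]
    alpha = presumed / predicted
    beta = (1 - presumed) / (1 - predicted)
    return {
        score: p * (alpha if score == target else beta)
        for score, p in probs.items()
    }

# Condensed distribution: two named scorelines plus everything else.
probs = {"0-0": 0.055, "1-1": 0.112, "other": 0.833}

# Hypothetical target: suppose we believe a 1-1 draw deserves 13%.
adjusted = reweight_scoreline(probs, target="1-1", presumed=0.13)
print({score: round(p, 3) for score, p in adjusted.items()})
```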
This is not the only approach to changing probabilities of some outcomes. For example, Dr Alun Owen explained a possibly better approach during the MathSport conference last June that involved a truncated Poisson model.
The adjustment does not remove the limitations of Poisson models, some of which were discussed earlier. Indeed, it adds other assumptions: the assumed probability of a scoreless draw, and that all other probabilities are adjusted by the same factor β. Nevertheless, it can be a good improvement over traditional models that tend to under- or over-estimate goalless draws.