Jan 31, 2018

# Inflating or deflating the chance of a draw in soccer

One of the limitations of a Poisson model is its lack of predictive power when it comes to the probability of scoreless draws. This article explains how to adjust a Poisson model to deal with scoreless draws.

The staple model used to predict scores in soccer is the Poisson Model (or variants of it). The most straightforward approach is to set an expected goal parameter for each team and then predict scores accordingly.

To summarise the Poisson model, the home team parameter is the league average home scoring rate multiplied by an attacking factor based on the home team and a defensive factor based on the away team. The attacking factor accounts for the home team's scoring capability, while the defensive factor adjusts for the visiting team's defence rating (a stronger defence means fewer expected goals). The away team's expected goal scoring rate is evaluated in a similar fashion, but using the away team's attacking factor and the home team's defensive factor.
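The multiplication described above can be sketched in a few lines of Python. This is only an illustration: the function name `expected_goals` and every input value below are hypothetical and are not taken from any real league or from the example later in this article.

```python
def expected_goals(league_avg, attack_factor, defence_factor):
    """Expected goals = league average rate x own attack factor x opponent defence factor."""
    return league_avg * attack_factor * defence_factor


# Hypothetical inputs, chosen purely for illustration.
league_home_avg = 1.5   # average goals scored by home teams in the league
league_away_avg = 1.1   # average goals scored by away teams in the league
home_attack, home_defence = 1.2, 0.9   # factor > 1 means more goals than average
away_attack, away_defence = 1.0, 1.1   # defence factor > 1 means a leakier defence

# Home rate uses the home attack and the away defence; away rate is the mirror image.
home_rate = expected_goals(league_home_avg, home_attack, away_defence)
away_rate = expected_goals(league_away_avg, away_attack, home_defence)

print(f"home: {home_rate:.2f}, away: {away_rate:.2f}")
```

A defence factor below 1 lowers the opponent's expected goals, which matches the idea that a stronger defence means fewer goals conceded.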

### The limitations of a Poisson model

Like any other model, a Poisson model has limitations when used to predict the score of a soccer match, chiefly that the outcomes are sensitive to changes in the parameters used.


The Poisson model also assumes that once expected goal parameters are set, the number of goals scored by each team is independent. Although this is somewhat controlled by using the specific defence and attack ratings, can we really expect the probability of the away team scoring five goals to be the same regardless of whether the home team scored five or nothing at all?

The most significant limitation is the assumption that the variance of goals scored per team is equal to the expected number of goals, a feature of the Poisson distribution. There are clever ways of dealing with this, such as over-dispersed (or under-dispersed) Poisson models and the bivariate Poisson model but discussing these is beyond the scope of this article.

One of the combined effects of these limitations is a lack of predictive power for the 0-0 draw, whose true probability can be higher or lower than the figure produced by a Poisson model. My hunch is that the Poisson model tends to understate the probability of a 0-0 draw for teams with high expected goal parameters.

The actual chance of a 0-0 draw may be much higher for high goal-scoring teams, as they might lower the tempo if the match remains goalless after significant time has passed. Conversely, low-scoring teams may keep a higher tempo until the first goal is scored. The standard Poisson model would not capture this and hence would over-predict the chance of a 0-0 draw for such teams. That said, this is just a hunch and is not based on any test. If anyone is willing to test it and contact me, I'd be happy to hear from you.

### How to deflate or inflate the probability of a draw

One approach for adjusting the probability of a 0-0 draw is to inflate or deflate that probability and rescale the other predictions accordingly. This can be broken down into a five-step process, illustrated here with a simple example:

Step 1: Calculate the per-team expected goal parameters

This is probably the step that takes most of your time unless you have automated it. Benjamin Cronin explains it excellently in his Poisson distribution article. For the sake of brevity, we assume that the final expected goal parameters are 1.7 and 1.2 for the home and away team respectively (these are arbitrary figures chosen for illustration).

Step 2: Calculate the probability for the number of goals scored per team

This can be calculated using the Poisson formula, and a worked example is also provided in the article mentioned above. In this case, the probability distribution for the number of goals scored by each team is as follows:

## Probability distribution for the number of goals in a soccer match

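| Team | Expected goal parameter | 0 goals | 1 goal | 2 goals | 3 goals | 4 goals |
|------|-------------------------|---------|--------|---------|---------|---------|
| Home | 1.7 | 18.30% | 31.10% | 26.40% | 15.00% | 6.40% |
| Away | 1.2 | 30.10% | 36.10% | 21.70% | 8.70% | 2.60% |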
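The per-team probabilities in this table follow from the Poisson probability mass function, P(k) = e^(-λ) λ^k / k!. A minimal sketch using only the Python standard library is shown below; the helper name `poisson_pmf` is an assumption made for this illustration.

```python
import math


def poisson_pmf(k, lam):
    """Probability of exactly k goals when the expected number of goals is lam."""
    return math.exp(-lam) * lam ** k / math.factorial(k)


home_rate, away_rate = 1.7, 1.2   # expected goal parameters from Step 1
max_goals = 4                     # the table stops at four goals per team

home_probs = [poisson_pmf(k, home_rate) for k in range(max_goals + 1)]
away_probs = [poisson_pmf(k, away_rate) for k in range(max_goals + 1)]

print("Home:", [f"{p:.1%}" for p in home_probs])
print("Away:", [f"{p:.1%}" for p in away_probs])
```

Rounded to one decimal place, the printed values should agree with the table.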

Step 3: Calculate probability distribution for scorelines

We can now multiply the two teams' probabilities to obtain the probability of each scoreline. For example, a 0-0 scoreline has a likelihood of 18.3% × 30.1% = 5.5%. The results are shown below. Note that these do not add up to 100% because of the possibility of other scores (for example 5-1). The probability of those other scores is 3.7%.

## Calculating probability distribution for scorelines

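| Away team goals (rows) / Home team goals (columns) | 0 | 1 | 2 | 3 | 4 |
|----------------------------------------------------|---|---|---|---|---|
| 0 | 5.50% | 9.40% | 8.00% | 4.50% | 1.90% |
| 1 | 6.60% | 11.20% | 9.50% | 5.40% | 2.30% |
| 2 | 4.00% | 6.70% | 5.70% | 3.20% | 1.40% |
| 3 | 1.60% | 2.70% | 2.30% | 1.30% | 0.60% |
| 4 | 0.50% | 0.80% | 0.70% | 0.40% | 0.20% |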
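The scoreline grid is simply the product of each home probability with each away probability, which relies on the independence assumption discussed earlier. The sketch below rebuilds it; `poisson_pmf` and `scoreline_grid` are illustrative names, not part of any library.

```python
import math


def poisson_pmf(k, lam):
    """Probability of exactly k goals when the expected number of goals is lam."""
    return math.exp(-lam) * lam ** k / math.factorial(k)


def scoreline_grid(home_rate, away_rate, max_goals=4):
    """Return grid[a][h] = P(home scores h and away scores a), assuming independence."""
    home = [poisson_pmf(k, home_rate) for k in range(max_goals + 1)]
    away = [poisson_pmf(k, away_rate) for k in range(max_goals + 1)]
    return [[h * a for h in home] for a in away]


grid = scoreline_grid(1.7, 1.2)

# Whatever probability is not in the grid belongs to scores with 5+ goals for a team.
other_scores = 1 - sum(sum(row) for row in grid)

for away_goals, row in enumerate(grid):
    print(away_goals, [f"{p:.1%}" for p in row])
print(f"Other scores: {other_scores:.1%}")
```

Rows are indexed by away goals and columns by home goals, mirroring the layout of the table.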

Step 4: Calculating inflation/deflation parameter for 0-0 draw

This is where some subjectivity may seep in. For example, let us assume that past statistics seem to imply that a 0-0 should have a probability of 10%. Hence we would need to increase the 5.5% to 10%.

The inflation parameter can be calculated as:

$$\frac{\text{presumed probability of a 0-0}}{\text{predicted probability of a 0-0}} = \frac{\text{presumed prob}}{\text{prob}(0,0)}$$

Representing this using the symbol α, we get that:

$$\alpha = \frac{10\%}{5.5\%} \approx 1.82$$

This effectively means that we are increasing the probability of a goalless draw by 82%. As the 0-0 probability rises from 5.5% to 10%, the combined probability of all other scores must fall by the same 4.5 percentage points so that the total of all outcomes remains 100%.
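As a quick check of the arithmetic, the inflation parameter can be computed as follows. The function name `inflation_factor` is hypothetical, and the 10% target is the same assumed figure used in the example above.

```python
def inflation_factor(presumed_prob, predicted_prob):
    """Ratio by which the model's 0-0 probability must be scaled to hit the presumed value."""
    if not 0 < predicted_prob < 1:
        raise ValueError("predicted probability must lie strictly between 0 and 1")
    return presumed_prob / predicted_prob


predicted_00 = 0.055   # 0-0 probability from the Poisson grid (rounded)
presumed_00 = 0.10     # assumed target taken from past statistics

alpha = inflation_factor(presumed_00, predicted_00)
print(f"alpha = {alpha:.2f}")
```

A value of α above 1 inflates the goalless draw, while a value below 1 deflates it.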

Step 4 (continued): Calculating inflation/deflation parameter for the other scores

Using the symbol β for this factor, we can use the equation:

$$\beta = \frac{1 - \alpha\,\text{prob}(0,0)}{1 - \text{prob}(0,0)} = \frac{1 - \text{presumed prob}}{1 - \text{predicted prob}}$$

In this case, we get:

$$\beta = \frac{1 - 0.1}{1 - 0.055} \approx 0.95$$
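The expression for β follows from requiring that all outcomes still sum to 100% after the adjustment. A short derivation is given below, writing p as shorthand for prob(0,0); this shorthand is introduced here only for readability. The 0-0 cell is scaled by α and every other outcome (including scores outside the table) is scaled by β:

$$\alpha p + \beta (1 - p) = 1 \quad\Longrightarrow\quad \beta = \frac{1 - \alpha p}{1 - p}$$

With p = 0.055 and αp = 0.10, this gives β = 0.90 / 0.945 ≈ 0.95, in line with the figure above.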

Step 5: Repopulate the inflated scoreline table

We can now recalculate the probabilities of the different scores by multiplying the 0-0 probability by α and all the other probabilities by β. We obtain the following results, with the probability of other scores now being 3.5%.

## Repopulating the inflated scoreline

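| Away team goals (rows) / Home team goals (columns) | 0 | 1 | 2 | 3 | 4 |
|----------------------------------------------------|---|---|---|---|---|
| 0 | 10.00% | 8.90% | 7.60% | 4.30% | 1.80% |
| 1 | 6.30% | 10.70% | 9.10% | 5.10% | 2.20% |
| 2 | 3.80% | 6.40% | 5.50% | 3.10% | 1.30% |
| 3 | 1.50% | 2.60% | 2.20% | 1.20% | 0.50% |
| 4 | 0.50% | 0.80% | 0.70% | 0.40% | 0.20% |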
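The whole adjustment can be put together in one short sketch. It assumes the same 1.7 and 1.2 goal parameters and the same 10% target for the goalless draw; the function names are illustrative only. Because it works with unrounded probabilities, α comes out at roughly 1.817 rather than the 1.82 obtained from the rounded 5.5%.

```python
import math


def poisson_pmf(k, lam):
    """Probability of exactly k goals when the expected number of goals is lam."""
    return math.exp(-lam) * lam ** k / math.factorial(k)


def adjust_goalless_draw(grid, presumed_00):
    """Scale the 0-0 cell by alpha and every other cell by beta.

    grid[a][h] is P(home scores h, away scores a). Returns (adjusted, alpha, beta).
    """
    p00 = grid[0][0]
    alpha = presumed_00 / p00
    beta = (1 - presumed_00) / (1 - p00)
    adjusted = [[p * beta for p in row] for row in grid]
    adjusted[0][0] = p00 * alpha   # overwrite the 0-0 cell with the inflated value
    return adjusted, alpha, beta


home = [poisson_pmf(k, 1.7) for k in range(5)]
away = [poisson_pmf(k, 1.2) for k in range(5)]
grid = [[h * a for h in home] for a in away]

adjusted, alpha, beta = adjust_goalless_draw(grid, presumed_00=0.10)

# Scores outside the 0-4 grid are scaled by beta as well.
other_before = 1 - sum(sum(row) for row in grid)
other_after = other_before * beta
total = sum(sum(row) for row in adjusted) + other_after

for away_goals, row in enumerate(adjusted):
    print(away_goals, [f"{p:.1%}" for p in row])
print(f"Other scores: {other_after:.1%}, total: {total:.1%}")
```

Checking that the total still comes to 100% is a useful safeguard whenever the target 0-0 probability is changed.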