Jun 5, 2018

Why betting on the World Cup is a data nightmare

The World Cup is undoubtedly the biggest competition in world soccer. It comes around only once every four years, and fans are looking forward to 64 matches over one month in Russia. However, the tournament often leaves bettors with a headache. What makes betting on the World Cup so difficult? Read on to find out.

In an earlier article, Mark Taylor explained his approach to making accurate 2018 World Cup predictions (including the winner) while also pointing out the limitations of such an approach. In this article I focus on the latter, discussing the limitations of both the quantitative and the qualitative approaches.

Your model’s results are determined by your process

Building any model is an iterative process that includes testing and monitoring results. A previous article discussed this and suggested that judgement and creativity are important elements of any model-building process.

Testing and monitoring results is constrained for the World Cup because the tournament is held only once every four years. Quantitative approaches are not about producing perfectly true results; modelling is a creative process in which real-world outcomes are simulated or explained through numbers.
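
To put a rough number on how little a single tournament gives a model to test against, here is a minimal sketch. It assumes goals follow a Poisson distribution (a common modelling choice, not something this article prescribes), and the scoring rate of 1.5 goals per match and the 38-match league season are purely illustrative assumptions. Seven is the most matches one team can play at a 32-team World Cup.

```python
import math


def poisson_rate_standard_error(goals_per_match: float, matches: int) -> float:
    """Standard error of an estimated goals-per-match rate, assuming Poisson goals."""
    return math.sqrt(goals_per_match / matches)


ASSUMED_RATE = 1.5  # hypothetical scoring rate, for illustration only

# 7 = maximum matches for one team at a 32-team World Cup.
# 38 = assumed length of a league season, used only as a comparison.
for label, matches in [("World Cup run", 7), ("league season", 38)]:
    se = poisson_rate_standard_error(ASSUMED_RATE, matches)
    print(f"{label}: {matches} matches, standard error about {se:.2f} goals per match")
```

Under these assumptions the uncertainty around a team's scoring rate is more than twice as large after a full World Cup run as after a league season, and most teams play far fewer than seven matches.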

Yet any quantitative model rests on a series of assumptions and, in turn, on past data. The problem with the World Cup is that the data, even that gathered during the qualification stage, is somewhat obsolete.

Consider a case in which we use past team performance. Data from the qualification stage is of limited value because teams in different confederations face opponents of different strengths. For example, Panama might have pipped the USA to get through CONCACAF qualifying, but would the team have got through a UEFA qualification group?

The intention is not to hint that European qualification is harder; it is simply very different from a World Cup group stage. Moreover, qualification matches are spread over a period of two years, during which a team's performances will have fluctuated between good and less good. Players' form may have varied along the way, and injuries will have crept in.

One can opt to adjust for these fluctuations using FIFA team rankings, but these are known to be very unrealistic. I would trust a simulation on FIFA 2018 to be a much better predictor.

Some advanced models try to use player-specific parameters. These models tend to be very complex, with possibly better predictive results. However, player performance depends on the structure of a team: players may excel in the type of play set up by their club but not in that of their national team.

The pressure on Messi to perform during the World Cup is only compounded by the lack of Barcelona teammates in the Argentina team. While Mohamed Salah has been absolutely magical this season, that is no proof of a similar performance with Egypt (although I personally hope this team will do well).

Using past World Cup data for team-specific parameters (such as scoring intensity) would be disastrous for the same reason. Teams change significantly over four years, and we have seen cup finalists or holders do terribly in the subsequent World Cup. The team coach and style of play are also likely to change over time.

Why a qualitative approach is also limited

Memories of legendary teams such as Brazil of 1970, the Netherlands of 1974 (even though they didn’t win it) and Spain of 2010 also affect other approaches, including qualitative predictions.

In an academic paper I co-published a few months ago (A Public (Mis)interpretation of Brazil’s World Cup Performance), we evaluated Brazil’s odds during the 2014 World Cup. To save you reading through the whole article, the main finding was that Brazil’s outright odds were higher after a match than they were before the next one. In other words, the market's view of Brazil's chances dropped once the team had played and recovered by the next kick-off.

The paper found that Brazil’s chances of winning the World Cup were 25% before the tournament started and 18% just after the team had played Cameroon and got through the group stage. They were back up to 27% at kick-off of Brazil’s first knockout match.
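
For readers who think in prices rather than percentages, the sketch below converts those three figures into decimal odds. It uses the plain reciprocal conversion and ignores any bookmaker margin, which is a simplifying assumption; the article does not say how the paper derived its percentages, and the function name is hypothetical.

```python
def decimal_odds_from_probability(probability: float) -> float:
    """Margin-free decimal odds implied by a win probability."""
    if not 0.0 < probability <= 1.0:
        raise ValueError("probability must be in (0, 1]")
    return 1.0 / probability


# Percentages quoted in the article for Brazil at the 2014 World Cup.
brazil_2014 = [
    ("before the tournament", 0.25),
    ("just after the Cameroon match", 0.18),
    ("at kick-off of the first knockout match", 0.27),
]

for stage, probability in brazil_2014:
    odds = decimal_odds_from_probability(probability)
    print(f"{stage}: {probability:.0%} is roughly decimal odds of {odds:.2f}")
```

On this simplified conversion the price drifts from about 4.00 to about 5.56 after the group stage, then shortens to about 3.70 by the next kick-off, which is the "higher odds after matches" pattern described here.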

With the benefit of hindsight, the Brazil we saw at the last World Cup was not the legendary team we have seen in previous tournaments. Yet bettors seem to have fallen victim to anchoring bias, putting too much emphasis on an initial impression.

This bias was challenged each time Brazil played a match (hence the higher odds after matches), but the lesson was quickly forgotten by the time the next match started.

There is another element that might have led to these inaccuracies. The technical term is over-confidence bias, but let’s cut to the chase and call it cockiness. There are a significant number of sports bettors, successful and unsuccessful alike, who are way too confident of their own abilities, myself probably included.

We’ve all overheard (and maybe been involved in) too many discussions in which the speaker explained with complete certainty that “Leicester cannot win the league”, “Chelsea to be top four is a sure bet” or “Juventus will win the Champions League”. Such claims are real-life evidence of this over-confidence bias.

Is there a solution to the World Cup problem?

If the quantitative approach is limited and the qualitative approach is biased, does that mean there is no scientific way to provide adequate World Cup predictions?

No. In fact, the lack of data is probably an advantage: it means that algorithm-heavy approaches (of which there are quite a significant number out there) do not hold the same edge as they do in normal week-in-week-out soccer leagues. Moreover, the World Cup opens the door to more recreational and emotional bettors.

The aim of any prediction is to be relatively, not precisely, accurate. For example, in an office prediction pool (feel free to use the free Excel file at Scoragol.com), I suggest being a bit creative, but not too much.

If you know that half the participants will put Germany as winners, then it is best not to do the same (which doesn’t mean picking Panama as winners). In trying to beat the market, consider different “what ifs”. If you are using a quantitative model, don’t rely on a single set of parameters for your output; test how sensitive the output is to changes in them.
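
One way to run such a “what if” is sketched below. It assumes a simple independent-Poisson match model, which is only one of many possible choices and not a method this article endorses. Every number in it (the scoring rates and the size of the shifts) is hypothetical and chosen only to show the mechanics.

```python
import math


def poisson_pmf(goals: int, rate: float) -> float:
    """Probability of scoring exactly `goals` when goals are Poisson with mean `rate`."""
    return math.exp(-rate) * rate ** goals / math.factorial(goals)


def match_probabilities(rate_a: float, rate_b: float, max_goals: int = 10):
    """Return (P(A wins), P(draw), P(B wins)), truncating scorelines at max_goals."""
    win_a = draw = win_b = 0.0
    for goals_a in range(max_goals + 1):
        for goals_b in range(max_goals + 1):
            p = poisson_pmf(goals_a, rate_a) * poisson_pmf(goals_b, rate_b)
            if goals_a > goals_b:
                win_a += p
            elif goals_a == goals_b:
                draw += p
            else:
                win_b += p
    return win_a, draw, win_b


BASE_RATE_A = 1.6  # hypothetical scoring rate for team A
RATE_B = 1.1       # hypothetical scoring rate for team B

# Shift team A's rate down and up to see how much the output moves.
for shift in (-0.3, 0.0, 0.3):
    win_a, draw, win_b = match_probabilities(BASE_RATE_A + shift, RATE_B)
    print(f"rate A = {BASE_RATE_A + shift:.1f}: "
          f"A wins {win_a:.1%}, draw {draw:.1%}, B wins {win_b:.1%}")
```

If a small change in one input swings the predicted win probability by a large amount, the prediction depends heavily on a number you cannot estimate well from World Cup data, and it deserves less confidence.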