Aug 18, 2017

How to make predictions with no data

Our idea of the future is based on the past

How many piano tuners are there in Chicago?

Superforecasters & the Good Judgement Project

10 Commandments of good prediction

The Mayweather vs. McGregor UFC-Boxing crossover has sports fans and bettors gripped for the simple reason that it has no precedent. This provides fertile ground for debate and speculation but is a huge headache for anyone trying to predict the outcome with any certainty, as predictions rely on the past informing our idea of the future. So what can bettors do in these unique circumstances? They can turn to an approach developed by a Nobel Prize-winning nuclear physicist.

Prediction is a quantitative science. The more historical data you have, the better the chances of generating an accurate picture - via a system or model - of what the future might look like.

When someone builds on existing knowledge, identifying a new influence on a given event from within a data set, they are in a position to produce more accurate predictions than existing approaches do. Good examples are weather forecasting and Outright Premier League betting.

When there is no data, however, you are reduced to qualitative analysis - building reasoned arguments about what might happen. This may seem no better than wetting your proverbial finger and sticking it in the air, but science has helped in the form of the Fermi method.

Enrico Fermi was a celebrated physicist, regarded as the architect of the nuclear age. He won the Nobel Prize in Physics in 1938 and was the creator of the world’s first nuclear reactor. He is also remembered for his approach to making quick guesstimates to quantify something that is regarded as being impossible to reasonably compute given the limited information available.

When teaching this approach he was known to challenge his students with questions like the following:

How many piano tuners are there in Chicago?

This isn’t a trick question. Take a few minutes to think about the question and jot down a reductive argument - based on a series of sub-questions with estimates - that would enable you to arrive at a reasonable answer to that main question (without resorting to Google). Do that before reading on.

If you asked yourself the following sub-questions (or followed a similar logic but with slightly different questions) you would arrive at a good idea of the answer.

  • How many pianos are there in Chicago?
  • How often is a piano tuned each year?
  • How long does it take to tune a piano?
  • How many hours a year does the average piano tuner work?

Using your guesses for the first three questions you can calculate how many piano-tuning work hours there might be in Chicago annually, then divide that by the number of hours a tuner might work in a year to arrive at a reasonable guess of how many piano tuners that would support. Of course, to inform questions one, two and three you have to break them down into further sub-questions.

So for question one you would need to guess Chicago’s population using any knowledge of other US cities’ populations. You might guess somewhere between 2 and 2.5 million (it is actually listed as 2.7 million in 2016).

You then need to work out what percentage of people own a piano, which by rule of thumb could be one for every 100 (around 25,000 if you use the upper guess for population). Then throw in a value for bars, clubs, schools etc. so you might double the penetration to say two in every 100 or 50,000.

Questions two and three are simple intuition, unless of course you have domain knowledge. A piano is tuned around once a year and takes roughly two hours. Question four could be based on your own experience or an average full-time job working five days a week with standard holidays.

So if you guess that there are 50,000 pianos needing tuning once a year, taking two hours each to tune, that represents 100,000 tuning hours. Divide that by the 1,600 hours worked on average per year by a tuner and you would arrive at 62.5 piano tuners in Chicago.
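
The arithmetic above can be written out as a short Python sketch. The function and parameter names are made up for illustration, and the inputs are simply the guesses used in this article, not measured data.

```python
def estimate_tuners(population, pianos_per_100_people, tunings_per_year,
                    hours_per_tuning, tuner_hours_per_year):
    """Fermi estimate of how many full-time piano tuners a city can support."""
    # Sub-question 1: how many pianos are there?
    pianos = population * pianos_per_100_people / 100
    # Sub-questions 2 and 3: total tuning hours demanded each year.
    tuning_hours = pianos * tunings_per_year * hours_per_tuning
    # Sub-question 4: divide by the hours one tuner works in a year.
    return tuning_hours / tuner_hours_per_year


tuners = estimate_tuners(
    population=2_500_000,        # upper guess for Chicago
    pianos_per_100_people=2,     # homes plus bars, clubs, schools
    tunings_per_year=1,
    hours_per_tuning=2,
    tuner_hours_per_year=1_600,
)
print(tuners)  # 62.5
```

Changing any single guess and re-running shows how sensitive the final figure is to each assumption, which is the real value of breaking the question down.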

There is no definitive answer, though analysis of yellow pages (courtesy of Daniel Levitin) came up with 83, including duplicates, so if you got somewhere between 55 and 70 you are doing well.

Focus less on the accuracy of your answer than on the approach you took. This kind of mindset is conducive to accurate forecasting in the absence of data - similar to Mayweather vs. McGregor betting. If you didn’t quite understand how to approach this question, read the rest of the article then try again with another abstract question.

The piano tuner question has been used by Google, for example, as an interview question - helping to establish reasoning skill - along with similar questions like ‘How much does the Empire State Building weigh?’

Superforecasters - The Good Judgement Project

The Fermi method is discussed in Superforecasting: The Art and Science of Prediction, an excellent book by Philip Tetlock and Dan Gardner. The book takes a look at the development of the science of prediction, using the Good Judgement Project as a backdrop.

Over four years Tetlock had invited “20,000 intellectually curious lay people” to join the GJP and forecast the outcomes of a wide variety of geopolitical conundrums. His team were part of a wider initiative by IARPA (Intelligence Advanced Research Projects Activity) - an agency within the US intelligence community focused on improving the standard of their predictions around critical political and economic events that directly impact national interest.

IARPA created a prediction tournament featuring five teams led by leading scientists within the field, including the GJP and its amateur sleuths. Over five years IARPA raised close to 500 questions, with answers submitted at the same time every day.

Accuracy was measured using Brier Scores, which score each prediction by the squared difference between the stated probability and the actual outcome (1 if the event happened, 0 if it did not), so a lower score is better. Requiring forecasts to be made with a confidence level rewards well-placed confidence and punishes misplaced confidence in equal measure, and is an excellent way to properly distinguish tipsters.
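
As a rough illustration, a Brier Score for yes/no questions can be computed in a few lines of Python. This sketch assumes the common binary form (the mean squared gap between forecast and outcome); the original formulation sums over every possible outcome, which doubles these numbers for a yes/no question. The function name and the sample forecasts are invented for the example.

```python
def brier_score(forecasts, outcomes):
    """Mean squared gap between forecast probabilities and 0/1 outcomes.

    0.0 is a perfect score; lower is better.
    """
    if len(forecasts) != len(outcomes):
        raise ValueError("need one outcome per forecast")
    squared_gaps = [(p - o) ** 2 for p, o in zip(forecasts, outcomes)]
    return sum(squared_gaps) / len(squared_gaps)


# Three hypothetical questions; the third event did not happen.
outcomes = [1, 1, 0]
confident = brier_score([0.9, 0.9, 0.9], outcomes)
hedged = brier_score([0.5, 0.5, 0.5], outcomes)

print(round(confident, 3))  # 0.277
print(round(hedged, 3))     # 0.25
```

In this made-up sample the confident forecaster was right twice, yet the single confident miss costs enough to leave them behind someone who sat on the fence throughout.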

Bookmakers aren’t in the business of making predictions; they simply offer a measurement of how likely something is to happen, represented in the form of odds. In that respect, Pinnacle is on safer ground with established sports that follow a fixed set of rules and have good solid and accessible historical data.

We can then build models - based on what we know has happened - and arrive at a good idea of the chances of future outcomes, represented in the form of opening odds.
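
To make the link between odds and likelihood concrete, here is a minimal Python sketch that turns decimal odds into implied probabilities. The prices are hypothetical, not real Pinnacle odds, and removing the margin by simple proportional scaling is an assumption made for illustration rather than a description of how any bookmaker prices a market.

```python
def implied_probabilities(decimal_odds):
    """Convert decimal odds into margin-free probabilities.

    Returns (fair_probabilities, margin). Proportional scaling is just
    one simple way to strip out the bookmaker's margin.
    """
    raw = [1 / odds for odds in decimal_odds]   # raw implied chances
    total = sum(raw)                            # exceeds 1 because of the margin
    fair = [p / total for p in raw]             # rescale so they sum to 1
    return fair, total - 1


# Hypothetical two-way market: favourite at 1.25, underdog at 4.50.
fair, margin = implied_probabilities([1.25, 4.50])

print([round(p, 3) for p in fair])  # [0.783, 0.217]
print(round(margin, 4))             # 0.0222
```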

However, to reach new customers and broaden our appeal to existing users we need to go beyond that core offering, which takes us into the realms of new or fringe sports where there may be a fragmented record of previous matches - the best examples being eSports, Specials and elections.

Elections happen so infrequently and in such different circumstances that historical data provides very little value. Polls are unreliable for a multitude of reasons, while reliance on news is a minefield, meaning bookmakers often get politics betting wrong.

The Maiden Conundrum

Another great example of this problem is Maiden horse races (note Pinnacle does not offer horse racing but this is useful for reference).

Two-year-old Maiden races, involving first-time out horses (alongside those yet to win) are a great example of a betting event mostly devoid of solid form. Worse still, races are short with little margin for error should horses get in trouble as they step on a racecourse for the first time.

How do you predict the talent of a horse that has never run, in a race where success (from the trainer/owner perspective) may be measured in terms of simply giving their runner a positive experience of racing?

You ask a set of deductive questions along the lines of the Piano Tuner Problem:

  • How good is the horse’s breeding? How successful is the breeder?
  • What about the trainer? Their record with first-time out winners, and over the same course distance?
  • What about the jockey’s record in Maidens?

These questions can enable you to make a reasoned guess of the horse’s chances - ideally by combining them into a rating - and by using the Brier Score to accurately score your confidence level against the market.

In this way, these kinds of seemingly intractable problems represent an opportunity for bettors as bookmakers are in the same boat - albeit with an all-important margin. We have no model or maths to rely on, so our Traders will rely on their experience and knowledge and a Fermi-type approach.

The 10 commandments for good prediction

The challenges that the GJP faced are no different to those faced by bettors and bookmakers when they move away from traditional sports markets into the realms of exotic bets, which brings us back to Mayweather vs. McGregor. We have a reasonable idea about handicapping Boxing and MMA, but a boxer vs. a mixed martial artist essentially opens up a Fermi-type problem.

The good news here is, based on the findings of the GJP, there are some very practical things that were experimentally proven to raise the base level of predictions of these amateur forecasters.

Tetlock has actually distilled 10 commandments of good prediction based on the experience of the GJP. More detail can be found at www.goodjudgementproject.com - better still read the book - but here they are (adapted in short-form) and applied to betting and where applicable, the Mayweather vs. McGregor question.

Using randomised trials, Tetlock established that those reading a guidebook containing these tenets improved their Brier Score by 10%. That could be enough to move you into long-term profitability as a bettor.

1 - Focus on problems where your hard work is more likely to pay off, ignoring both the obvious and the intractable. There is little chance of you discovering something about the Premier League that isn’t already accounted for by the market. Find a realistic level - a Goldilocks Zone - where it is reasonable to assume value can be found with a sensible amount of time and effort.

2 - Break big chunky problems into a series of smaller ones. For example, ‘Who will win between Mayweather and McGregor?’ might become ‘What boxing form does McGregor have?’, ‘What are their respective motivations?’, ‘What style does McGregor fight, and what is Mayweather’s success-rate against that style?’ and so on. Assign values and a degree of confidence to your answers.

3 - Balance inside and outside views. In respect of Mayweather-McGregor that requires you to step outside of a boxing or MMA only assessment. McGregor has a huge following within the MMA community, who are no doubt backing him in large numbers, but is their inside view valuable here? Equally, how much do boxing purists know about MMA? Try balancing both views.

4 - Balance over/under compensation for new information. This basically encourages a Bayesian approach of incorporating new evidence, but cautions against over-reacting to new information as much as against sticking to your guns. This relies on experience and weighing the value of the source of information. A huge amount will be said online about this fight, so spend time figuring out the best sources of information.

5 - Challenge your prejudices. If you know your boxing and can’t see past a Mayweather win, challenge yourself to think of any scenarios in which he might lose, and vice-versa.

6 - Translate hunches into degrees of probability. An experienced forecaster will have a wider language than ‘Mayweather is a certainty’ or ‘McGregor has no chance.’ Their approach will reflect a more nuanced assessment measured in probability, not rhetoric. 

7 - Learn to balance over/under confidence. This means striking a balance between procrastinating to the point of inaction, and so missing an opportunity, and going all-in without making a measured assessment.

8 - Analyse both failures and successes with the same rigour. What is worse than being wrong is not taking ownership of where the mistake was. Equally, you can make the right decisions and still get the wrong result, and vice versa.

9 - Bring out the best in others and let others bring out the best in you. This relates to the team nature of the GJP, so will only be relevant if you are working as part of a syndicate or perhaps if you are very active on Social Media and willing to share your work and accept/give constructive criticism.

10 - Improvement only comes from putting your good intentions into practice. It is fine to see betting as recreation, just don’t expect to win in the long run. If you aren’t happy with that prognosis accept that you will have to commit time and effort to betting in a systematic and structured way.
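
The Bayesian approach mentioned in commandment 4 can be sketched with Bayes’ rule in Python. Every number below is hypothetical and chosen only to show the mechanics: a starting belief, plus a guess at how much more likely a piece of news is if your view is right than if it is wrong. The function and variable names are likewise invented for the example.

```python
def bayes_update(prior, p_evidence_if_true, p_evidence_if_false):
    """Update a probability for new evidence using Bayes' rule."""
    numerator = prior * p_evidence_if_true
    denominator = numerator + (1 - prior) * p_evidence_if_false
    return numerator / denominator


# Hypothetical: you rate the underdog a 20% chance, then read a report
# you judge slightly more likely to appear if he wins (60%) than if he
# loses (40%).
prior = 0.20
posterior = bayes_update(prior, p_evidence_if_true=0.6, p_evidence_if_false=0.4)

print(round(posterior, 3))  # 0.273
```

Weak evidence moves the estimate a little rather than a lot, which is the balance the commandment asks for; evidence that is equally likely either way leaves the estimate unchanged.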

Betting Resources is Mirio's brainchild. He joined Pinnacle over 10 years ago as a Copywriter and since then has made building the content presence his mission. Along the way he has assumed responsibility for Social, SEO and CRM but Betting Resources is his baby and he still finds time to contribute the odd article, usually around behavioural psychology and how it relates to betting. Fantasy dinner-party guests would include Daniel Kahneman, Nassim Nicholas Taleb and Edgar Allan Poe.