Apr 9, 2020

A guide to modelling in sports betting - Pinnacle Betting Podcast

The Pinnacle Podcast in a written format

Learn about modelling in betting

Inform your betting knowledge

Improve your betting skills

"Statistical Sports Models in Excel" author Andrew Mack shares unique insights into how statistical analysis can be applied to sports betting, as well as sharing insight into the mental rigours of gambling and how to build a betting model that you can use yourself. Read on to learn all about modelling in betting.

Ben:

Hello and welcome to the Pinnacle Betting Podcast, brought to you by Pinnacle.com, the online bookmaker that brings you the best odds, highest limits, and a unique winners-welcome policy. Today's episode is all about predictive modelling and how it can be applied to sports betting. So, it's a good thing I'm joined by a man who's authored a book called Statistical Sports Models in Excel. A very warm welcome to Andrew Mack.

Andrew Mack:

Hi, man.

Ben:

How are you Andrew? You all right?

Andrew Mack:

I'm doing well and yourself?

Ben:

Yeah, I'm very well, excited about this one. So, what we'll do is obviously as I said, we're going to be talking about modelling today. I think just to get started we can maybe take a step back a bit and you can give us an intro into who you are and what you did before and how you got into betting.

Andrew Mack:

Sure. Yeah. Well, my name is Andrew Mack. I'm currently a third-year law school student as well as pursuing an online Master of Data Science through James Cook University, and when I'm not doing either of those things, I'm modelling a number of different sports and betting on them. And for me, this all started a long time ago. Prior to law school I was a journeyman electrician, and my first introduction to gambling really was sometime around 2007. There was a bit of a poker boom going on in Alberta, with No-Limit Hold 'Em and online poker sites being very, very popular, as they were in many places around the world. I had a number of good friends that were professional poker players and I played quite a bit of poker back then.

I wasn't terrific at it, I would say, but I made a little bit of money in cash games at physical casinos in Alberta and also a little bit of money on Full Tilt Poker before it got shut down. And I would say that just before I get to my real entry into sports betting, I should probably introduce a little bit about the mathematical foundation, because I think that's important too. I have an undergrad in social science and did take a few introductory statistics courses back then, but I'd be lying if I said I paid a tremendous amount of attention to it at the time. It actually didn't seem super interesting.

And I think maybe that was because most of the data that we were working with was census data. It was disease per population and things like that and at the time, I didn't find it overly interesting, but I did take a few of those courses. And with my electrical trade school background, most of the electrical trade math is algebra, basic calculus, trigonometry, circuit calculations, power triangles, three-phase electricity troubleshooting and problem solving, things like that. And I think that having that really helped, even though at the time I wasn't aware of it. So, in 2011, somewhere around there, I picked up an eBook called Do It Yourself Sports Betting Systems. Not an overly technical book, but the basic premise was to hunt for undervalued dogs.

According to the author, you were basically looking for a .500 team, with a .500 or better record in the home or away situation, at positive dog odds. So, 2.0 decimal odds or better. So basically, we're trying to find 50-50 coin flips that paid out better than one-to-one. And that was actually my very, very first introduction to the idea that there could be some system involved in betting on sports. Now, being a complete novice at the time, I mostly ignored the author's advice and took a shot at picking my own winners, specifically in the NHL, which is where I started, mostly on the puck line and the money line. But really, I was just betting based on my personal opinions of the games.
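
[Editor's illustration, not part of the conversation: the "coin flip that pays better than one-to-one" idea can be sketched in a few lines of Python. The function names and the example odds of 2.10 are assumptions made for illustration; they do not come from the eBook or from Andrew's own models.]

```python
def break_even_probability(decimal_odds: float) -> float:
    """Win probability at which a bet at these decimal odds has zero expected value."""
    if decimal_odds <= 1.0:
        raise ValueError("decimal odds must be greater than 1.0")
    return 1.0 / decimal_odds


def expected_value(win_probability: float, decimal_odds: float, stake: float = 1.0) -> float:
    """Expected profit: win (odds - 1) * stake with probability p, lose the stake otherwise."""
    if not 0.0 <= win_probability <= 1.0:
        raise ValueError("win_probability must be between 0 and 1")
    profit_if_win = stake * (decimal_odds - 1.0)
    return win_probability * profit_if_win - (1.0 - win_probability) * stake


if __name__ == "__main__":
    # A true 50-50 game priced at 2.00 is break-even; at a hypothetical 2.10 it has a small edge.
    for odds in (2.0, 2.1):
        print(odds, round(break_even_probability(odds), 4), round(expected_value(0.5, odds), 4))
```

[At decimal odds of 2.0 the break-even win rate is exactly 50%, so the system only has value if the price is better than 2.0 or the team really wins more than half the time.]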

Hockey was a sport that I'd watched most of my life and was quite interested in, and I would have strong opinions about various teams. Oh, this player is hot right now, these guys can't win a game, et cetera. And I didn't really have too much interest in other sports at the time. Now, just betting based on my own personal opinions went about as well as you would imagine - I lost considerably in the beginning. Probably in the first two years, while I did win sometimes, the overall trend was losing money for sure, and it probably came close to about $10,000 of losses over two years. And it started to sink in that there's more to this than just picking who you think is likely to win, and there's got to be more to this than just your own gut-shot opinion.

And I had a bit of an epiphany when I discovered Bill James’ Pythagorean expectation, and the reason that clicked for me was because, while it's not exactly the same as the Pythagorean theorem, Pythagoras' theorem is something that I used a lot in electrical trade school because it's used to calculate variable power, reactive power, in power triangles. And so, it clicked for me. I realized, oh, there's a mathematical basis to this. I'm going to probably need to use math to solve this problem, just like you do with electrical problems, and that really is what led me down the rabbit hole of statistical analysis and eventually model making.
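
[Editor's illustration, not part of the conversation: Bill James' Pythagorean expectation estimates a team's winning percentage from what it scores and concedes. The sketch below uses the original exponent of 2 as a default; other sports are usually fitted with a different exponent, and the run totals in the example are made up.]

```python
def pythagorean_expectation(scored: float, allowed: float, exponent: float = 2.0) -> float:
    """Expected winning percentage: scored^k / (scored^k + allowed^k)."""
    if scored < 0 or allowed < 0:
        raise ValueError("scored and allowed must be non-negative")
    if scored == 0 and allowed == 0:
        raise ValueError("at least one of scored or allowed must be positive")
    scored_term = scored ** exponent
    allowed_term = allowed ** exponent
    return scored_term / (scored_term + allowed_term)


if __name__ == "__main__":
    # Hypothetical season: 800 runs scored, 700 runs allowed.
    print(round(pythagorean_expectation(800, 700), 3))
```

[A team that scores exactly as much as it concedes comes out at 50%, and the estimate rises as the scoring margin grows.]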

Ben:

So, you've got … It seems there were a lot of loose ends going on here. You mentioned that statistical background, the work that you did, obviously a bit of relative success with poker, an interest in sports and everything like that. Was it that initial change where you said, right, I'm going to switch up the approach and I'm going to start to take this seriously? Was it easy to stick to that, or did you find yourself falling back into those gut picks or trying to trust your knowledge over what a model said or what the data said?

Andrew Mack:

That's a great question. I would say that I made most of the mistakes that I think a lot of beginners make. So definitely, it was not a linear path to success. I certainly didn't have this realization and then never stray from it. Definitely, I had a few setbacks. Ultimately when you're a beginner, you have, I think, what's been termed unconscious incompetence, right? You don't know what you don't know. And so, you think, okay, well these gut shot picks aren't doing very well, but what about this trend stuff? What about this team's record against the spread, for example, or their last five games or their last 10 games?

I definitely looked at every different avenue and as a result fell into some of the traps that I think a lot of people just starting out fall into, in terms of not being able to properly identify, okay, this is probably not helpful and here's why. I tried a bunch of different things and many of them didn't work. And unfortunately, with sports betting you largely have to build your own toolbox or arsenal yourself, and part of that is taking a few hits along the way to learn some things the hard way, and certainly that was the case for me.

Ben:

Yeah, it's probably quite easy for you to look back now and think, God, I was square back then and I didn't know what I was doing. But at that point in time, you said you were losing a bit of money and stuff like that. Did you know you were quote-unquote “square” as a bettor, or did you just think, I'm having a run of rotten luck here, it's going to turn around? What was your mindset? Did you think you were going to win money and you were doing it the right way?

Andrew Mack:

From that time, I recall being really frustrated when I didn't win. I actually had an expectation of winning, but I didn't have a reasonable basis for that, and I don't know that I was even aware of the sharp/square categorization or dichotomy at that point. I was really just thinking, why isn't this working? There's got to be a way to do this. I just haven't found it yet. And I tried to keep a positive attitude but ultimately, like I said, I just didn't know what I didn't know. I wasn't … There were so many elements to sports betting, whether it's staking size [or] the proper statistical techniques that can help you actually make a reasonably accurate forecast.

[There are] so many different things that I just wasn't aware of, and I think that in many ways, I'm grateful for those mistakes because they led me to where I am now. But it was definitely a rocky road in the beginning there for sure.

Ben:

And obviously for me, I'm putting these things together: you've got the previous work, and the sports fan getting serious about betting, and then this law school thing comes along as well. At the time, were you thinking career-wise that you were getting serious about betting but not really seeing it as an avenue to earn a living, and the law school thing was just an interest in terms of career development?

Andrew Mack:

So, law school was an interesting story for me. I really enjoyed being an electrician. You really get a sense of satisfaction from building something and doing it well and using math to solve problems. I really enjoyed the work. What was becoming an issue for me was that my body was breaking down. So, the injuries were piling up, and I'm not that old; at the time I think I was 32. The stories that I could tell about some of those things would curl your hair and straighten it back out again. My last day of the week would be Friday. On Saturday morning, I used to take about a thousand milligrams of ibuprofen so that I could close my hands enough to hold a cup of coffee.

You wake up on the weekends and your knees hurt and your elbows hurt and your hands are swollen. And like I said, some of the injuries were piling up. I broke my foot. I broke my big toe. I tore my right rotator cuff. So, with some of those things, being a bit of a bigger guy, six foot three, 260 pounds, you're asked on a job site to do a lot of the really, really heavy lifting. If there's a piece of equipment that nobody else can move, that's something that you're going to be tasked with doing. And I was happy to do that, but those kinds of things started to take a toll and I started thinking to myself, if this is what this is like at 32, with the injuries and the relative chronic pain that I was having, what would it look like at 42 or 52 or 62?

And I realized that the odds were pretty good that I wasn't going to be able to do a whole lot physically 10, 20 years down the road. So, if I was thinking about different options, maybe I should look at that now. That's when I really looked into law school. I studied for the LSAT when I got home from working electrical and really got admitted to law school on the strength of my LSAT. I had a very, very strong LSAT and as a result was admitted, and my plan really was … Two things were at the forefront of my mind. The first one was, worst case scenario, I come out the other side of this with a law degree. I can practice law and that's something that I can do without racking up all of these injuries for the next 20, 30 years.

But the other idea that was in the back of my mind was law school will also give you some time to think. It will give you some time to refine the models that you've been working on, and if that goes well there is a good possibility that you can take a serious shot at betting [on] sports professionally. So, those were the two driving factors for me.

Ben:

So, you put your body under enough strain, then you decided to put your mind under some strain with the perils of law school and betting?

Andrew Mack:

That's right. Yeah.

In a market proven to be highly efficient how realistic is this?

Ben:

I think we've got some really good insight into kind of where you've come from in your career and in general as well in terms of betting. So, let's talk about more your day to day life now. So, you obviously keep yourself very busy with everything that's going on at the moment. Has that always been the case? Is that part of your makeup as a person?

Andrew Mack:

I don't think so, actually. I think that I'm just in a unique position here where law school is going well and I found a way to structure my classes, spread out throughout the course of the day, in such a way that I'm able to squeeze a little bit more out of the time that I have, and where I'm at with my modelling, I'm also able to squeeze in the master's as well. It's pretty tight. There are some days where I might skip a class in order to finish an assignment for the master's or vice versa. But I'm able to make it all work. I would say that I'm definitely running pretty close to maxed out and this is definitely not … I don't think that it's ideal.

I think that sports betting generally would be better for me if I had a little bit less on my plate and for that reason, I'm looking forward to finishing law school so that I can really just focus on it a lot more.

Ben:

When it comes to the balance of law school and betting and the statistical side of things, do they complement each other in terms of a skill set or is it almost a hindrance that you've got to shift from this mindset of law brain to then this data-driven betting brain? How does that work?

Andrew Mack:

Yeah, that's a really good question. I would say that there are two elements to that. Two sides to that coin. The first part is that there are elements of law school that do complement it. And I would say that with respect to the law, the understanding and learning of law, it's an inherently probabilistic discipline. Even though it's not science-based, you might consider it the art of probabilistic thinking rather than the science, the mathematics of it. It really forces you to question certainty because for every case precedent, every fact pattern, every legal argument, there is a counterpoint or a counter-argument or a case precedent that disagrees, and so nothing is 100% certain.

There are no 100% winning cases or 0% winning cases. It's always a question of stronger or weaker arguments, or more likely or less likely to succeed in any given legal context. And so, I think that type of thinking generally is helpful for sports betting because it helps to guard against overconfidence, where you realize that even when you have an exceedingly strong case or a very strong argument, there are counterpoints to consider and you really shouldn't think of anything as a quote-unquote lock. And that's as true in law as it is in sports betting, even when most of the law is on your side or, in the case of betting, most of the probability is on your side. An 80% chance of winning is not a lock. It's 80%. Right.

So, I would say that element is very helpful and definitely complementary, as you said. The part that's difficult, at least for me, is that legal argumentation is in many ways the reverse of the scientific process, because you actually start with the outcome that you want and then you work with the case law and the arguments and the fact pattern that's available to lead to your foregone conclusion. And so, you work from the end goal back through the facts, which is the opposite of what we're trying to do with most statistical analysis. We don't want to have a preconceived notion of what's going to happen.

We need to test things and we need to question assumptions and things of that nature. And so, I think that sometimes for me that's a lot to keep in your mind simultaneously. The two very, very different modes of thinking. But that being said, I think that learning to think critically and analytically, whether you're using it for law or sports betting is a very useful skill.

Ben:

And then when you do have time to put into the betting side of things, what's the process that you're going through? How much time are you weighting towards the building or development of models versus the actual act of running those models, finding the bets and placing the bets?

Andrew Mack:

I would say that it's very heavy on the creating and testing and finding, and once you get them up and running there's not a whole lot to it, other than that you then become much more involved in monitoring changing game conditions like lineup changes or injuries, and market price movements, and a little bit of calculation with regard to staking size. So if you want to use fractional Kelly on multiple simultaneous bets, there are some additional calculations to be done there. But other than that, I would say that running the models is relatively light on the workload compared to the building of the model.
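
[Editor's illustration, not part of the conversation: a minimal sketch of fractional Kelly staking for a single bet. It deliberately does not cover the multiple-simultaneous-bets case Andrew mentions, which needs extra calculation. The quarter-Kelly default, the function names and the example numbers are assumptions for illustration only.]

```python
def kelly_fraction(win_probability: float, decimal_odds: float) -> float:
    """Full-Kelly share of bankroll for one bet: (p * odds - 1) / (odds - 1), floored at zero."""
    if not 0.0 <= win_probability <= 1.0:
        raise ValueError("win_probability must be between 0 and 1")
    if decimal_odds <= 1.0:
        raise ValueError("decimal odds must be greater than 1.0")
    edge = win_probability * decimal_odds - 1.0   # expected profit per unit staked
    net_odds = decimal_odds - 1.0                 # profit per unit staked on a win
    return max(0.0, edge / net_odds)              # never stake on a negative edge


def fractional_kelly_stake(bankroll: float, win_probability: float,
                           decimal_odds: float, fraction: float = 0.25) -> float:
    """Stake a fixed fraction of the full-Kelly amount to reduce variance."""
    if bankroll < 0:
        raise ValueError("bankroll must be non-negative")
    if not 0.0 < fraction <= 1.0:
        raise ValueError("fraction must be in (0, 1]")
    return bankroll * fraction * kelly_fraction(win_probability, decimal_odds)


if __name__ == "__main__":
    # Hypothetical bet: the model says 55% at decimal odds of 2.00, bankroll of 1,000 units.
    print(round(fractional_kelly_stake(1000, 0.55, 2.0), 2))
```

[With those made-up inputs, full Kelly is about 10% of bankroll, so quarter Kelly stakes roughly 25 units. The stake is only as good as the model's probability estimate, which is one reason bettors scale it down.]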

And I think that's pretty straightforward to most listeners because most model ideas just don't work out. So, you need to have a lot of them and you need to keep trying things, and that's time-intensive. I wouldn't say that it's completely a brute force type of thing, but in many ways it can feel like that. When you've been working on a model for 40 hours and it turns out that it's not very good, it can feel a little bit like a brute force endeavour. But I would say that, yeah, if you almost wanted to consider it like a Pareto Principle, I would say 80% is on the building, the conceptualization, the testing, and 20% would be running them and updating them and making sure that everything is running smoothly.

Ben:

And then you said that there's so much that goes into it, whether it's lineup changes or the odds simply moving in the market and stuff like that. If you're also limited for time, are you manually betting or is it an automated process for you through APIs and things like that?

Andrew Mack:

I would like to eventually move to automation through APIs. I know Pinnacle has a great API package for doing things like that, but I haven't done that just yet because I want to personally inspect and approve every bet that my model would suggest before it actually goes through. I don't … That's just me though. I don't know that it's necessarily better to do it that way. I definitely can tell you that, once I have more free time and law school is finished, I do plan on transitioning to more of a full automation just to lighten some of the workload that I have for myself. Some things I have are automated now, which is nice, but definitely I need to move more in that direction to become more efficient with my time.

And to speak back to your point about time limitations, the net effect of that at this point is that basically there are some days where I'm just not able to bet because I'm not able to actually take a look at all the necessary factors and place the bet at the appropriate time. And so, there's a couple of days a week where I might just not be able to bet that day because of my other time commitments. So, I try to get it in a few days of the week and on the weekends at the moment.

Ben:

Right. We want to get on to modelling and your book very shortly. As I said, a couple more questions just for me to get to know you and your betting as it is at the moment. Obviously it's modelled, but are you betting on specific markets or specific sports at the moment?

Andrew Mack:

Yes. As of right now, I'm betting predominantly on the NHL as it just opened up recently. So, NHL money line, totals, occasionally the puck line, some NHL props including shots, saves, goals, points. I am eagerly anticipating the start of the regular NBA season because that will be a very high-volume market for me, and in that market I’ll be looking at quarters, halves and full game point spreads and totals. For props, it will be player points; points, rebounds and assists as one prop; turnovers; three-pointers made; blocks and steals. And I think a little bit of CFL, and I think that's most of it right now as of this period in the year.

With regards to other seasons in the year, other things that I've done or looked at: MLB was a tough season for me, mostly because I've had some trouble forecasting the bullpen, and so, while I continue to try and work out the kinks on the full game models, most of my betting for the MLB that's been successful has been prop bets. So, strikes; hits, runs and errors as a prop; home runs; runs in the first inning; and a little bit of first five innings, because it's usually the starting pitchers that are still playing at that point, and so I've had a little bit more success with that. But really, over the course of the year, anything that I think I might have an edge on, I will take a look at. I do some NFL props as well: pass attempts, passing yards, receiving yards, rushing yards, touchdowns. A little bit of small league Euro basketball, as has been mentioned in the book and, I think, our previous Pinnacle Podcast chat.

Some AFL, if I have time, and I'm also interested in the English Premier League exact score markets. I occasionally find a little bit of value there as well, and I think that's most of what I'm up to these days.

Ben:

So the leagues themselves are what many people know to be very efficient leagues, and are obviously major betting leagues for a lot of bookmakers. And it seems that you're then digging around in the markets within those that perhaps might be a little less efficient. But have you scaled up with your modelling and, I don't know, gone from KHL or Euro league basketball and moved up through those levels to where you're now [at] a point where you need those high limits to get your action down, or is it just that you're dedicating time to those markets and that's why you're betting on them?

Andrew Mack:

A little bit of both. I suppose I should say that as your models get more sophisticated and more nuanced, you can progressively work your way up into more and more efficient markets. It's certainly not the place that you should start, which is a point that I continue to repeat over and over again on Twitter and in the book. But you can definitely get there provided that your models have the required level of sophistication. And you made a really good point with regards to these sharp markets: not every sub-market or derivative market inside what we would consider largely to be a sharp market is at the same level of efficiency. And so, as a result of that, there are frequently opportunities that present themselves. Props are a really great example of that.

And with regards to the other things, I definitely started out with smaller basketball leagues like the Icelandic women's basketball league and some other European basketball leagues. And then worked my way up with the hockey model. I never actually focused much on the KHL. I really just started with the NHL and eventually progressed it to the point where it's showing good value and producing good results even though the market is quite sharp. So, slightly different approaches for each market. Although I would say that basketball and baseball are very sharp and require quite a serious commitment to your model's sophistication.

Ben:

Well, I've heard a lot of talk about modelling. We’ve danced around the subject. Let's get into it and discuss your book as well in a little bit more detail. So, you published Statistical Sports Models in Excel earlier this year, and for anyone that hasn't read the book, could you just maybe give us a brief intro into it and tell us what it's about and why people should read it?

Andrew Mack:

Sure. Yes. Statistical Sports Models in Excel is essentially a book that I wish had existed when I first got into modelling. It's basically designed to be a crash course in foundational statistical modelling techniques for the explicit purpose of sports betting, with the heavy technical language and statistical formulas removed to make it easier for beginners to understand. And I think it will give readers some new model ideas, techniques to make their own forecasts for games, and ways to help determine if a model that they've made is working out or not. So really, hopefully it will jumpstart our readers’ quest or endeavour to eventually build a +EV sports betting model.

Ben:

Yeah, I can certainly testify the book sets out to do exactly what it says on the tin. I'm far from an expert on Excel and it certainly taught me a lot in a fairly short space of time. And with Excel being in the title - is that when we talk about your models and what you're doing with modelling in general. Is that what you're using or are you now looking at different programming languages?

Andrew Mack:

Well, as I mentioned in the book, one of the reasons that I felt safe, I guess, if you will, to write the book is because I have moved on to R and, to a lesser extent, Python and MySQL databases and things like that. So, I'm using a more sophisticated process now. And obviously as your model becomes more sophisticated or requires more sophistication, tools like that are only going to help you. So, some of the things that I was doing in Excel previously I'm not really using anymore. And so, I felt free to share those with other people and have them help other people learn and get up to speed more quickly, but at the same time not really have to reveal too much about what my current betting models are looking like or doing, and so that was the impetus for the book.

Ben:

I think if you mention Excel or Python and stuff like that, you quickly get these battle lines that are drawn in the sand and people are quick to defend whatever they use the most. And I guess as someone that's used those three that we just talked about, could you maybe touch upon the strengths and weaknesses of each, for anyone that maybe doesn't know or only has knowledge of one thing?

Andrew Mack:

Sure. Yeah, I would say in some ways, certainly with R and Python, it's very much a Coke versus Pepsi argument. Everyone has maybe a personal favourite, but whether or not that's totally justified, other than purely by familiarity, I'm not totally certain. I would say that what's great about Excel is that it's hyper visual. Its point and click interface makes it very easy for someone that is just getting started to understand what's going on behind the scenes. And the most important thing that you could do is pop the hood and understand what's going on with the engine. You want to be able to click on a cell and see the formulas that cell is using, because it helps you understand the processes and the functions, and all of that is going to build your understanding of what's happening.

And I think that is one of the understated strengths of Excel: it's so visual that it's easy to spot and troubleshoot mistakes that you've made. It's easy to see what's going on and all of that can provide you with a greater understanding. I would say that its weakness is probably data wrangling because, as many people know, when you get any amount, any type of sports betting data, you are going to have to do a tremendous amount of data wrangling to turn that into usable information. So, it's going to have blank inputs. So, certain players didn't play that game or they didn't record a certain statistic that game, they didn't get any rebounds or they didn't get any assists, or whatever the data is, there's going to be empty cells, empty values, there's going to be outliers.

So, maybe inputs that aren't totally helpful to what it is that you're trying to do. So, generally with data science, you want to deal with your empty values or null values. You want to deal with your outliers. You ultimately would like it to be formatted in a way where statistical analysis is easy to do. And so how you structure the columns and the rows is very important. All of that in Excel can be quite challenging, and so when you get into the really heavy work with thousands and thousands of data points, that part can be a little bit cumbersome in Excel certainly.

And whether you're talking about importing the data automatically via scraping from a website or you are just wrangling the data, which could easily be 70, 80% of the data science process, those two things become fairly tedious in Excel and that's usually when most people start to think about maybe other options for doing it. So that's Excel. I've heard mixed things about R, in that some people really think that it's quite user friendly, some people think that it is almost uninterpretable. And a lot of that seems to have to do with how much of a computer science background you have. The computer science background people that I've talked to seem to love Python and have a certain amount of disdain for R just because of the way that the inputs are set up. I personally found R to be very user friendly, although I don't have a computer science background, so that might be an element to it. I found that the coding was fairly approachable and made a certain amount of sense.

I would say that its strength is that it was built primarily for academics. And academics usually pioneer the leading-edge packages. So, whatever … if there's a new machine learning algorithm that's just come out, it's very likely to show up on R first, so you have a lot of totally free cutting-edge tools that you can use with R. And I think that makes it very, very useful for sports betting. For web scraping, if you want to scrape odds and things like that, I don't know if it's quite as good as Python. I think that Python is a little bit easier to use for scraping web data and that to me is the real split. I think that R is really good for the machine learning and the statistical analysis. It's a little bit more cumbersome for the scraping.

Python is also very good for the machine learning and statistical analysis as well and a little bit better for web scraping. So really, either or whatever works for people.
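
[Editor's illustration, not part of the conversation: the wrangling chores described in this answer (blank inputs, outliers) can be sketched with nothing but Python's standard library. The game log below is invented, and the 1.5 x IQR rule is one common convention for flagging outliers, not necessarily the one Andrew uses.]

```python
import csv
import io
import statistics

# Hypothetical game log: a blank cell means the stat was not recorded (e.g. the player sat out).
RAW = """\
game,points
1,22
2,
3,19
4,25
5,61
6,
7,21
8,18
"""


def parse_stat(cell):
    """Turn a text cell into a float, keeping blanks as None rather than silently using 0."""
    if cell is None or not cell.strip():
        return None
    return float(cell)


def load_points(text):
    """Read the 'points' column, preserving missing values as None."""
    reader = csv.DictReader(io.StringIO(text))
    return [parse_stat(row["points"]) for row in reader]


def iqr_outliers(values, k=1.5):
    """Flag values more than k * IQR outside the first and third quartiles."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    spread = q3 - q1
    low, high = q1 - k * spread, q3 + k * spread
    return [v for v in values if v < low or v > high]


if __name__ == "__main__":
    points = load_points(RAW)
    recorded = [p for p in points if p is not None]
    print("missing games:", points.count(None))
    print("flagged for review:", iqr_outliers(recorded))
```

[The sketch flags unusual values for review instead of deleting them, since a very high-scoring game may be real rather than a data error.]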

Ben:

I think anyone, no matter what their preference, they can't have any complaints with that. That’s a nice description there, Andrew, very fair, a nice description from you. I guess one of the questions I often have with these languages is that people tend to say that they're similar to a spoken language in the sense that once you learn one, it then becomes a lot easier to learn another. Is that something that you'd agree with?

Andrew Mack:

I don't know if I would totally agree with that. But I found R was a lot easier to learn than Python and that's just me. Other people would obviously have different experiences. I found that Python was a little more, it seemed a little more technical to me, I guess would be the word I would use to describe it. Certainly not that you can't learn it and certainly not that there isn't some level of crossover because there is, but R and Python do have different ways of doing things, assigning variable names and other things like that and you get used to one and then you try to switch to another, sometimes, you make a few mistakes here and there and you go back and you fix it and carry on.

So, I don't know if … I guess in one sentence, in terms of getting your mind used to thinking in code, it is helpful. You learn one and then it becomes easier to learn another one. But obviously the details will be different for each of those respective programming languages.

Ben:

So, let's imagine someone's hovering over the purchase button for your book. Obviously they need to be aware that, as good as your book is, it's not a case of buying it, reading it and you're set to go, you're going to start making money from sports betting through the modelling side of things. Obviously learning the languages is one, but what skills and traits do you think are important for someone who's interested in using models in their betting?

Andrew Mack:

Skills and traits. First I would say positivity, which may surprise some people, but I think that, if you don't approach it with the right attitude, you're going to give up before you ever even get a sniff of success. So, I think that's actually a very undervalued attribute. Curiosity, I would say, because really when you're trying to develop things for yourself, instead of having a negative attitude and predetermining why certain things won't work or are unlikely to work, it's much better and more helpful to be curious and to ask yourself, well, what if this worked or what if we could do things like this, and then you try it. So, a curiosity and a willingness to explore and experiment even at the risk of a foolish or silly experiment, I think is a really, really positive trait.

Also, the ability to think critically and analytically and to consider contrary or contemporaneous evidence to the contrary of why something might or might not be causally related or connected. Those are all good things to have as well and really just a desire to keep learning and keep improving because the moment that you think that you've finished learning is a dangerous moment for a modeller, because it's very much an arms race when you're building a model that is trying to outpace the market.

Ben:

My next question is probably going to be quite an obvious one, but someone needs to … They’re at this point in the journey to try and to become a bettor for whatever reason, these skills and traits that you've just described, they're aware of them. They haven't quite clicked them into place in the past. The thing is people want money to almost find that motivation. But if someone isn't at that point yet, how do you, what would you say to someone that says, look, I'm really struggling. I know what I need to do, but I just can't get that final little push to do it. Is there anything you'd say to them that would help them get there?

Andrew Mack:

Yeah. There's a quote from the founder of IBM that my dad used to say when we were little kids. That was, if you want to double your rate of success, you need to double your rate of failure. Which was, I always thought it was an interesting quote. There's something about that where you really have to get your hands dirty and you have to start making some mistakes because when you make mistakes and you learn from them, you will improve.

Whatever it is that you're doing. You continue to keep reading, you continue to keep trying to find new ideas, but above all, keep trying things and keep making mistakes because every mistake that you make, you're going to learn from that. You're going to realize, okay, that didn't work and you're going to build for yourself almost like a database in your mind of experimental ideas that worked or didn't work or looked promising.

And when you start amassing those, you start having better ideas. And so, a lot of people seem to think that maybe they can just think about one thing, code it, put it all together and boom, it's going to work and instant glory and riches, and that's just really not the way that it works. You really have to learn from your mistakes and you have to make a lot of mistakes in order to get to where you want to be. I think there's another expression that an expert is someone that's made every mistake that can be made in a very narrow field. And I think that there's a lot of truth to that.

Ben:

Props to your dad, Andrew. I think you grew up in the right household. So, outside of inspirational IBM quotes, if we're looking at tools or resources that people might use, not to shift that mindset but to help them develop the actual skills to build models and help them to find success in betting, are there any websites or blogs or any material out there that you've thought was really useful in your journey?

Andrew Mack:

I would say there is one, and forgive me for starting off maybe at the more complicated end, but there is an eBook currently out by a guy named Jason Brownlee from Australia and it's called Machine Learning Mastery. And it's an eBook series on both R and Python. And the premise of his eBook is essentially that developers don't always understand the statistical nuances of machine learning. So he put together these eBooks where he gives you a crash course in machine learning, and then walks you through templates of code to run all of the various machine-learning models.

So they're there, it's almost like a basic template for every machine learning model that you might want to start with, whether it's regression or classification. And he goes through a number of different things and basically the example code that he gives you … he uses some very simple example data sets, but the example code is worth a hundred times the price of the eBook because you have an example to visually see how to work this in. And you can take out the example data, plug in sports data and just try it and you will immediately begin slowly understanding how you might be able to apply machine learning to a sports data set.

And I think that's probably one of the best machine learning resources that's available. Very helpful and definitely helped me to get up to speed in both R and Python for the actual machine learning element of it. So, I would recommend that if people are interested in that they should definitely check it out. With regards to some of the more basic stuff it … Google is your friend. I think that that's been said before. I think Rob Pizzola said that in his chat with you. But it's so true. Google is your friend, if you have a little notepad and you write down statistical terms and things that seem interesting.

And then just start Googling them and trying to look for research papers or videos or anything that you can find that might help explain that. That's very useful. You might want to know: What is a Bradley-Terry model? What is a Glicko rating system? How is it different from an ELO rating system? What is the TrueSkill rating? What is multicollinearity? What does a p-value mean? How is it different from a t-statistic? There are so many questions that you can ask, and really how helpful Google is to you is very dependent on how good the questions that you ask are.

So that was definitely a huge point for me and I think that's something that a lot of people should focus on as well if they're trying to improve. What else did I do? I pretty much bought every statistical modelling and sports betting book that I can get my hands on. Absolutely outrageous number of books here at the moment. Let's see, basically I've got everything from algebra and calculus textbooks to probability textbooks, Bayesian textbooks, data science books for R and Python, machine learning model books. And almost every sports betting book that has ever been written.

So, I try to read as much as I can about how other people have approached these problems. Not because I want to copy what they're doing, but because you never know when one little sentence or one little paragraph or one little example is going to give you a light bulb going off in your head that pays for the price of the book a hundred times over. And that happens in almost every book. There's something in there that you think, Oh, that's interesting. Maybe I should try and work with that or give it a try. Where else can you look for good tools and resources?

I really liked reading statistical research papers to see what academics have done with regards to sport. There are literally thousands of masters theses and PhD theses where people have tried to solve the same problems and well, they've given away most of what they're doing in terms of methodology. There are some really good ideas there and you can get some really excellent concepts about maybe some superior ways to think about the problem. And if you have some tools to help you think about the problem, you're going to come up with better solutions.

So, reading research papers, even if you don't understand the entire paper, I think is a very, very useful thing to do. What else have I done? I've contacted and had meetings with the professors in the statistics department at university just to ask them some questions about various techniques, about Bayesian statistics and things like that. That was tricky, because academics are very risk averse generally and they're not usually very keen on betting. So, you have to ask these questions without telling them that that's what you're interested in, which can be a little bit of a tricky situation.

But I did learn some things from a few of the professors at the university here from just talking to them. That was handy too. And what else back to a previous point, but just experimentation, you never know what you're going to find just by trying different things. And one example I can give of that is there are a number of random Excel plugins for example, that some of them turn out to be pretty handy. So, I found a plugin for Excel called the Chess Ranking Assistant. And the Chess Ranking Assistant allows you to input the names of chess players in your high school chess team, group, whatever.

And input the match results and press a couple of buttons and it will automatically calculate the ELO ratings for all the players. It will automatically calculate the Glicko ratings for all the players and you can use that to make predictions. It will give you a basic forecast for who would be likely to win a game between player four and player two. Not like you're going to be able to create a fully rolled out betting model with that. But that could be very handy for learning how a rating system like ELO or Glicko works. So there's a lot of little plugins that you can play with to find new ideas and things like that.
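As a rough illustration of the kind of calculation such a plugin performs, here is a minimal Python sketch of the standard Elo expected-score and update formulas. The K-factor of 20 and the starting ratings are assumptions chosen for the example; they are not taken from the plugin or from the book.

```python
def elo_expected(rating_a: float, rating_b: float) -> float:
    """Expected score for player A against player B under the Elo formula."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))


def elo_update(rating_a: float, rating_b: float, score_a: float, k: float = 20.0):
    """Return both players' new ratings after one game.

    score_a is 1.0 for a win by A, 0.5 for a draw, 0.0 for a loss.
    k (the K-factor) is an assumed value for illustration.
    """
    delta = k * (score_a - elo_expected(rating_a, rating_b))
    return rating_a + delta, rating_b - delta


# Hypothetical players: a basic forecast, then an update after A wins.
forecast = elo_expected(1600, 1400)
new_a, new_b = elo_update(1600, 1400, score_a=1.0)
print(round(forecast, 3), round(new_a, 1), round(new_b, 1))
```

The forecast is only as good as the ratings behind it, which is why this is better treated as a learning tool than as a finished betting model.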

Ben:

So just to try to put myself in the shoes of this beginner, this aspiring model builder that you very kindly lent your library of books. They're doing all they can, reading all these materials. But for some people it's nice to have an example, and I don't want to kind of give too much away from your book or anything like that, but obviously there's the process of creating, testing, refining and measuring that goes into a model. Can you just kind of speak a little bit to what's actually involved and perhaps maybe use one of your previous examples?

Andrew Mack:

Sure. I actually prepared a new example for today because I know that a lot of listeners are interested in the finer technical details and sometimes they can get frustrated when we talk from a 30,000 foot view about these things. And so I have two examples, two different approaches that I think will be helpful. One of them is for props and one of them … actually, they're both for props. One of them is a little more traditional handicapping and the other one is more statistically oriented. So I'll just start with the quick one first, which is I don't know if you want to consider this another resource or if this is more of a process.

But for something like props, the best possible advice that I could give someone that's just starting, that has no idea what to do. Would be to pick one team from a league or one player from a team and watch that player or that team for an entire season and take notes. But not take notes about the things that fans are interested in. Observe the game from an analytical perspective and focus on the things that might add or subtract from a given prop line. So, the reason that I say that is because in many cases there are certain elements that are not fully included in the line that can become apparent if you watch the game with the right analytical approach.

And the example that I'd like to give is NBA rebounding props. Now most people know that rebounds are impersonal in nature and it's been talked about in a number of other books about, rebounding props and you can do a lot of statistical analysis on rebounding props, which is all very well and good. But this particular example, we're going to go in a slightly different direction, which if you say that you were to just pick one basketball player and watch them every game that they played for an entire season and take notes.

Don't watch where the ball necessarily is. Most of the fans are watching, or watching to see the ball go through the hoop, which is the entertainment element of it, what [you] really want to watch is the positioning on the floor where the players are situated relative to the other players who they’re guarding, who they’re matched up against. And some of the characteristics of that matchup. And the reason that I’m saying all this is because a player that is tasked with defending an opponent that has a different attack scheme or attack tendency can definitely affect that player’s rebound opportunities.

So if the opponent that the player's matched up against is someone that likes to drive to the paint to make layups and close-to-the-bucket shots, that has a tendency to collapse the defence towards the basket, which means that your player in this prop scenario is closer to the basket and therefore more likely to get defensive rebounds, and defensive rebounds make up the bulk of the rebounding opportunities in a given game for the purposes of total rebound props. Conversely, if the opponent that this player is matched up against is a perimeter wing or a perimeter shooter, the defending player is going to be closer to the three-point line, further away from the basket.

And as a result, they're not going to get as many defensive rebound opportunities because the centre or whoever is closest to the basket is more likely to grab those. And so, you would be able to ascertain that by watching the player over the course of a season and taking notes on those things, but it might not be fully included in the line and it might not be as easy to see that if you're just looking at the data of their average rebounds per game or their average rebounds versus this particular team or whatever else.

There was an interview with Bob Voulgaris where he described having six or eight televisions all in front of him, where he would watch six or eight NBA games simultaneously and take notes to supplement his statistical model. But the question that he thought was one of the most important was: what's not in the line, right? What information would we be able to get from the game that's not already in the current market line? And so, I thought that was a great question. Very helpful question.

And that is just one little example of a way that you could start to develop some insights of your own by watching NBA games for a total rebound prop opportunity. To get into some finer details about how we might go about building a model, I'll now transition to a basic framework for how you might put together a model for a prop like expected strikeouts in Major League Baseball. So in this example, we're just going to do expected strikeouts for a number-of-strikeouts prop bet, something that you can find on most bookies, including Pinnacle.

The first step for me is I actually brainstorm the ideal case. So I ask myself an open-ended question or conceptualize it in an open-ended way to try and get the brainstorming going. So I would say something like, if you could tell me exactly how many batters pitcher X will face and exactly what his strikeout percentage would be, I could tell you his expected number of strikeouts. Now that's just to get the ideas moving. Obviously we're never going to be able to do that perfectly, but we want to try and come up with ideas that might help us approximate that. So that leads to the next question, which is: What information would we need to try and approximate this or turn this into a reality?

Well, the first step is going to be to create a base expectation for strikeouts or strikeout rate, because there's more than one way to do it. If you look at pure strikeouts, there's going to be a little bit of statistical noise in there because we're not really mapping the underlying process. We're mapping the result, which is something that I talk about a little bit in the book. But say that we're going to use Walker Buehler as the pitcher. Maybe the first thing that we should do is take all the games he's played and how many strikeouts he's got, and do a distribution fitting to see what distribution appears to fit the data.

And we might discover that it's largely Poisson in nature, which will help us when we want to convert our base expectation into a probability later. So back to creating the base expectation: how can we take strikeouts and convert them into something that maybe is more predictive, like an expected strikeout rate? Because in what's actually happened there'll be a little bit of noise, and if we could get rid of that it would really be handy. Well, for strikeout rate, we'll probably be using some permutation of a regression analysis of some kind. And as it turns out there are a number of resources online available for this particular prop.

And certainly, there are more advanced ways to do this, but it's possible to create a regression for expected strikeout rate using a pitcher's strike percentages: their looking strike percentage, their swinging strike percentage and their foul strike percentage. And if you put all those together, you can come up with a regression model that does a pretty decent job of mapping the expected strikeout rate. So if we now have an expected strikeout rate as a base forecast, we now need to make a few additional refinements, right? So obviously there are a few other things that we know about this game that's about to be played that we can use to help us.

The first is the opponent, right? What does the opponent's strikeout rate look like? And we can go as deep into this as necessary or as high level as necessary. But for the purposes of simplicity, let's just assume that we're just going to look at some team-based stuff. So for the opposing team, we might want to know how good or bad they are in terms of their strikeout rate. Is this a team that strikes out a lot because they're swinging hard at every pitch, or are they very selective and they walk a lot and whatever?

So an easy way to do that would be to take the opponent team's strikeout rate and divide by the league average strikeout rate to get a percentage adjustment. And if we do that for every team, we will end up with something like 0.99 or 1.02, so 1% below average, 2% above average, etc. If we use that and multiply it by the expected strikeout rate, we now have an opponent adjustment for the expected strikeout rate as well. So now we have the pitcher's expected strikeout rate based on a linear regression as well as an opponent adjustment.
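The opponent adjustment described above is a simple ratio, and can be sketched in a few lines of Python. All of the rates below are made-up numbers used purely for illustration, and the function names are hypothetical.

```python
def opponent_factor(team_k_rate: float, league_k_rate: float) -> float:
    """Opponent's strikeout rate relative to league average (1.0 = average)."""
    return team_k_rate / league_k_rate


def opponent_adjusted_k_rate(pitcher_k_rate: float,
                             team_k_rate: float,
                             league_k_rate: float) -> float:
    """Scale the pitcher's expected strikeout rate by the opponent factor."""
    return pitcher_k_rate * opponent_factor(team_k_rate, league_k_rate)


# Hypothetical inputs: the opponent strikes out slightly more than the league.
league_rate = 0.230      # assumed league average strikeouts per batter faced
team_rate = 0.2346       # assumed opponent strikeouts per batter faced
pitcher_rate = 0.290     # assumed regression output for the pitcher

print(round(opponent_factor(team_rate, league_rate), 2))
print(round(opponent_adjusted_k_rate(pitcher_rate, team_rate, league_rate), 4))
```

A multiplicative factor like this is the simplistic version; nothing here accounts for lineup changes or handedness splits.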

So if we take that and say that that's our base expectation, which has now been adjusted for our opponent in a simplistic way, we now need to know a few other things to answer the question that we posed to ourselves at the outset of this. We need to know how many batters we expect Walker Buehler to face for every inning that he pitches. And we also need to know how many innings we expect him to pitch, because obviously the larger the number of batters that he faces at a given strikeout rate, the higher the number of strikeouts should be, which only makes sense. So that's the next step of the process.

So we might just look at averages to be simple here. Like I said, you can go much more in depth than that and certainly you should. But if we take the expected innings pitched and we times it by the expected batters faced per inning pitched and we times that by our opponent-adjusted expected strikeout rate, we should have an expected number of strikeouts, and then that would be the number that we can take and plug into a Poisson distribution, and then derive a probability for whether Walker Buehler should be over or under X amount of strikeouts given a market price.
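The last two steps (multiplying the three quantities together, then turning the result into an over/under probability with a Poisson distribution) might be sketched as follows. The innings, batters-per-inning and strikeout-rate inputs are hypothetical placeholders, not real projections for any pitcher.

```python
import math


def expected_strikeouts(innings: float, batters_per_inning: float, k_rate: float) -> float:
    """Expected innings x expected batters faced per inning x adjusted strikeout rate."""
    return innings * batters_per_inning * k_rate


def poisson_pmf(k: int, lam: float) -> float:
    """Probability of exactly k events under a Poisson distribution with mean lam."""
    return math.exp(-lam) * lam ** k / math.factorial(k)


def prob_over(line: float, lam: float) -> float:
    """P(strikeouts > line). For a whole-number line, a push (exactly the line)
    counts as neither over nor under and is excluded from this probability."""
    p_at_or_under = sum(poisson_pmf(k, lam) for k in range(math.floor(line) + 1))
    return 1.0 - p_at_or_under


# Hypothetical inputs: 6 innings, 4.2 batters per inning, 28% adjusted rate.
lam = expected_strikeouts(6.0, 4.2, 0.28)
p_over = prob_over(6.5, lam)
print(round(lam, 3), round(p_over, 3), round(1.0 / p_over, 2))  # last value: fair decimal odds
```

Comparing the fair odds from the last line against the market price is what tells you whether the model sees a discrepancy.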

And that would be a basic walk through of a prop model. There are other ways to do it, of course. You could take the mean and standard deviation for Walker Buehler's expected strikeout rate. You could run some Monte Carlo simulations of varying complexity, and track those projections for multiple pitchers against the market using log loss, RMSE, MAE, stuff like that. But that would be a basic process.
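One simple form the Monte Carlo alternative could take is sketched below. It assumes the strikeout rate is drawn from a normal distribution (clipped to the range 0 to 1) and that the number of batters faced is fixed; both are simplifying assumptions for illustration, as are all the input numbers.

```python
import random


def simulate_prob_over(line: float, mean_k_rate: float, sd_k_rate: float,
                       batters_faced: int, n_sims: int = 10000, seed: int = 42) -> float:
    """Estimate P(strikeouts > line) by simulating many starts."""
    rng = random.Random(seed)  # fixed seed so the estimate is reproducible
    overs = 0
    for _ in range(n_sims):
        # Draw this start's strikeout rate, clipped so it stays a valid probability.
        rate = min(max(rng.gauss(mean_k_rate, sd_k_rate), 0.0), 1.0)
        # Each batter faced is an independent strikeout / no-strikeout trial.
        strikeouts = sum(1 for _ in range(batters_faced) if rng.random() < rate)
        if strikeouts > line:
            overs += 1
    return overs / n_sims


# Hypothetical inputs: 28% mean rate, 4% standard deviation, 25 batters faced.
print(simulate_prob_over(6.5, 0.28, 0.04, 25))
```

A fuller simulation would also randomise the number of batters faced, since how deep the pitcher goes into the game is itself uncertain.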

Ben:

I think with the law school and the betting, Andrew, you might have to look into becoming a lecturer. Thanks. That was really great. I'm sure the listeners found that very useful. Thanks for that. What I'd like to know then is, you get these outputs and your model gives you something of a probability, where you're then looking at a discrepancy between what you derive and what the market actually posts. I'm guessing here that you don't then go straight into using Kelly or whatever it might be to start betting. Are you suggesting people back-test against historical data sets of odds, or is it small stakes to start with to test things out? What's the next step from there?

Andrew Mack:

Well, yeah you definitely want to do all of the above. I would say for the love of God, don't go full Kelly immediately when you find a discrepancy. That's a recipe for disaster, for a number of reasons, not the least of which is that full Kelly underestimates the probability of ruin. And if you burn out your entire bankroll, you are effectively finished for the time being as a sports bettor. So that needs to be kept in the forefront of your mind at all times. You definitely want to track this, and there are a number of ways to track it. Obviously, [there are] lots of different arguments about the efficacy of closing line value and how predictive it is.

What I think is most useful for people to know is that closing line value is a metric that can provide you with a better estimation of long-term profit potential and, this is the key, with fewer trials than profit or ROI. So, if you don't want to back test it on an entire season of props and collect all that data and whatever, you could get a pretty good idea of how well you might do by tracking the CLV while paper trading on a smaller number of trials, which can make the CLV very handy for that. How you actually want to track it is somewhat a matter of personal preference.

You certainly want to consider log loss, because obviously we're creating our own probabilities and then we're comparing them to the market's probabilities, so whose probabilities are better is something that we definitely want to know, which log loss can help with. If you would prefer, you can also look at the mean absolute error, which would tell you who's doing a better job of actually predicting the number of strikeouts, or the count data in this case. And that's true whether it's a point spread or a total or, in this case, number of strikeouts. So there's a number of ways that you can track it.
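For readers who want to compute the two tracking metrics just mentioned, here is a minimal Python sketch of binary log loss and mean absolute error. The probabilities, outcomes and strikeout counts in the usage lines are invented for the example; lower is better for both metrics.

```python
import math


def log_loss(probs, outcomes):
    """Average binary log loss. probs are predicted probabilities of the event
    (for example, the over); outcomes are 1 if it happened and 0 if it did not."""
    eps = 1e-15  # clip to avoid log(0) on overconfident predictions
    total = 0.0
    for p, y in zip(probs, outcomes):
        p = min(max(p, eps), 1.0 - eps)
        total += -(y * math.log(p) + (1 - y) * math.log(1.0 - p))
    return total / len(probs)


def mean_absolute_error(predictions, actuals):
    """Average absolute gap between predicted and actual counts."""
    return sum(abs(p - a) for p, a in zip(predictions, actuals)) / len(predictions)


# Hypothetical comparison of a model against market-implied probabilities.
outcomes = [1, 0, 1, 1]
model_probs = [0.60, 0.40, 0.55, 0.70]
market_probs = [0.52, 0.50, 0.50, 0.55]
print(log_loss(model_probs, outcomes) < log_loss(market_probs, outcomes))

# Hypothetical projected versus actual strikeouts.
print(mean_absolute_error([7.1, 5.4, 6.0], [8, 5, 4]))
```

Four games prove nothing, of course; the comparison only becomes meaningful over a large out-of-sample set.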

But you definitely should track it to see if you have a reasonable expectation of profit. And if you do, then it's time to start looking into, okay: total bankroll size, some variation of fractional Kelly, like quarter Kelly or half Kelly, or if you're not totally confident after testing it and tracking it, you may want to use flat staking. But whatever you use, you definitely want to track it once you think that you have something that might work.
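The fractional Kelly staking mentioned above can be sketched with the standard Kelly formula for a single bet at decimal odds. The bankroll, probability and odds in the example are hypothetical, and the quarter-Kelly default is just one of the options named in the discussion.

```python
def kelly_fraction(prob: float, decimal_odds: float) -> float:
    """Full-Kelly fraction of bankroll for a bet at the given decimal odds.
    Returns 0.0 when the model sees no edge."""
    b = decimal_odds - 1.0          # net profit per unit staked on a win
    q = 1.0 - prob                  # probability of losing
    return max((b * prob - q) / b, 0.0)


def stake(bankroll: float, prob: float, decimal_odds: float,
          kelly_multiplier: float = 0.25) -> float:
    """Stake under fractional Kelly (0.25 = quarter Kelly, 0.5 = half Kelly)."""
    return bankroll * kelly_multiplier * kelly_fraction(prob, decimal_odds)


# Hypothetical bet: the model says 55% at even money (decimal odds 2.00).
print(kelly_fraction(0.55, 2.00))        # full Kelly fraction
print(stake(1000.0, 0.55, 2.00))         # quarter-Kelly stake on a 1,000 bankroll
```

Because the Kelly fraction is only as reliable as the probability fed into it, scaling it down is a hedge against the model's own estimation error.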

Ben:

And then with the CLV side of things, and I think you spoke a little bit earlier about these derivative markets that you're personally attacking because you believe that they're inefficient and things like that. With that, is that because you're then using that to measure, obviously you don't want to be using ROI, but is it closing line value is the only thing that increases your sample size to test your relative success?

Andrew Mack:

I would say that what I like about CLV is that it can give me a better picture of how the model is likely to do in the future with a smaller sample size. So if, for example, you were going to just paper trade a month of games, doing that while tracking CLV should theoretically give you a better snapshot of how this model is likely to do as opposed to just ROI or profit, where you would need literally thousands of trials in order to get a p-value that was sufficiently low to reject the null hypothesis. So I think that CLV is useful in that respect. And I know people have very strong opinions about these things. You see all [sorts of] wild arguments on Twitter about that.
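One common way to put a number on the closing line value discussed here is to compare the decimal odds taken with the closing decimal odds. The sketch below uses that definition; the odds are invented, and in practice many bettors first strip the bookmaker's margin from the closing price, which this simplified version does not do.

```python
def clv(bet_odds: float, closing_odds: float) -> float:
    """Closing line value as the fraction by which the price taken beats the close.
    Positive means the bet was placed at better odds than the closing line."""
    return bet_odds / closing_odds - 1.0


def average_clv(bets) -> float:
    """Mean CLV over a list of (bet_odds, closing_odds) pairs."""
    return sum(clv(taken, close) for taken, close in bets) / len(bets)


# Hypothetical paper trades: (decimal odds taken, closing decimal odds).
paper_trades = [(2.10, 2.00), (1.95, 1.91), (1.80, 1.87)]
print(round(clv(2.10, 2.00), 3))
print(round(average_clv(paper_trades), 4))
```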

Ben:

So I guess for those specific ones we could say maybe a benchmark, not necessarily this Holy grail of the line for success in betting.

Andrew Mack:

For a lot of the markets, point spreads and totals, even if it's in quarters or halves, I really like mean absolute error. I like to see that I have a model that maybe is beating the book on them in terms of mean absolute error. So, for example, I recently created a model and I don't want to talk too much about it, but it took about 60 hours to build and it's got about a 0.45 mean absolute error advantage over a certain book that's not Pinnacle. And that's something that I definitely like to see. Especially for something like a point spread or total, I think that MAE is very useful. Although again, you can't just take 10 games and compare the MAE and then decide that you've got the magic wand. That's not going to work. If you're going to use something like MAE or log loss, you want to have a large number of out-of-sample trials in order to make sure that whatever you're seeing is statistically significant. Because you can have a model that looks really great for the first 200 bets or the first 500 bets. But whether or not it still looks great after 5,000 bets is a very different story - frequently.

Ben:

And so far we've been fairly positive and I think the whole point of this podcast is to educate people about the modelling side of things, but there are obviously people out there that are not so positive or not so fond of using that approach. And I mean fair play to any of those who've listened to an hour of this podcast and got to this point, but if anyone is listening and says the classic “This sport isn't played on a spreadsheet” or whatever it might be, I think there's NFL coaches out there that have very recently been saying, “Oh, we don't trust the analytics guys as much as we should”. If you're put in a room with that person and you've got, say, one minute to pitch your case, what do you say to someone that thinks like that?

Andrew Mack:

Well, I guess, if that's the criticism, I guess I would say that the beautiful thing about science is you don't have to believe in it for it to be true. It's not a belief system and it's not a faith system. It's not about whether you trust it or not. Analytics, in terms of “you should go for it on fourth down here” or something, in whatever situation the game may be currently in, they're not belief systems. They are things that are objectively, demonstrably true. And so I would say that, for coaches and people that are interested in objectively, demonstrably winning, you might want to look at things that can increase that likelihood, which are also demonstrably true.

I would also say that lots of these criticisms are justified and need to be talked about, and at least considered. When it comes to modelling, the main criticism that I think you hear a lot is that this model can't add anything to the market line. The market contains so much information about this game that it's very unlikely that your model adds anything different. That's a serious criticism and it's something that needs to be taken seriously. There's a few caveats that should probably be mentioned as well. For a basic model, I think that that is largely true, although you can alleviate that to some extent with techniques like the bender boost that I talk about in my book and other market ratings.

This is a real concern especially for basic models, but certainly this is not always true. It's not always true for sophisticated models and it's certainly not true for all games. Not every game is equally efficient to every other game. And so I think that what you can see is that while the market on the aggregate is very efficient as a whole, individual games can be inefficient. And the whole idea of modelling is to be able to try and identify those games and to maximize the advantage that you have by singling out those games. I guess the other thing that I would mention is that the market, the book line and a modeller don't share the exact same objective, which is a beautiful thing that makes profitable sports bettors possible.

The bookmaker, Pinnacle, is not necessarily trying to have the exact probability of the game. Obviously, that's a large part of what they do because they’re such a sharp book. But they also have to manage risk, and that risk necessarily entails dealing with the proclivities of the market. And if the market overwhelmingly thinks that one side versus the other is more likely to win and bets more on that, they are going to be incentivized to adjust their line and to move the market price, even if it disagrees with what they themselves may think is a more accurate probability.

And so, “you can't add anything to the market line”: on the aggregate, for a basic model, yeah, that's a valid concern. For every game, all the time, even using a sophisticated model? I don't think that that's true.

Ben:

And I think you suggested earlier, but do you think there's room in, if you're taking an approach with modelling your sports betting, is there room for intuition? Can you react or jump on something before your model has a chance to adapt?

Andrew Mack:

Sometimes. If you pay a lot of attention to line moves and the changing marketplace, and you know your model very well, I think that there are times where you can look at the market pricing and, without even looking at the model, you can say, “Oh, that looks very interesting.” Now I wouldn't bet on that without actually running it through a model. But there are definitely times where I can look at the slate of games for a day and the prices and I can say that one looks really interesting. I really want to investigate that particular game. And so, I don't know if you'd really call that intuition or not or if that's just the benefit of some experience. But I definitely think that there is maybe an element of that, but certainly I would not do all of this modelling work if I was just going to get a shot at these games like I did seven years ago.

Ben:

Right. I've got a couple of … there's a couple of final questions I want to ask you. One of them just have to do with education. You've obviously taken the step to write a book to try and help educate people, Pinnacle’s really big on education and the articles we put out there, and we are seeing a lot more educational content being put out. And what that's leading to is this more, it certainly seems like a more, educated betting audience and the fact that we were doing a podcast about modelling and sports betting shows how far things have come. Do you think from a personal perspective or experience, do you think this approach is catching up with bookmakers and they're struggling to handle the way people are betting nowadays?

Andrew Mack:

That's a really interesting question. Like I said before, modelling against the market is very much an arms race. It's never a process that's finished. It's just always a question of relative strength and as long … so you always have to try and keep your relative strength as high as possible against a rising tide of increasingly more sophisticated competition. I don't know if bookmakers are really struggling to keep up. A lot of bookmakers have sort of a recreational model where if it appears to them like you have a reasonable expectation of profit, they're going to limit you or close your account or whatever. Are they struggling with this?

No. And on the opposite side, a bookie like Pinnacle, I doubt that they're struggling with this either, because they incorporate the information of their sharpest bettors into the line. And so, I think that Marco Blume, in a conference talk that he gave, referred to the sharp bettors as consultants, which I found very interesting and insightful, very instructive in terms of the relationship that Pinnacle conceptualizes with the sharp bettors at their site. But it doesn't in any way suggest to me that they are struggling to keep up with this because I think that they're able to use that and incorporate it into their trading algorithms to manage the risk the way that they best see fit.

And so does it mean that inefficiencies are getting smaller? In many ways, yes. I think that's true. I think that you are seeing the expected edges on a lot of larger markets getting smaller and getting harder to keep up [with]. But I don't think that the bookies, either very recreational bookies or very sharp bookies, would be particularly struggling. Although I should say that I don't have any access to that information, so that's just speculation on my part.

Ben:

Final one for you, Andrew, then, is what does the future have in store? Are you going to carry on down this legal sector route or is the goal to go full time with betting once you've completed your studies? What's the plan?

Andrew Mack:

Well, right now, I'm leaning towards betting. I'm taking a run at doing this professionally. I think that when law school is finished and I'm able to do this every day, that I have a reasonable expectation of profit in enough different markets and with a large enough bankroll that I think I've got a decent shot at this. So I think that's what I'm going to do. At the end of the day I can probably make more betting on sports than I can in my first couple of years of being a practicing lawyer, and I don't know if that's maybe a sad statement on the current legal market economy in Canada or what - but not only does it seem like I could probably do better at this, it's something that I really enjoy.

Like, I don't mind putting in 12-hour days if that's what it takes with sports betting. And so that's definitely where I'm … the way that I'm leaning. A couple of things are going to have to happen for that to work out for me. I'm definitely going to have to pursue higher levels of automation and I'm also going to have to look to increasing my betting volume substantially, which I think is why I'm spending so much time on perfecting my models and still betting, of course, but [I’m] really focusing on the modelling work right now because I'd like to have as many tools in my arsenal [as possible] so that in about eight months' time here I'll be in a good position to move forward.

Ben:

Best of luck. I think that's all we have time for today. You’re clearly very busy. So for myself and the listeners I want to say thanks for coming on to share your insight, and helping us learn more about everything to do with modelling and sports betting.

Andrew Mack:

Well, it's been a pleasure, Ben. Thank you for having me.

Ben:

And Andrew's book, Statistical Sports Models in Excel, is available on Amazon, and you can follow him on Twitter using what I assume is a Wu-Tang Clan inspired handle, @Gingefacekiller. Is that right, Andrew?

Andrew Mack:

That's right.

Ben:

There we go. And you can keep up to date with Pinnacle on Twitter by following @pinnacle. If you want to learn more about what we discussed in today's podcast and anything to do with betting in general, then head to the Betting Resources section on the Pinnacle website. Thanks for listening and bye for now.
