Lately it seems I've been specialising in blogs on topics that I've covered before, and tonight's blog is no exception. It's on estimating the probability of a draw.
In the previous blog on this topic I approached the issue by treating each of the games between 1999 and 2010 as either a draw or a non-draw and then fitted a binary logit to this 0/1 variable with regressors of team MARS Ratings or Implicit Bookmaker Home Team Probabilities (which were Overround-Equalising probabilities as it happens, not that I made the distinction then).
Today my approach will be a little different. In the most-recent blog in this journal I presented a highly accurate Margin Predictor based on the Log Probability Score Optimised Bookmaker Probabilities, which was:
Predicted Home Team Margin = 19.83089 x ln(Prob(Home Team win)/(1-Prob(Home Team win)))
where Prob(Home Team win) = 1/Home Team price - 1.0281%
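For anyone who wants to play along at home, here is a minimal Python sketch of the two formulae above. The function and constant names are my own inventions for illustration, and the three prices in the loop are made-up examples rather than real market prices.

```python
import math

# Constants taken from the two formulae above.
LPSO_ADJUSTMENT = 0.010281      # the 1.0281% subtracted from 1/price
MARGIN_COEFFICIENT = 19.83089   # multiplier on the log-odds


def home_probability(home_price: float) -> float:
    """Implicit Home Team probability from the Home Team price."""
    return 1.0 / home_price - LPSO_ADJUSTMENT


def predicted_margin(home_prob: float) -> float:
    """Predicted Home Team margin from the Home Team probability."""
    return MARGIN_COEFFICIENT * math.log(home_prob / (1.0 - home_prob))


# Illustrative prices only (hypothetical, not from any actual game).
for price in (1.25, 1.90, 4.00):
    prob = home_probability(price)
    print(f"price ${price:.2f}: prob {prob:.3f}, margin {predicted_margin(prob):+.1f}")
```

A Home Team probability of exactly 50% gives a predicted margin of zero, and the margin grows in magnitude as the probability moves towards 0% or 100%.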
If I use this Predictor for all games from 2007 to 2012 and then calculate the Actual Home Team Margin less the Predicted Home Team Margin for every game, I wind up with a distribution that looks Normalesque (mixed, perhaps, with a dash of witch's hat).
The distribution has mean -0.16 points per game, a standard deviation of 36.8 points per game, skewness of -0.015 and excess kurtosis of +0.217. It's a little too peaked to be a poster-child Normal distribution and there are some shenanigans going on in the tails, as evidenced in the QQ-plot on the right, but both the Shapiro-Wilk and the Jarque-Bera normality tests fail to reject the null hypothesis that the data comes from a Normal distribution.
So, at worst we can say that it's not not a Normal. I'll continue by assuming that it's near-enough one for our purposes.
If that is the case then we can work as though:
Actual Margin ~ Normal(Mean = Predicted Margin - 0.16, SD = 36.8)
and the probability of a draw in any single game will then be given by:
Probability of a Draw = Probability(-0.5 < Actual Margin < 0.5)
which we can estimate using the CDF for the Normal distribution.
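Here's a minimal sketch of that calculation in Python, using only the standard library. It assumes the residual mean of -0.16 points and standard deviation of 36.8 points reported for the Actual less Predicted Margin distribution; the function name and its default arguments are my own choices for illustration.

```python
import math
from statistics import NormalDist


def draw_probability(home_prob: float, bias: float = -0.16, sd: float = 36.8) -> float:
    """Estimated probability that the Actual Margin lies in (-0.5, 0.5).

    bias and sd are the mean and standard deviation of
    Actual Margin - Predicted Margin (assumed values, taken from the text).
    """
    predicted = 19.83089 * math.log(home_prob / (1.0 - home_prob))
    margin_dist = NormalDist(mu=predicted + bias, sigma=sd)
    return margin_dist.cdf(0.5) - margin_dist.cdf(-0.5)


# Draw probability across a range of Home Team probabilities.
for p in (0.10, 0.30, 0.50, 0.70, 0.90):
    print(f"Prob(Home win) = {p:.2f}: Prob(draw) = {draw_probability(p):.4%}")
```

Because the Normal density is so flat over a one-point interval when the standard deviation is nearly 37 points, the result is very close to the density at zero multiplied by the interval width of one point.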
Since the Predicted Margin is a function of the Bookmaker's Implicit Home Team Probability, the estimated probability of a draw is too, so we can plot the probability of a draw as a function of the Bookmaker's Implicit Probability, or as a function of the Home Team price, since we can derive one from the other. I've plotted these two relationships below:
The probability peaks at about 1.08%, roughly where the Home and Away teams are equal-favourites, as you'd expect. It falls away either side of equal-favouritism, reaching about 0.5% - or 200/1 odds - for probabilities around 10% and 90% for the Home team.
Over the period 2007 to 2012 there have been 14 draws in 1,144 games, which is a rate of 1.22%, well above the maximum estimated probability of a draw in any one game. In fact, if you sum the estimated probability of a draw across all 1,144 games you get just over 10, on which basis we've had 4 more draws than we would have expected.
It could be the case then that the probability estimates produced by the approach I've outlined here are a fraction low or, alternatively, the estimates could be accurate and we've simply witnessed a few more draws than we would have expected given the actual distribution of Home Team prices across the 1,144 games.
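One rough way to gauge the second possibility is to ask how often 14 or more draws would turn up when about 10 are expected. The sketch below treats the number of draws as Poisson with a mean of exactly 10. Both the Poisson approximation and the round figure of 10 are my assumptions for illustration (the sum of the estimates is "just over 10"), so treat the answer as indicative only.

```python
import math


def poisson_tail(k_min: int, mean: float) -> float:
    """P(X >= k_min) for X ~ Poisson(mean)."""
    below = sum(
        math.exp(-mean) * mean ** k / math.factorial(k)
        for k in range(k_min)
    )
    return 1.0 - below


# Chance of 14 or more draws if about 10 are expected (assumed mean of 10).
print(f"P(14 or more draws) = {poisson_tail(14, 10.0):.3f}")
```

A Poisson count is a reasonable stand-in here because each game's draw probability is tiny and there are many games, but it does ignore the game-to-game variation in those probabilities.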
One thing I can rule out is that the assumption of Normality is causing any underestimation. As a separate exercise, the details of which I won't go into here, I fitted a purely empirical cumulative distribution function (CDF) to the values of Actual Margin - Predicted Margin and then used this empirical CDF to estimate the probability of a draw in each of the 1,144 games (in essence by estimating how often the difference between the Actual and Predicted Margin in all games had been within half a point of the negative of the Predicted Margin for the particular game, which is what a draw requires). This yielded an expected number of draws of 10.5, still 3.5 fewer than we observed.
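As a loose illustration of the idea, and not a reproduction of the exercise itself, the empirical approach might be sketched like this. The residuals and predicted margins below are synthetic, generated purely so the code runs; the real exercise would use the 1,144 observed values instead, and the function name is hypothetical.

```python
import random


def empirical_draw_probability(residuals, predicted):
    """Share of observed residuals that would produce a draw.

    A draw needs predicted + residual to fall inside (-0.5, 0.5).
    """
    hits = sum(1 for r in residuals if -0.5 < predicted + r < 0.5)
    return hits / len(residuals)


# SYNTHETIC stand-in data (assumption): not the actual game results.
random.seed(0)
synthetic_residuals = [random.gauss(-0.16, 36.8) for _ in range(1144)]
synthetic_predictions = [random.gauss(0.0, 20.0) for _ in range(1144)]

# Expected number of draws = sum of per-game draw probabilities.
expected_draws = sum(
    empirical_draw_probability(synthetic_residuals, m)
    for m in synthetic_predictions
)
print(f"Expected draws on synthetic data: {expected_draws:.1f}")
```

No distributional shape is imposed here: each game's estimate is just the proportion of all residuals that would have landed the Actual Margin within half a point of zero.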
On balance then, I'd favour the view that we've just had a few more draws than we might reasonably have expected - which makes the usual TAB price of $51 for a draw seem all the more unattractive.