Simulating the Head-to-Head MAFL Fund Algorithm

Over the past few months in this journal we've been exploring the results of simulations produced using the five parameter model I first described in this blog post. In all of these posts the punter that we've been simulating has generated her home team probability assessments independently of the bookmaker; statistically speaking, her assessments are uncorrelated with those of the bookmaker she faces.

Head-to-Head Probabilities: Historical Performance

In this blog, however, I want to simulate our Head-to-Head Fund's probability-generating algorithm, which uses as one of its inputs the TAB Sportsbet Bookmaker's probability assessments. As such it's inevitable that its probability assessments will be correlated, probably highly, with the bookmaker's. To find out exactly how correlated they are I ran the Head-to-Head (H2H) algorithm for every game from Round 1 of 2006 to Round 10 of 2011 to produce the results in the following table.

The numbers on the left provide the correlations we're after, which I've calculated for each season separately, for all seasons combined, and split by whether the game finished as a home team loss (or draw) or a home team win. Generally, the correlations are all in the range of about 0.80 to 0.90, the notable exception being the correlation for this year for games resulting in a home team loss (or draw) where the correlation has been only 0.69. 

The correlation has been relatively low this year for home team wins too, resulting in an overall correlation between H2H's probability assessments and the TAB Sportsbet Bookmaker's of just 0.82. There's no obvious reason I can come up with for this lower-than-average correlation - which certainly hasn't proved to be a source of profit - so for now I'm putting it down to statistical variation.

In the middle section of the table are the probability scores for the TAB Sportsbet Bookmaker, for H2H Unadjusted, and for H2H Adjusted (ie H2H with the maximum allowable positive difference between the H2H home team probability assessment and the bookmaker's assessment capped at 25%). It's apparent from this data how relatively high the bookmaker's probability scoring has been this season. This has been due to the relatively large number of victories by short-priced favourites - a team rated an 80% chance by a probability tipster provides a probability score of 1+log(0.8) or 0.678 when it wins; one rated a 60% chance produces a probability score of just 1+log(0.6) or 0.263 (remember that the logs are base 2 for the probability score metric I'm using).
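The probability score metric described above is easy to sketch in code. This is just an illustration of the arithmetic, assuming (as the text states) base-2 logs and a score of 1 plus the log of the probability assigned to the actual result:

```python
import math

def probability_score(p_home: float, home_won: bool) -> float:
    """Per-game probability score: 1 plus the base-2 log of the
    probability assigned to the actual result."""
    p_result = p_home if home_won else 1 - p_home
    return 1 + math.log2(p_result)

# A winning 80% favourite scores about 0.678; a winning 60% chance only 0.263.
print(round(probability_score(0.80, True), 3))  # 0.678
print(round(probability_score(0.60, True), 3))  # 0.263
```

This makes it clear why a season of short-priced favourites saluting inflates the bookmaker's probability score: each such win contributes a score much closer to the maximum of 1.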

On the far right are the ROIs produced from various forms of Kelly-staking. The first column applies to Kelly-staking home teams only using the unadjusted form of H2H probability assessments, the second to Kelly-staking away teams only using those same probability assessments, and the third column applies to Kelly-staking home teams only using the adjusted form of H2H probability assessments with the added restriction that home teams priced at over $5 are never wagered on (ie this third column is how MAFL's current Head-to-Head Fund operates).

A few things are evident from this section. Firstly, the H2H probability assessments generally perform poorly for away teams, though they have generated a profit so far in 2011, so they're not useless always and everywhere. Also, if you compare the third column with the first you can gauge the efficacy of the adjustments made to H2H's raw probabilities before they're turned into wagers. For every season except 2006 (which I discount because the algorithm was forced to use what I consider to be less reliable 2005 data for its predictions in that season) and the current season these adjustments have increased ROI. More on this topic a bit later.

Across all seasons, even including 2006, the H2H algorithm, adjusted or unadjusted, has produced a positive ROI when wagering on home teams only.

Modelling the Head-to-Head Probability Assessments

Time, once again, for a dose of Eureqa, this time to derive the relationship between the home team probability assessments of the TAB Sportsbet Bookmaker and those of the Head-to-Head algorithm.

What we come up with is (approximately) the following:

Expected Head-to-Head Home Team Probability = 1/(5 + 4 x Bookie Implicit Home Team Probability^2 - 8 x Bookie Implicit Home Team Probability)

This equation explains about 78% of the variability in H2H's probability assessments. Here's what it looks like:

The blue line is the equation and the red dots are the actual Head-to-Head probabilities. Note that, for the range of bookmaker probabilities actually observed, the fitted equation never drops below 20% (although a small number of actual H2H probabilities do) and that the fitted home team probability is always above the bookmaker's home team probability. In this sense, the fitted equation suggests that the H2H algorithm has a natural home team bias.
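The two properties just noted can be checked directly from the fitted equation. A quick numerical sweep (over a grid standing in for the observed range of bookmaker probabilities) confirms that the fitted value never drops below 20% and never falls below the bookmaker's own probability, touching it at 50%:

```python
def fitted_h2h_prob(p_bookie: float) -> float:
    """The (approximate) fitted relationship from the Eureqa run above."""
    return 1 / (5 + 4 * p_bookie**2 - 8 * p_bookie)

# Sweep a grid of bookmaker probabilities from 1% to 99%.
grid = [i / 100 for i in range(1, 100)]
assert all(fitted_h2h_prob(p) >= 0.20 for p in grid)
assert all(fitted_h2h_prob(p) >= p - 1e-9 for p in grid)

print(round(fitted_h2h_prob(0.5), 3))  # 0.5 - the curve meets the bookmaker at 50%
print(round(fitted_h2h_prob(0.9), 3))  # 0.962 - well above, i.e. a home team lean
```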

What then of the errors - the difference between the fitted equation and the actual Head-to-Head algorithm probabilities? It turns out that they have the following characteristics: 

  • If we split the errors for the 1,003 games into two groups, one for those games in which the home team won and the other for those games in which the home team lost or drew, we find that each group of errors is Normally distributed, with slightly different means. I won't even bother to show the charts of the errors because the differences between the distribution of these errors and those of a Normal distribution with the same mean and standard deviation are barely evident. Basically the errors are something that a Normal CDF would be proud to call its own.
  • The errors for games in which the home team lost or drew are distributed as a Normal with mean -0.02% and standard deviation 10.5%.
  • The errors for games in which the home team won are distributed as a Normal with mean +0.89% and standard deviation 10.5%.

So, when the home team wins, the actual probability assessments of the H2H model are about 0.91% higher than the fitted equation would suggest, and we already know that the fitted equation is somewhat biased towards the home team relative to the bookmaker's assessments. In short, the H2H model tends to be more confident about home team victories in those games that the home team wins; that's a good thing and is an important aspect for us to include in any simulation of the H2H algorithm.

The home team bias has not, however, been present in H2H's probability assessments in every season, and has been especially absent this season, as the following table hints at.

Here we see that, though the average prediction errors for 2006, 2008 and 2009 were all negative, this season has been exceptional in this regard with the fitted equation over-estimating the true Head-to-Head algorithm probabilities by almost 3.5% points per game.

What's also been different so far this season is the standard deviation of those probability assessments about the fitted equation - it's been almost 12% compared to the all-season average of just 10.5%.

So Start Simulating Already

Here's the setup for this latest round of simulation: 

  • We'll simulate 1,000 seasons of 185 games duration, each season replicated 250 times.
  • For each season we'll choose a bookmaker sigma of between 2.5% and 7.5% and a bookmaker home team bias of between -5% and +5%, each modelled as random uniform variables.
  • To select the home team probability for each game in a season we'll use the cumulative distribution function that I've derived from historical results (see this blog) plus a season-specific home team bias of between -5% and +5% to model the fact that home teams can be, on average, weaker or stronger in different seasons, season 2011 being a case in point.
  • The bookmaker's probability assessment for each game will be modelled as a random variable from a Normal distribution with a mean equal to the true home team probability plus the bookmaker's bias for the season, and a standard deviation equal to the bookmaker's sigma for the season.
  • We'll constrain the bookmaker's probability assessments to lie between 1% and 99% and then convert these probability assessments into market prices using an overround, which is modelled as a random uniform variable that takes on values between 105% and 108% for any given season.
  • To generate the punter's home team probability assessment for a given game we'll generate a random Normal with
    • mean equal to the bookmaker's implicit home team probability, derived from the market prices it offers, minus 0.02% if the home team loses the simulated game or plus 0.89% if the home team wins the simulated game
    • standard deviation equal to 10.5% for every simulated game 
    • (In the simulated world we have the advantage that we can use the actual result as input to the simulated pre-game probability assessments. Ah, if only we had this luxury IRL.)
  • We'll also constrain the punter probabilities to lie between 1% and 99%.
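The steps above can be sketched as a single-season simulator. This is only an illustrative rendering of the setup: the historical home-probability CDF isn't reproduced here, so a Beta(5, 4) draw stands in for it, and exactly how the overround is levied on the home team's price is my assumption.

```python
import random

def simulate_season(n_games=185, rng=random):
    """One simulated season following the setup described above.
    A Beta(5, 4) draw is a placeholder for the historical CDF of
    true home team probabilities."""
    bookie_sigma = rng.uniform(0.025, 0.075)     # 2.5% to 7.5%
    bookie_bias = rng.uniform(-0.05, 0.05)       # -5% to +5%
    season_home_bias = rng.uniform(-0.05, 0.05)  # season-level home strength
    overround = rng.uniform(1.05, 1.08)          # 105% to 108%

    def clip(p):
        return min(0.99, max(0.01, p))

    games = []
    for _ in range(n_games):
        # Stand-in for the historical CDF of true home-team probabilities.
        p_true = clip(rng.betavariate(5, 4) + season_home_bias)
        # Bookmaker's noisy, possibly biased assessment, clipped to [1%, 99%].
        p_bookie = clip(rng.gauss(p_true + bookie_bias, bookie_sigma))
        # Convert to a market price (overround assumed applied to the home side).
        home_price = 1 / (p_bookie * overround)
        home_won = rng.random() < p_true
        # Punter: centred on the bookmaker's implicit (price-derived) probability,
        # shifted by the result-conditional error means, with sd 10.5%.
        mean_shift = 0.0089 if home_won else -0.0002
        p_implicit = clip(1 / home_price)
        p_punter = clip(rng.gauss(p_implicit + mean_shift, 0.105))
        games.append((p_punter, home_price, home_won))
    return games
```

Each season returned is a list of (punter probability, home price, result) triples, which is all that's needed to score any of the wagering strategies.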

For each simulated season we'll record the wagering performance of the punter if she punts using the raw probability outputs and if she punts using these probability outputs but with the additional constraints used for the H2H Fund. We'll also track Kelly-staking and Level-staking.
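Scoring a simulated season both ways might look like the following sketch. It uses a simplified, non-compounding treatment of Kelly (stakes as fractions of a fixed unit bankroll rather than a bankroll that grows and shrinks), which is enough for comparing ROIs:

```python
def season_roi(games, level_stake=1.0):
    """Return (kelly_roi, level_roi) for a list of (p_punter, price, home_won)
    triples, betting only when the punter sees positive expected value."""
    kelly_out = kelly_ret = level_out = level_ret = 0.0
    for p, price, won in games:
        if p * price <= 1:  # no assessed edge: no bet under either scheme
            continue
        kf = (p * price - 1) / (price - 1)  # Kelly fraction of a unit bankroll
        kelly_out += kf
        level_out += level_stake
        if won:
            kelly_ret += kf * price
            level_ret += level_stake * price
    kelly_roi = kelly_ret / kelly_out - 1 if kelly_out else 0.0
    level_roi = level_ret / level_out - 1 if level_out else 0.0
    return kelly_roi, level_roi
```

Restricting to home teams only, or layering on the Fund's adjustment rules, just changes which triples are passed in (or how the punter probability is derived) before scoring.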

To start with let's look at the proportion of simulations for which differing wagering strategies turn a profit:

  • If we wager on both Home and Away teams whenever the price is right, Kelly-staking is profitable in 37% of the simulated seasons and Level-staking is profitable in 53%.
  • If, instead, we wager only on Home teams when the price being offered is deemed to be profitable, Kelly-staking is now profitable in 39% of simulated seasons and Level-staking is profitable in 44%.
  • Finally, if instead we wager only on Home teams when the price being offered is deemed to be profitable and we apply the adjustment rules used by the Head-to-Head Fund in MAFL for 2011, Kelly-staking is now profitable in only 29% of simulated seasons. (I didn't store the results for Level-staking using this approach, more's the pity.)

These are interesting results, especially in the context of current MAFL Head-to-Head Fund practice which appears to be utilising a sub-optimal strategy if these simulations are an accurate indicator of the wagering environments that we could face armed with the H2H algorithm. It might be, however, that a strategy that produces profits less often nonetheless tends to produce larger profits on average.

To investigate this possibility I fitted least-squares regressions to the ROI for the five strategies, in each case using two regressors that would be available to us in practice and not just in the context of a simulation, namely the overround and the difference between the bookie's probability score and that of the H2H algorithm.

For all the models we see that a 1% point increase in overround reduces expected ROI by between 0.7% and 0.85% points, much as we found in the What's 1% of Overround Worth? blog over on the Statistical Analyses journal. Level-staking appears to be slightly more affected by changes in overround than is Kelly-staking.

Note that the fit of all models is acceptable, except for Level-staking in all games, where it drops to only just over 50%. For this wagering strategy, probability score differences and overround fail to account for a sizeable chunk of the variability in ROI from season to season.

Perhaps the best way to get a handle on what these fitted models mean is to plot fitted values holding overround fixed at 107% - roughly the historical TAB Sportsbet average - and systematically vary the difference in the probability scores within the range that was produced across the simulated seasons.

If we plan to wager on every game then Kelly-staking is preferable to Level-staking for differences in the probability scores of about 0.035 per game or less.

Potentially more lucrative would be switching to wagering only on Home teams, in which case Kelly-staking is only more profitable - in the sense of reducing the magnitude of the loss - than Level-staking when the difference in probability scores is 0.070 or more. If the difference in probability scores is small enough to make wagering, on average, profitable, Level-staking is preferable to Kelly-staking.

Imposing the adjustments on H2H probabilities and wagering only on Home teams appears to be a generally inadvisable strategy since, for every reasonable value of the difference in probability scores, at least one of the four other strategies is superior.

Let's take another look then at the realised history for the three of these strategies that involve wagering on Home teams only, and let's add in the strategy of Level-staking using the adjusted H2H probabilities:

(I've again included the 2006 results for completeness only; I ignore them in the commentary that follows.)

The strategy that's currently being used for the MAFL Head-to-Head Fund produces superior returns in 2009 only. Level-staking with adjustments is superior in the remaining full-season cases, that is, in 2007, 2008 and 2010, while for 2011 (up to and including Round 10), Level-staking using unadjusted probabilities has been superior.

Looking solely at Kelly-staking, using adjusted H2H has been superior to unadjusted H2H for every season except the current one.

Summarising

Simulations of Home team only wagering using the H2H algorithm suggest that Level-staking will, on average, be more profitable than Kelly-staking whenever the algorithm's probability assessments are sufficiently accurate to allow a profit to be produced. These simulations also suggest that adjusting H2H probability estimates and precluding wagers on Home teams priced over $5 will, on average, lead to lower returns from Kelly-staking.

Actual performance results are more equivocal, but generally support the current practice of Kelly-staking adjusted probabilities with the $5 cap. Nonetheless, we should continue to monitor the respective performance of a range of wagering strategies using the H2H algorithm outputs.