The Predictability of Game Margins

In a recent blog post I described how the results of games in 2013 have been more predictable than game results from previous seasons in the sense that the final victory margins have been, on average, closer to what you'd have expected them to be based on a reasonably constructed predictive model. In short, teams have this year won by margins closer to what an informed observer, like a Bookmaker, would have expected.

Last year, in a similar vein, I postulated that the final margins for games involving mismatched opponents might be intrinsically more difficult to predict (i.e. more variable about an informed observer's expected value) than the margins for other games. I dismissed that hypothesis in a very simple fashion by looking at the correlation between handicap-adjusted game margins and the bookmaker's handicap. The low value of this correlation suggested that any "excess" margin was linearly unrelated to the size of the pre-game handicap.


Statistically, "errors" - that is, the unexplainable portions of the game margins we observe - whose variance differs from one game to the next are called heteroskedastic, and the presence of such errors has implications for the linear models that we build to predict game margins. Specifically it means that the model coefficients we estimate, while unbiased, are less precise than they would be in the absence of heteroskedasticity. Put another way, heteroskedasticity reduces the efficiency with which we extract information from the available data.

Time then to formally test for the presence of heteroskedasticity and to see if we can model its drivers. To do this I need to:

  • Fit a model to be my "informed observer", which describes actual game outcomes as a function of relevant variables, much as we did in the previous blog (and in many others besides). Such a model will provide our estimate of the expected margin for each game and allow us to, by subtraction, estimate the "error" or "residual" in the actual game margin
  • Test the (squared) residuals from such a model to investigate whether or not they are statistically (linearly) related to the expected margin or to any other specific variable

I fitted two models to the margins for all games from Round 1 of 2006 to Round 17 of 2013, to act as my "informed observer":

  1. Game margin as a function of team MARS Ratings and the Interstate Status of the clash
  2. Game margin as a function of the TAB Bookmaker's Implicit Risk-Equalising home team probability only

For both models I tested whether the variance of the model residuals was related, firstly, to the model's fitted values, using the ncvTest function from the car package in R. The null hypothesis that the residuals' variance is constant could not be rejected for either model. So, there's no strong evidence that the variability of game margins about their expected values differs - strictly speaking, differs linearly - with respect to the size of that expected game margin.
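For readers who want to see what a test of this kind computes, here is a minimal Python sketch of a score test for non-constant variance against the fitted values of a one-regressor model. It is not the R code used for this post: the function names, the simulated data and the seed are all illustrative assumptions, and the calculation follows my understanding of what ncvTest does (regress the scaled squared residuals on the fitted values and halve the explained sum of squares).

```python
import math
import random

def fit_line(x, y):
    """Ordinary least squares for y = a + b*x; returns (a, b)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    b = sxy / sxx
    return my - b * mx, b

def score_test_vs_fitted(x, y):
    """Score test of constant error variance against the fitted values.
    Returns (chi-square statistic, p-value) on 1 degree of freedom."""
    n = len(x)
    a, b = fit_line(x, y)
    fitted = [a + b * xi for xi in x]
    resid = [yi - fi for yi, fi in zip(y, fitted)]
    sigma2 = sum(e * e for e in resid) / n      # residual variance, divisor n
    u = [e * e / sigma2 for e in resid]         # scaled squared residuals
    a2, b2 = fit_line(fitted, u)                # auxiliary regression of u on fitted
    u_bar = sum(u) / n
    explained = sum((a2 + b2 * f - u_bar) ** 2 for f in fitted)
    stat = explained / 2.0
    p_value = math.erfc(math.sqrt(stat / 2.0))  # upper tail of chi-square(1)
    return stat, p_value

# Simulated (not real) games: margin depends linearly on a home team
# probability, with the same error spread for every game.
rng = random.Random(2013)
probs = [rng.uniform(0.1, 0.9) for _ in range(1500)]
margins = [-60 + 120 * p + rng.gauss(0, 36) for p in probs]
stat, p_value = score_test_vs_fitted(probs, margins)
print(f"chi-square = {stat:.2f}, p = {p_value:.3f}")
```

A large p-value here is read the same way as in the text: no evidence that the spread of the residuals changes with the size of the expected margin.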

While investigating R's resources for testing for heteroskedasticity, I also came across the bptest function from the lmtest package, which allowed me to test for the existence of a linear relationship between the (squared) model residuals and any of the regressors used in the model - or indeed with any other variable we can come up with that was not used as a regressor but that might drive heteroskedasticity. Mindful of that earlier post demonstrating the apparent empirical reduction in the variability of game outcomes about their expected values in season 2013 in particular, I thought that Season would be an interesting variable against which to test for heteroskedasticity.

The bptest results for both models again failed to reject the null hypothesis of homoskedasticity, which means that there's no strong statistical evidence of heteroskedasticity relative to any of the regressor variables or to Season, either. So, despite the apparently substantial decline in the variability of game margins in 2013, in the context of the period from 2006 to the present, that decline is not yet statistically significant.
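The same idea can be sketched for a variable that is not a regressor in the model, such as Season. The Python below is an illustration only, not the bptest call used here: it assumes Season is treated as a single numeric variable, computes the studentized (Koenker) form of the Breusch-Pagan statistic as n times the R-squared from regressing the squared residuals on that variable, and uses simulated residuals. All names are hypothetical.

```python
import math
import random

def studentized_bp(resid, z):
    """Studentized Breusch-Pagan statistic for one candidate variable z:
    n * R^2 from regressing squared residuals on z (with an intercept).
    Returns (statistic, p-value) using a chi-square with 1 df."""
    n = len(resid)
    sq = [e * e for e in resid]
    mz, ms = sum(z) / n, sum(sq) / n
    szz = sum((zi - mz) ** 2 for zi in z)
    szs = sum((zi - mz) * (si - ms) for zi, si in zip(z, sq))
    sss = sum((si - ms) ** 2 for si in sq)
    r_squared = (szs * szs) / (szz * sss)   # squared correlation of z and e^2
    stat = n * r_squared
    return stat, math.erfc(math.sqrt(stat / 2.0))

# Simulated (not real) residuals: 200 games per season, identical spread
# in every season from 2006 to 2013.
rng = random.Random(17)
seasons, residuals = [], []
for season in range(2006, 2014):
    for _ in range(200):
        seasons.append(season)
        residuals.append(rng.gauss(0, 36))

stat, p_value = studentized_bp(residuals, seasons)
print(f"BP = {stat:.2f}, p = {p_value:.3f}")
```

Because the statistic only picks up a linear trend in the squared residuals, a drop in variability confined to one season at the end of the period needs to be quite large before it registers.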


As esoteric as this result might seem, it's fairly profound. What it tells us is that, statistically speaking, we can feel confident about applying a single least-squares regression model to the problem of fitting the margin outcome of any game across the period 2006 to the present, regardless of (say) the relative skills of the competing teams - confident at least in terms of the assumption we make in fitting such a model that the residuals are homoskedastic.

In simpler terms, what it tells us is that, for example, when a team MARS Rated at 1,050 meets a team MARS Rated at 975, such a contest is no more likely to produce a game margin far in excess of (or far below) its expected value than is a game pitting two teams MARS Rated at 1,000.

This is an assumption that I've been making implicitly for some time, but it's comforting to have its validity confirmed statistically. It's nice to be both unbiased and efficient.


During the course of performing the analysis for this blog I, as mentioned earlier, came across the car package. As well as offering the ncvTest function to formally test for heteroskedasticity, this package provides a residualPlots function, which allows us to search for nonlinear relationships between the variables we've chosen as regressors and the target variable we're trying to fit. This is important too, since when we fit an ordinary least-squares model to data, as well as assuming that the errors are homoskedastic we implicitly assume that the relationship between each regressor and the target variable is linear in form.

Using this function provided me with an excellent way of detecting a potential problem with using untransformed Bookmaker-derived probabilities, which I'll demonstrate in relation to using the Risk-Equalising version of such probabilities.

Before I do that though, here are the residual plots for a better-behaved model using only MARS Ratings and Interstate Status as regressors.

For this model all looks well, with the residuals showing no apparent relationship with any of the regressors nor with the model's fitted values. (The residualPlots function provides some formal statistical tests for the curvature of the relationships shown here and none suggests that the curvature in any instance is cause for alarm. There might appear to be some non-linearity with respect to the Interstate Status variable, but bear in mind that the number of cases where this variable takes on a value of -1 - that is, where the nominal home team is playing in the home state of the nominal away team - is very small.)
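One common form of curvature test, and as far as I understand it the kind reported for numeric regressors by residualPlots, is to add the square of a regressor to the model and look at the t-statistic on that squared term. The Python sketch below illustrates this for a one-regressor model only. The function names and simulated data are assumptions, and the p-value uses a normal approximation rather than the exact t distribution.

```python
import math
import random

def residualize(v, x):
    """Residuals of v after an OLS fit on an intercept and x."""
    n = len(x)
    mx, mv = sum(x) / n, sum(v) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    b = sum((xi - mx) * (vi - mv) for xi, vi in zip(x, v)) / sxx
    return [vi - mv - b * (xi - mx) for xi, vi in zip(x, v)]

def curvature_test(x, y):
    """t-statistic and approximate two-sided p-value for the x^2 term in
    y = a + b*x + c*x^2, obtained by partialling out the linear fit."""
    n = len(x)
    e = residualize(y, x)                      # residuals of the linear model
    q = residualize([xi * xi for xi in x], x)  # part of x^2 not explained by x
    sqq = sum(qi * qi for qi in q)
    c = sum(qi * ei for qi, ei in zip(q, e)) / sqq
    rss = sum((ei - c * qi) ** 2 for qi, ei in zip(q, e))
    se = math.sqrt(rss / (n - 3) / sqq)        # 3 estimated coefficients
    t = c / se
    return t, math.erfc(abs(t) / math.sqrt(2.0))

# Simulated (not real) games with a purely linear relationship.
rng = random.Random(17)
probs = [rng.uniform(0.1, 0.9) for _ in range(1500)]
margins = [-60 + 120 * p + rng.gauss(0, 36) for p in probs]
t_stat, p_value = curvature_test(probs, margins)
print(f"t = {t_stat:.2f}, p = {p_value:.3f}")
```

A t-statistic far from zero would suggest that a straight line in that regressor is leaving systematic curvature in the residuals.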

Now this model has an adjusted R-squared of about 32.5%, which is acceptable, but it turns out that we can do a little better (32.7%) using the Risk-Equalising variant of the TAB Bookmaker's Implicit home team probability.

The residual plot for this model, however, shows some non-trivial levels of curvature, suggesting that using the Risk-Equalising variant of the Home Team Implicit Probability leads to us under-estimating the game margin (from the home team's perspective) in games where the home team is a raging favourite and also where it's a whimpering underdog.

It appears that this understatement is about 1 or 2 goals in magnitude, which could hurt you if you were looking to apply this model to line betting.

One way of addressing problems of nonlinearity such as is depicted here is to transform the regressor variable, seeking to preserve the information contained in it but linearising its relationship with the target variable.

In the previous blog - for what were, in hindsight, similar motivations - I transformed the Risk-Equalising probability by converting it to a probability density value. Here, instead, after looking at the shape of the relationship in the residual plot I chose the following, simpler transformation:

Bookie_Prob_RE_trans = Bookie_Prob_RE / (1 - (Bookie_Prob_RE - 0.5)^2)

This transformation leaves a probability of 0.5 unchanged and inflates values on either side of it by the same proportion, so the absolute increase is larger for probabilities near 1 than for those near 0.
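As a quick check of how the transformation behaves, here is a small Python sketch (the function name is my own, chosen for illustration). A probability of 0.1 becomes about 0.119 and a probability of 0.9 becomes about 1.071, so the transformed variable is no longer confined to the 0 to 1 range and should be thought of as a regressor rather than as a probability.

```python
def transform_prob(p):
    """Bookie_Prob_RE_trans = p / (1 - (p - 0.5)^2) for a probability p in [0, 1]."""
    return p / (1.0 - (p - 0.5) ** 2)

# The multiplier 1 / (1 - (p - 0.5)^2) is symmetric about 0.5, so equally
# lopsided favourites and underdogs are scaled up by the same proportion.
for p in (0.1, 0.25, 0.5, 0.75, 0.9):
    print(p, round(transform_prob(p), 4))
```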

Using this transformed version of the Risk-Equalising variable lifted the adjusted R-squared by a full percentage point to 33.6% and essentially linearised the relationship between it and the target variable, the game margin.

It did this without producing a model showing any signs of heteroskedasticity of the types tested.

All that remained was to combine the regressors from the two models I'd created so far.

This gave me the following fitted model:

Game Margin = 25.44 + 55.48 x Bookie_Prob_RE_trans + 0.3165 x Home Team MARS Rating - 0.3690 x Away Team MARS Rating + 3.426 x Interstate Status
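To make the fitted equation concrete, here is a minimal Python sketch that evaluates it for a single game. The coefficients are those quoted above. The function names are illustrative, and the coding of Interstate Status as a number is an assumption beyond the -1 case described earlier (home team playing in the away team's home state). The probability in the second example is hypothetical.

```python
def transform_prob(p):
    """Transformed Risk-Equalising probability: p / (1 - (p - 0.5)^2)."""
    return p / (1.0 - (p - 0.5) ** 2)

def predicted_margin(prob_re, home_mars, away_mars, interstate_status):
    """Expected game margin, from the home team's perspective, in points."""
    return (25.44
            + 55.48 * transform_prob(prob_re)
            + 0.3165 * home_mars
            - 0.3690 * away_mars
            + 3.426 * interstate_status)

# Two teams Rated 1,000, an even-money home team, no interstate effect.
even_game = predicted_margin(0.5, 1000, 1000, 0)
print(round(even_game, 2))  # 0.68

# A team Rated 1,050 hosting a team Rated 975, with a hypothetical
# Risk-Equalising home team probability of 0.7.
print(round(predicted_margin(0.7, 1050, 975, 0), 1))
```

The near-zero prediction for the evenly matched game is a useful sanity check on the size of the intercept relative to the other coefficients.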

This model has an adjusted R-squared of 34.6%, which is another 1 percentage point higher than the model using Bookie_Prob_RE_trans alone. All of the coefficients are statistically significant at the 10% level or higher, and its mean absolute prediction error (MAPE) for the period 2006 to the present (Round 17 of 2013) is 28.8 points per game.
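For clarity about the metric, MAPE here is simply the average of the absolute differences between actual and predicted margins, measured in points. A minimal Python sketch follows, using three made-up games rather than real results.

```python
def mean_absolute_error(actual_margins, predicted_margins):
    """Average absolute difference between actual and predicted margins."""
    errors = [abs(a - p) for a, p in zip(actual_margins, predicted_margins)]
    return sum(errors) / len(errors)

# Made-up margins for three games (home team's perspective, in points).
actual = [30, -12, 5]
predicted = [20.0, 0.0, 10.0]
print(mean_absolute_error(actual, predicted))  # 9.0
```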

Its MAPE for the current season is 25.4 points per game, which is more than 2 points per game lower than the lowest all-season average, which was 27.8 in 2009. Still, however, we cannot reject the null hypothesis of homoskedasticity with respect to Season.

And, lastly, the residual plots for this final model, and the associated curvature diagnostics, suggest that this model is an acceptable one.


For me, the key elements of this analysis are:

  • The confirmation of the homoskedasticity of errors based on expected game margins derived from the best predictive models
  • The improvement in predictive accuracy and in the efficiency of coefficient estimation that's possible by using a simple, transformed version of the Bookmaker's Implicit Home Team probability
  • A reminder of the importance of and value in reviewing residual diagnostics when assessing the quality of linear models