Just a short post tonight while we wait for the serious footy to begin.
For this blog I've again called upon the services of Eureqa, this time to find equations that predict the Home team's final victory margin (which might be negative or zero) purely as a function of the scores at the various quarter breaks.
As we've seen before, the pattern of VFL/AFL scoring has changed over the seasons, so it's unlikely that a single model will adequately explain the results across the entire span of history from 1897 to 2011. For this reason I've chosen to fit models only to the games from 1980 onwards, which represents a period of relative stability in scoring. As well, I've split the available sample roughly in half, allowing Eureqa to fit a model to only about 50% of the games, leaving the remaining 50% for the purposes of evaluating the fitted models.
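For anyone wanting to replicate the split, here's a minimal sketch of one way to do it. It assumes a simple random split, which is only a guess at the method. The function name `split_half`, the seed and the stand-in list of game identifiers are all hypothetical.

```python
import random

def split_half(games, seed=0):
    """Randomly split a list of games into a fitting half and a holdout half.

    Assumption: a plain random split; the seed only makes it repeatable.
    """
    shuffled = list(games)  # copy so the caller's list is left untouched
    random.Random(seed).shuffle(shuffled)
    midpoint = len(shuffled) // 2
    return shuffled[:midpoint], shuffled[midpoint:]

# Hypothetical identifiers standing in for the real games from 1980 onwards.
games = list(range(101))
fit_games, holdout_games = split_half(games)
print(len(fit_games), len(holdout_games))
```

With an odd number of games the holdout half ends up one game larger, which is in keeping with a "roughly in half" split.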
Fitting a Model to the Quarter Time Score
The model seeking to predict the final margin as a function of the quarter time margin is the simplest model of all and fits the equation Home Team Victory Margin = f(Home Team Margin at Quarter Time).
I fitted this equation using two loss functions, absolute error and squared error. The model that performed better on the holdout sample was the one fitted with the squared error loss function:
Home Team Victory Margin = 6.42 + 1.413 x Home Team Margin at Quarter Time
Its mean absolute prediction error (MAPE) was 29.87 points per game on the holdout sample. One way of assessing this result is to compare it to the error we'd have achieved at the start of the game by simply predicting the average Home team victory margin for every game. That would produce an MAPE of 35.35 points per game for the holdout games, some 5.48 points higher than the model just described. So, knowledge solely of the quarter time score allows us to reduce our average absolute prediction error by just under a goal (6 points), even if we know nothing else at all about the contest.
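To make the comparison concrete, here's a rough sketch of how the two MAPE figures could be computed. The margins below are made up purely for illustration, so the numbers it produces are not the results quoted above. It also assumes that the "average Home team victory margin" is taken from the fitting sample, and the helper names are hypothetical.

```python
def mean_absolute_error(predicted, actual):
    """Average of |predicted - actual| across games, in points per game."""
    return sum(abs(p - a) for p, a in zip(predicted, actual)) / len(actual)

def predict_from_quarter_time(qt_margin):
    """The fitted quarter time equation from the post."""
    return 6.42 + 1.413 * qt_margin

# Made-up data for illustration only (not real VFL/AFL results).
fit_final_margins = [10, -20, 40, 6]    # final margins in the fitting sample
holdout_qt_margins = [5, -10, 0]        # quarter time margins, holdout games
holdout_final_margins = [20, -30, 3]    # final margins, holdout games

# Baseline: predict the average fitting-sample margin for every holdout game.
average_margin = sum(fit_final_margins) / len(fit_final_margins)
baseline_predictions = [average_margin] * len(holdout_final_margins)

model_predictions = [predict_from_quarter_time(m) for m in holdout_qt_margins]

baseline_mae = mean_absolute_error(baseline_predictions, holdout_final_margins)
model_mae = mean_absolute_error(model_predictions, holdout_final_margins)
print(round(baseline_mae, 2), round(model_mae, 2))
```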
One interesting feature of this equation is that it suggests a Home team trailing by fewer than 5 points at quarter time should still be expected to prevail.
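That break-even point falls straight out of the equation: the predicted margin is positive whenever the quarter time margin exceeds -6.42 / 1.413, which is about -4.54. A small sketch of the arithmetic follows, assuming margins are whole numbers of points; the function names are hypothetical.

```python
import math

QT_INTERCEPT = 6.42
QT_SLOPE = 1.413

def predict_from_quarter_time(qt_margin):
    """The fitted quarter time equation from the post."""
    return QT_INTERCEPT + QT_SLOPE * qt_margin

def largest_deficit_still_favoured(intercept, slope):
    """Largest whole-point deficit at which the predicted margin is still positive."""
    break_even = -intercept / slope  # margin at which the prediction is exactly zero
    # Smallest whole-number margin strictly above break-even, expressed as a deficit.
    return -(math.floor(break_even) + 1)

deficit = largest_deficit_still_favoured(QT_INTERCEPT, QT_SLOPE)
print(deficit)
print(predict_from_quarter_time(-4), predict_from_quarter_time(-5))
```

A Home team down by 4 points at quarter time still has a (slightly) positive predicted margin, while one down by 5 points does not.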
Fitting a Model to the Half Time and Quarter Time Scores
Now we fit equations of the form Home Team Victory Margin = f(Home Team Margin at Quarter Time, Home Team Margin at Half Time, Change in Margin Between Half Time and Quarter Time).
That last variable in that specification allows us to assess the impact, if any, of momentum. If the eventual Home team victory margin depends not just on the margins at the breaks but also on the change in these margins, then this variable will make an appearance in the optimal result.
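As a sketch of how these three inputs might be derived from the raw scores at the breaks (the function and key names here are hypothetical, and the scores in the example are made up):

```python
def half_time_features(home_qt, away_qt, home_ht, away_ht):
    """Build the three candidate inputs from the scores at quarter time and half time."""
    qt_margin = home_qt - away_qt
    ht_margin = home_ht - away_ht
    return {
        "qt_margin": qt_margin,
        "ht_margin": ht_margin,
        # The momentum variable: how far the margin moved during the second quarter.
        "qt_to_ht_change": ht_margin - qt_margin,
    }

# Made-up example: Home trails 20 to 23 at quarter time, leads 50 to 43 at half time.
features = half_time_features(20, 23, 50, 43)
print(features)
```

Here the Home team turns a 3 point deficit into a 7 point lead, so the momentum variable takes the value +10.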
Again I fitted models using the squared error and absolute error loss functions, and in neither case did the momentum variable appear.
The better result was for the absolute error loss function and was:
Home Team Victory Margin = 4.4 + 1.3 x Home Team Margin at Half Time
The MAPE for this equation on the holdout games was 22.71 points per game, a further 7.16 point reduction from the result for the chosen quarter time model.
This equation suggests that a Home team trailing by 3 points or fewer at half time should still be favoured to win, absent any other knowledge.
Fitting a Model to the Quarter Time, Half Time and Three-Quarter Time Scores
Lastly we fit equations of the form Home Team Victory Margin = f(Home Team Margin at Quarter Time, Home Team Margin at Half Time, Home Team Margin at Three-Quarter Time, Change in Margin Between Half Time and Quarter Time, Change in Margin Between Three-Quarter Time and Quarter Time, Change in Margin Between Three-Quarter Time and Half Time).
This time we've three variables, the last three in the specification above, looking to capture the effects of momentum.
The optimal result this time came from the squared error loss function and was:
Home Team Victory Margin = 2.68 + Home Team Margin at Three-Quarter Time + 0.139 x (Home Team Margin at Three-Quarter Time - Home Team Margin at Quarter Time)
This specification yielded an MAPE on holdout games of 15.04 points per game, a further 7.67 point reduction compared to the chosen equation at half time.
Here, for the first time, we see some evidence for a momentum effect, but it isn't huge. For every 10 point change in the Home team margin between quarter time and three-quarter time - say from trailing by 3 points to leading by 7 points - the model predicts about a 1.4 point change in the final victory margin for the Home team. To provide some context for this, note that the average change in the Home team margin between quarter time and three-quarter time is +4.2 points per game, and the average absolute change in the margin between those two points in time is 20.2 points per game.
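The size of the momentum term is easy to check with a worked example. The sketch below compares two hypothetical games in which the Home team leads by 7 points at three-quarter time, one where it trailed by 3 points at quarter time and one where it already led by 7; the function name is an assumption for illustration.

```python
def predict_from_three_quarter_time(qt_margin, tqt_margin):
    """The fitted three-quarter time equation from the post."""
    return 2.68 + tqt_margin + 0.139 * (tqt_margin - qt_margin)

# Same three-quarter time lead, different quarter time margins.
with_swing = predict_from_three_quarter_time(qt_margin=-3, tqt_margin=7)
no_swing = predict_from_three_quarter_time(qt_margin=7, tqt_margin=7)

momentum_effect = with_swing - no_swing  # 0.139 x 10 point swing
print(round(with_swing, 2), round(no_swing, 2), round(momentum_effect, 2))
```

The 10 point swing adds 1.39 points to the predicted final margin, which is the "about 1.4 points" quoted above.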
(The best model using the absolute error loss function had an MAPE only 0.03 points per game worse and did not involve a momentum variable.)
My takeaway from all this is that:
- Knowledge of the score at each break alone allows quite a sizeable improvement in our ability to predict the final margin, even if we know nothing else about the contest. By the time we have the three-quarter time score we've more than halved the MAPE, from 35.35 points per game to 15.04 points per game. Even by quarter time we've lopped almost a goal from the MAPE.
- Momentum effects, whereby the final result is a function not just of margins at quarter breaks but also a function of the change in these margins, are modest. None can be found when modelling the scores at quarter time and half time, and only a moderate effect can be found when modelling the three-quarter time score.
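For reference, the MAPE figures quoted in this post can be lined up as follows. This is just arithmetic on the holdout results reported above; the variable names are mine.

```python
# Holdout MAPE (points per game) for the chosen model at each stage of the game.
mape_by_stage = [
    ("start of game", 35.35),
    ("quarter time", 29.87),
    ("half time", 22.71),
    ("three-quarter time", 15.04),
]

# Reduction in MAPE achieved at each successive break.
reductions = [
    round(earlier[1] - later[1], 2)
    for earlier, later in zip(mape_by_stage, mape_by_stage[1:])
]

# Fraction of the start-of-game MAPE that remains at three-quarter time.
share_remaining = mape_by_stage[-1][1] / mape_by_stage[0][1]

print(reductions)
print(round(share_remaining, 3))
```

The reductions come to 5.48, 7.16 and 7.67 points per game, and only about 43% of the start-of-game MAPE remains by three-quarter time.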