Sources of Surprisal: 2006 to 2014 Round 6

It's been quite a year for upsets in the AFL so far. One of the ways of quantifying just how surprising these results have been is to use surprisals, about which I've written previously on a number of occasions (see, for example, this blog on the link between surprisals and probability scores).

Briefly, the greater the surprisal associated with a result, the more surprising that result was, since higher surprisals correspond to smaller pre-game probabilities for the ultimate victor.

The chart at right shows the relationship between surprisals and the winning team's probability.
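That relationship is just the definition of surprisal: for a winner whose pre-game victory probability was p, the surprisal is -log2(p) bits. A minimal sketch in Python:

```python
import math

def surprisal(prob: float) -> float:
    """Surprisal, in bits, of an outcome whose pre-game probability was prob."""
    return -math.log2(prob)

print(surprisal(0.5))   # 1.0 bit: an even-money winner
print(surprisal(0.25))  # 2.0 bits: a rank outsider saluting is more surprising
```

Heavy favourites winning generate almost no surprisal, while long-shot victories generate several bits, which is why the relationship in the chart falls away so steeply as the winner's probability rises.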

Historically, between 2006 and Round 6 of 2014, the results of about 80% of games have been associated with surprisals in the range 0.2 to 1.5 bits. The largest surprisal figure for a single AFL game during that period was 4.46 bits, recorded in the last home-and-away round of 2012 when Gold Coast at $14 beat Carlton at $1.02. The smallest single-game surprisal figure was 0.04 bits, recorded when West Coast at $1.01 defeated GWS at $21 in Round 7 of 2012. (Note that probabilities have been inferred from TAB Bookmaker prices throughout this blog using the Risk-Equalising approach.)
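As that note says, probabilities here come from TAB prices via the Risk-Equalising approach. One common formulation of that approach, assumed in the sketch below, subtracts half of the total overround from each team's price-implied probability; under that assumption the two extreme games above reproduce the quoted figures:

```python
import math

def risk_equalising_probs(price_a: float, price_b: float) -> tuple[float, float]:
    # Price-implied probabilities, before removing the bookmaker's margin
    imp_a, imp_b = 1 / price_a, 1 / price_b
    overround = imp_a + imp_b - 1
    # Risk-Equalising assumption: each team carries half the overround
    return imp_a - overround / 2, imp_b - overround / 2

def surprisal_bits(prob: float) -> float:
    return -math.log2(prob)

# Gold Coast ($14) beat Carlton ($1.02), last home-and-away round of 2012
p_gc, _ = risk_equalising_probs(14.0, 1.02)
print(round(surprisal_bits(p_gc), 2))  # 4.46 bits

# West Coast ($1.01) beat GWS ($21), Round 7 of 2012
p_wc, _ = risk_equalising_probs(1.01, 21.0)
print(round(surprisal_bits(p_wc), 2))  # 0.04 bits
```

Both outputs match the extremes quoted above, which suggests this formulation is at least consistent with the figures in the blog.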


So, to estimate just how unusual the first six rounds of 2014 have been using a surprisals approach, let's look at the round-by-round data for the entire period since Round 1 of 2006.

We see that this season's average of 0.96 bits of surprisal per game is quite high. In fact, were the season to continue producing surprisals at this rate, 2014 would comfortably be the most surprising season since, and including, 2006. Last week's 0.92 bits per game is the highest we've seen in a Round 6 since 2007, and the 1.13 bits per game from the week before is the highest for a Round 5 since 2010 and the second-highest for that round across the whole span of history we're considering. Clearly, this year's start ranks amongst the most surprising we've seen in recent history.

Looking down the column headed All to see what might lie ahead, however, it seems that surprisals tend to decline somewhat on average as the season progresses, Finals and a few late-season spikes aside.

To investigate this hypothesis, here's the surprisal data summarised after grouping the rounds. The downward trend as seasons progress is evident, if slight and even then only true at the aggregate level. There have been a number of seasons where the average surprisals from Rounds 7 to 11 have exceeded those from Rounds 1 to 6.

Still, on balance, we'd expect the level of surprisal-generation to moderate in future weeks of this season, though that's by no means certain.


Another interesting way to slice up this historical surprisal data is by venue, to answer the question: at which ground have results been hardest to predict (ie most surprising, on average)?

The answer, surprisingly enough, at least amongst venues that have been used reasonably regularly since 2006, is Football Park, a venue that is not being used this season.

The MCG has been responsible for the next-highest average level of surprisals, although the difference between the average for this venue and those for Docklands, Subiaco, the Gabba, Stadium Australia and the SCG is not large.

Kardinia Park, by quite a long way, is the venue with the lowest average surprisals per game. That's what you get when a short-priced team keeps on winning at home.


To finish, let's review the average level of surprisals associated with the 18 teams.

Firstly, we'll cross-tabulate the data by team and by season. From this table we can claim that Essendon are the team whose games have been the most difficult to predict across the period from 2006 to the present. Their all-season average has been elevated by especially high averages in the current part-season as well as in the full seasons of 2007, 2009 and 2010. In fact, the 1.15 bits of surprisal per game they generated in 2009 is the highest single-season result for any team.

Carlton have been another team to produce spectacular single-season results, their 2012 and 2008 figures the second- and the third-largest of all teams across the period. Their all-season average is lowered only by some particularly low averages for 2006, 2007 and 2011.

At the other end of the scale, GWS have been the team producing the most predictable full-season results. They own the lowest and the second-lowest average surprisal figures of 0.21 and 0.34 bits respectively.

Other low results have been Melbourne's 0.37 in 2013 and Collingwood's 0.38 in 2011. These two results highlight the two alternative methods for generating low average surprisal figures. For the Dees this was achieved by being consistent, rank underdogs and losing, while for the Pies it was achieved by being consistent, short-priced favourites and winning. Both teams were therefore associated with games where the results tended to go to script, but it was only the Dees who were seeking another playwright.

One way to analyse the source of a team's high or low average surprisal figure is to calculate the average separately for games that the team won and for games that it lost, and to consider the relative proportions of wins and losses. Looking at the all-season results for these same two teams, we see that Collingwood's 0.84 bits all-season average is produced by a combination of a very low average when they've won (0.64 bits per game, signifying strong favouritism), which they've done often, and a very high average when they've lost (1.23 bits per game, signifying that the result was surprising), which they've done infrequently.

For Melbourne we find instead a high average when they've, occasionally, won (1.14 bits and hence surprising) and a very low average when they've, all too frequently, lost (0.46 bits and hence not surprising at all).
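As a sanity check on that decomposition, a team's all-games average is just the win-rate-weighted blend of its win and loss averages, so Collingwood's implied win rate (which isn't stated in the text) can be backed out from the three figures quoted above:

```python
def implied_win_rate(all_avg: float, win_avg: float, loss_avg: float) -> float:
    # Solve all_avg = w * win_avg + (1 - w) * loss_avg for the win rate w
    return (loss_avg - all_avg) / (loss_avg - win_avg)

# Collingwood: 0.84 bits overall, 0.64 when winning, 1.23 when losing
print(round(implied_win_rate(0.84, 0.64, 1.23), 2))  # 0.66
```

An implied win rate of about 66% is consistent with the description of Collingwood winning often and losing infrequently over the period.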

In the last table, we'll look at the average surprisals produced per game by each of the teams at different periods of the season.

Finals aside, we see that most teams follow the all-season trend we discussed earlier in that their results become more predictable as the season progresses. The more notable exceptions to this rule have been Port Adelaide, who've been most predictable in the first part of the second half of seasons; Richmond, who've started out each season being at their predictable best; the Roos, the Pies and the Eagles, who've been hardest to predict when Port Adelaide have been easiest; the Lions, who've been most unpredictable in the latter portions of the first half of seasons; and the Cats who've been most predictable when the Lions have been least predictable.


Surprisals, applied to victory probabilities derived from pre-game Bookmaker prices, continue to provide a useful method of objectively quantifying the difficulty of predicting sets of AFL game results, whether summarised by round, groups of rounds, venues, teams, or combinations of these.

Based on this method of measurement, at this stage of 2014 we might be headed for the most surprising season in recent memory.