In the previous blog here on Statistical Analysis I referred to this paper and applied its drift-free Random Walk model to the "safety" of leads in recent AFL history, finding that, to some extent, it fitted empirical data well. Specifically, we found that in those instances where the model assessed a lead as being more than 50% "safe" given its size and the time remaining, its probability assessments were well-calibrated but that, in other games, where the lead was assessed as being less than 50% "safe", these leads were actually safer than they were assessed as being. We surmised that such problematic games were likely to include a disproportionate share of contests between unequally talented teams where the stronger team led and surrendered that lead less often than they might against better-matched opposition.

In that same paper, the drift-free Random Walk model was also used to explore a variety of aspects of leads and lead changes, including:

• How many lead changes might be expected in each game
• When the "maximal" (ie largest) lead might be expected to occur
• When the final lead change might be expected to occur

I'm going to investigate all three of these phenomena in the context of AFL in this blog, drawing again on data from the games between R1 of 2008 and R23 of 2014.

The paper derives a distribution for the number of lead changes in a game between equally-matched opponents, which I've replicated below and in which:

• N is the expected number of Scoring Events in a contest, which we've previously estimated as about 50.2 for AFL.
• p is the probability of a team scoring next having scored last, which we've also estimated as being about 54%.
• m is the number of lead changes in a game, defined as the number of times that the margin reaches or crosses zero (excluding the game start where the score is 0-0). So, for example, if the home team leads by 6 and then the game is tied, that's 1 lead change, though the next score, which will result in a home team or an away team lead, does not result in an additional lead change since it does not cause the game margin to equal or cross zero again.
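
The definition of m above can be turned into a small counting routine. The sketch below is in Python and assumes that a game is available as a list of margins (home score minus away score) recorded after each Scoring Event. The function name and input format are illustrative choices, not anything taken from the paper.

```python
def count_lead_changes(margins):
    """Count lead changes in one game.

    `margins` is an assumed input format: the game margin (home minus away)
    after each scoring event, in chronological order. A lead change is
    counted each time the margin reaches or crosses zero from a non-zero
    value; the 0-0 start and the first score after a tie are not counted.
    """
    changes = 0
    prev = 0  # margin before the first score of the game
    for margin in margins:
        reached_zero = margin == 0
        crossed_zero = margin != 0 and (margin > 0) != (prev > 0)
        if prev != 0 and (reached_zero or crossed_zero):
            changes += 1
        prev = margin
    return changes
```

Using the example from the definition, a home lead of 6, then a tie, then a 1-point lead for either side is `count_lead_changes([6, 0, 1])`, which counts a single lead change.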

Using the definition of lead change just provided, the table at right provides information about the number of lead changes in games where the specified team was the Home Team. Richmond, for example, played in 19 games where the lead never changed - in other words, the team that scored first never found itself level or headed. On average, across the 77 games in which it was the Home team, there were 4 lead changes, which is more lead changes than for any other Home team.

Overall, a typical contest during the period saw 3.3 lead changes, with the greatest number of lead changes in any single game the 19 lead changes in a Sydney v Essendon contest of 2010, eventually won 89-80 by the Swans.

About 19% of all contests were won by the team that scored first without the lead ever changing, and almost 50% of all contests saw 2 or fewer lead changes. Only about 10% of games had 8 or more lead changes.

The profile of lead changes, provided in the row labelled "% of games", is quite different from the profile suggested by the model equation, which is shown in the row labelled "Model %", and which uses the values of N and p provided earlier. Compared to the model equation, we find more contests with fewer lead changes than we'd expect, which again suggests that the equally-talented teams assumption is problematic, and the prevalence of contests between mismatched teams is distorting the distribution towards fewer lead changes.
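
For readers who want a feel for what an equal-teams model implies, the sketch below simulates a persistent Random Walk in Python rather than evaluating the paper's equation. It is an approximation under stated assumptions: N is rounded to 50 Scoring Events, every Scoring Event moves the margin by one unit (ignoring the difference between goals and behinds), and the first scorer is chosen by a fair coin. All function and parameter names are hypothetical.

```python
import random
from collections import Counter


def simulate_lead_changes(n_events=50, p_repeat=0.54, rng=None):
    """Simulate one game and return its number of lead changes.

    Assumptions (not from the paper): unit-sized scores, a fair coin for
    the first scorer, and a fixed number of scoring events per game.
    """
    rng = rng or random.Random()
    margin = 0
    changes = 0
    step = rng.choice((1, -1))  # first scorer chosen at random
    for i in range(n_events):
        if i > 0 and rng.random() >= p_repeat:
            step = -step  # the other team scores next
        prev = margin
        margin += step
        # With unit steps the margin cannot cross zero without landing on it.
        if prev != 0 and margin == 0:
            changes += 1
    return changes


def lead_change_distribution(n_games=10000, n_events=50, p_repeat=0.54, seed=0):
    """Estimate P(m lead changes) by simulating `n_games` games."""
    rng = random.Random(seed)
    counts = Counter(
        simulate_lead_changes(n_events, p_repeat, rng) for _ in range(n_games)
    )
    return {m: counts[m] / n_games for m in sorted(counts)}
```

Calling `lead_change_distribution()` returns a dictionary of simulated proportions by number of lead changes, which could be set beside the "% of games" row. Because of the simplifications above, it should not be expected to match the "Model %" row exactly.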

Looking at the same data, but summarising it now by the Away team in each contest, we find that the Dons now lead the competition in terms of average lead changes per game: as the Away team, they participated in games with an average of 3.91 lead changes.

The Roos and Blues, who sit in 2nd and 3rd in this table, are especially interesting in that they're the only teams to have played in fewer than 10 contests where the lead never changed after the first score.

Combining teams' Home and Away performances reveals that Richmond, narrowly, are the team involved in games with the largest average number of lead changes, fractionally ahead of the Kangaroos, and slightly more ahead of Hawthorn and Essendon.

Conversely, GWS have played in games where lead changes have been rarest, a typical game of theirs involving only 2.3 lead changes. Melbourne is the only other team to participate in games where the average number of lead changes is less than 3.

A few teams are especially interesting in that their Home and Away games tend to produce vastly different numbers of lead changes, in particular Essendon, Carlton and Geelong, who save their see-sawing games for when they're on the road, and Richmond, Fremantle and Melbourne, who tend to be better lead-sharers at home than away.

It seems reasonable to assume that teams playing in games with fewer lead changes will be either relatively strong or relatively weak, and that those playing in games with a greater number of lead changes will be of more average ability.

To investigate that hypothesis, I've calculated the home, away and combined winning percentages for each of the teams. The results are summarised in the table at left.

We find some support for the hypothesis, since the teams with the three lowest average numbers of lead changes per game also have the lowest Combined winning percentages, and the teams with the three next-lowest average numbers of lead changes per game have the 4th highest, 6th lowest, and 5th highest winning percentages.

But, some teams are anomalous. Richmond, for example, have a winning percentage not all that different from West Coast's, but have almost 17% more lead changes per game than do the Eagles.

Hawthorn, even more dramatically, have the 2nd best winning percentage, but still manage to generate the 3rd highest average lead changes metric.
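
One hedged way to put a single number on the hypothesis discussed above is a rank correlation between each team's average lead changes per game and how far its winning percentage sits from 50%. The sketch below is plain Python, the function names are hypothetical, and no figures from the tables are reproduced here. It is an illustrative check, not the analysis used in this blog.

```python
def distance_from_even(win_pct):
    """How far a winning percentage (0-100) is from an even 50% record."""
    return abs(win_pct - 50.0)


def average_ranks(values):
    """Rank values from 1 (smallest), giving tied values their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation; inputs must each contain some variation."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / (vx * vy) ** 0.5


def spearman(x, y):
    """Spearman rank correlation, computed as Pearson on average ranks."""
    return pearson(average_ranks(x), average_ranks(y))
```

Given a list of per-team average lead changes and a matching list of `distance_from_even` values, a clearly negative `spearman` result would be consistent with the hypothesis, while anomalies such as Richmond and Hawthorn would pull it towards zero.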

The paper also derives a distribution for the time at which we might expect the "maximal" or largest lead to be established. Though the paper does not reveal how it handles the issue of multiple leads of the same size at different points in time, I've taken the latest time as the time of the largest lead in such instances.
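
The tie-breaking rule just described can be sketched in a few lines of Python. The input format is an assumption made for illustration: a chronological list of `(time_fraction, margin)` pairs, where `time_fraction` is the proportion of the game elapsed when the score was registered.

```python
def time_of_largest_lead(events):
    """Return the time at which the largest lead of the game was established.

    `events` is an assumed format: (time_fraction, margin) after each
    scoring event, in chronological order. If the same largest lead is
    reached more than once, the latest occurrence is used. Returns None
    when neither team ever leads.
    """
    best_time = None
    best_lead = 0
    for time_fraction, margin in events:
        lead = abs(margin)  # size of the lead, whichever team holds it
        # ">=" means a later lead of equal size replaces an earlier one.
        if lead > 0 and lead >= best_lead:
            best_lead = lead
            best_time = time_fraction
    return best_time
```

For example, a game in which one team leads by 12 at the 30% mark and the other team leads by 12 at the 70% mark is assigned a largest-lead time of 0.7.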

Given that treatment I find that, as the paper suggests, the distribution of times of the largest lead follows, roughly, a U-shaped distribution, though for our AFL data the largest lead is more likely to come late in the game than early. This asymmetry is probably, again, due to the prevalence of contests between mismatched teams where the stronger team runs away with the contest in the final minutes.

Summing the relevant rows of this table we can say that the largest lead comes:

• In the first 25% of the game (roughly, sometime in Q1) about 27% of the time
• In the second 25% of the game about 14% of the time
• In the third 25% of the game about 13% of the time
• In the final 25% of the game about 46% of the time
• In the first 50% of the game (roughly, the 1st Half), 41% of the time
• In the second 50% of the game, 59% of the time

(Now there's a basis for a lucrative proposition bet.)
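
The quarter and half figures above come from summing rows of the table. A minimal Python sketch of that kind of summation follows, assuming the raw input is a list of largest-lead times expressed as fractions of the game between 0 and 1. The function names are hypothetical.

```python
def share_by_quarter(times):
    """Proportion of games whose event time falls in each 25% of the game.

    `times` is an assumed input: one time per game, as a fraction of the
    game elapsed (0.0 to 1.0). A time of exactly 1.0 is placed in the
    final quarter.
    """
    if not times:
        raise ValueError("need at least one game")
    counts = [0, 0, 0, 0]
    for t in times:
        counts[min(int(t * 4), 3)] += 1
    return [c / len(times) for c in counts]


def halves(quarter_shares):
    """Combine four quarter shares into (first half, second half)."""
    q1, q2, q3, q4 = quarter_shares
    return q1 + q2, q3 + q4
```

Applied to the rounded quarter figures listed above, `halves([27, 14, 13, 46])` gives 41 and 59, matching the half-by-half percentages.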

### TIME OF FINAL LEAD CHANGE

One of the more surprising findings of the paper is that the distribution of times for the final lead change is the same as the distribution of times for the largest lead to have been established.

When we build the empirical distribution of the time of the final lead change for AFL, however, we find that it's quite different from the empirical distribution of the time of the largest lead. It's almost a mirror-image, in fact. We find that the final lead change in a game occurs in the first 5% of the contest in more than 25% of games, but that the final lead change occurs in the last 5% in only 6% of games. Yet again, this is probably the result of mismatched opponents.
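
A sketch of how the time of the final lead change might be extracted from a single game is shown below, in Python. It assumes a chronological list of `(time_fraction, margin)` pairs and uses the definition of a lead change given earlier (the margin reaching or crossing zero). How games with no lead change at all are treated in the empirical distribution is not stated here, so this sketch simply returns `None` for them.

```python
def time_of_final_lead_change(events):
    """Return the time of the last lead change in a game, or None.

    `events` is an assumed format: (time_fraction, margin) after each
    scoring event, in chronological order. A lead change occurs when the
    margin reaches or crosses zero from a non-zero value.
    """
    final_time = None
    prev = 0  # margin before the first score of the game
    for time_fraction, margin in events:
        reached_zero = margin == 0
        crossed_zero = margin != 0 and (margin > 0) != (prev > 0)
        if prev != 0 and (reached_zero or crossed_zero):
            final_time = time_fraction  # later changes overwrite earlier ones
        prev = margin
    return final_time
```

For a game that is tied at the 20% mark and never level or reversed again, the function returns 0.2.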

Summing the rows in this table we can say that the final lead change occurs:

• In the first 25% of the game (roughly, sometime in Q1) about 46% of the time
• In the second 25% of the game about 17% of the time
• In the third 25% of the game about 14% of the time
• In the final 25% of the game about 23% of the time
• In the first 50% of the game (roughly, the 1st Half), 63% of the time
• In the second 50% of the game, 37% of the time

We see, then, that it's relatively rare for the final lead change in a contest to come in the last 25% of the contest - it happens about 23% of the time - but some teams are more associated with the phenomenon than are others, as the table below reveals.

The Brisbane Lions, for example, during the period were involved in games where the final lead change happened in the last 25% of the game at a rate about 18% higher than the all-team average (ie 26.8% vs 22.7%). Playing at home, their record was even more dramatic - there they generated final lead changes in the last quarter of the game almost 35% of the time, which is about a 50% higher rate than the all-team average.
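
The "about 18% higher" and "about 50% higher" comparisons above are relative, not percentage-point, differences. A small Python sketch of the arithmetic follows; the function name is hypothetical, and 35.0 stands in for the "almost 35%" home figure, so the second result slightly overstates the true value.

```python
def relative_increase(rate, baseline):
    """Percentage by which `rate` exceeds `baseline`, in relative terms."""
    return (rate / baseline - 1.0) * 100.0


# Lions, all games, versus the all-team average (figures from the text).
overall = relative_increase(26.8, 22.7)

# Lions at home, using 35.0 as an upper bound for "almost 35%".
at_home = relative_increase(35.0, 22.7)
```

Here `overall` rounds to 18 and `at_home` comes out a little above 50, in line with the figures quoted.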

For this statistic too we might hypothesise, as we did earlier, that the weakest and strongest teams would be least likely to swap leads later in their contests.

The Lions, however, are a striking counterexample, their ability to swap leads late in a contest not at all in keeping with their 5th-worst ranking on winning percentage.

That said, we again find that the teams ranked in the bottom three positions on this metric also have the three worst winning percentages.

### SUMMARY AND CONCLUSION

In recent years, the AFL has provided games where the lead has changed, on average, 3.3 times during the course of a game, but also where the team that's scored first has won without the lead ever changing almost 19% of the time, and where the number of lead changes has been 2 or fewer almost 50% of the time.

Games in which the lead has changed at least once in the final quarter have been relatively rare, occurring only about 23% of the time, while a slightly larger proportion of games (more than 25%) have seen the final lead change transpire in the first 5% of the contest.

Also, the largest leads have tended to come later in games, as late as the last 5% of the contest in almost 1 game in 4.

All of these statistics are broadly inconsistent with the notion that games of AFL can be appropriately modelled as Random Walks with zero drift - that is, Random Walks played between teams of roughly equal abilities.

Any application of a Random Walk style approach must take into account the prevalence of mismatched teams or, in the context of the paper on which this blog is based, must cater for the fact that the bias velocity in many AFL games is non-zero.
