Winning and Losing Streaks and Their Effects on Team Scoring

The idea for this blog came in an e-mail from long-time Friend of MoS, Michael, who wondered if the absence of lengthy winning streaks by teams like Hawthorn in 2015 reflected some feature of the game of Australian Rules or of the AFL competition that rendered such streaks self-limiting in length.

We'll look at the historical record in a moment, but before then it's worth reviewing some of the probabilistic characteristics of streaks to get an understanding of their likelihood in an idealised context. In the table below I've recorded the probability of witnessing a winning streak of a specified length or longer in a 25-game season for a team with some fixed probability of victory in every game. Each row in the table is based on 100,000 simulations of a 25-game season for a team with some fixed probability in the range 10% to 90%.

From the table we can see, for example, that a team with a 75% probability of winning each game has about a 48% probability of recording at least one winning streak of 8 games or more during a 25-game season. The main insight for me in this table is that even extremely, but feasibly, dominant teams (i.e. those that might hope to be 60% favourites in every game they play) will record a winning streak of 8 games or more in only about 13% of 25-game seasons.
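As a rough illustration of how figures like these can be produced (this is a sketch, not the code used to build the table), the Python below estimates the streak probability by simulation and also computes it exactly by tracking the current streak length from game to game. The function names and the seed are my own choices for the example.

```python
import random


def simulate_streak_prob(p, k, n_games=25, n_sims=100_000, seed=0):
    """Monte Carlo estimate of P(at least one winning streak of k or more games)."""
    rng = random.Random(seed)
    hits = 0
    for _sim in range(n_sims):
        run = 0
        for _game in range(n_games):
            if rng.random() < p:
                run += 1
                if run >= k:
                    hits += 1  # streak achieved, no need to play out the season
                    break
            else:
                run = 0
    return hits / n_sims


def exact_streak_prob(p, k, n_games=25):
    """Exact value of the same probability for a fixed per-game win probability p."""
    # state[s] = P(current streak is s AND no streak of k has occurred so far)
    state = [1.0] + [0.0] * (k - 1)
    for _game in range(n_games):
        new_state = [0.0] * k
        new_state[0] = (1 - p) * sum(state)   # a loss resets the streak to 0
        for s in range(k - 1):
            new_state[s + 1] += p * state[s]  # a win extends the streak by 1
        # a win from streak k-1 completes a k-game streak and drops out of `state`
        state = new_state
    return 1 - sum(state)


if __name__ == "__main__":
    for p in (0.60, 0.75):
        exact = exact_streak_prob(p, 8)
        approx = simulate_streak_prob(p, 8, n_sims=20_000)
        print(f"p={p:.2f}  exact={exact:.3f}  simulated={approx:.3f}")
```

The exact calculation is a useful cross-check on the simulated values, since it carries no sampling error.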

Now if winning streaks are self-limiting, we might model this by reducing a team's probability of winning its (N+1)th game after winning the previous N, which would lower the probabilities shown here (at least for those streak lengths at which "winning fatigue" becomes a factor). Longer streaks would, therefore, be even less likely.
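To make the "winning fatigue" idea concrete, here is a small sketch in which the win probability is allowed to depend on the current streak. The particular fatigue rule (2 percentage points lost per win beyond the fourth) is purely hypothetical and chosen only for illustration.

```python
def streak_prob_with_fatigue(win_prob, k, n_games=25):
    """P(at least one streak of k or more wins) when the win probability
    for the next game is win_prob(s), with s the current winning streak."""
    # state[s] = P(current streak is s AND no streak of k has occurred so far)
    state = [1.0] + [0.0] * (k - 1)
    for _game in range(n_games):
        new_state = [0.0] * k
        for s, mass in enumerate(state):
            p = win_prob(s)
            new_state[0] += (1 - p) * mass    # loss: streak resets
            if s + 1 < k:
                new_state[s + 1] += p * mass  # win: streak extends
        state = new_state
    return 1 - sum(state)


def fatigued(s, base=0.75):
    """Hypothetical rule: lose 2 percentage points per win beyond the fourth."""
    return base - 0.02 * max(0, s - 4)


constant = streak_prob_with_fatigue(lambda s: 0.75, 8)
with_fatigue = streak_prob_with_fatigue(fatigued, 8)
print(f"no fatigue: {constant:.3f}   with fatigue: {with_fatigue:.3f}")
```

Under any rule of this kind, where the probability can only fall as the streak grows, the chance of a long streak is lower than in the fixed-probability case.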

In reality, teams don't have a fixed probability of victory each week, so it's impossible to definitively assess whether such a phenomenon exists empirically. One thing we can do is construct a statistical model of empirical results and use it to explore the available evidence, which is what I'll do next.

THE DATA AND THE MODEL

A simple analysis of historical data looking for self-limiting streak behaviour would calculate the winning (and losing) rates of teams in their (N+1)th game after winning (or losing) the previous N. The problem with this approach is that it ignores differences in underlying ability: teams that have recorded lengthy winning streaks are, on average, stronger teams, and so are more likely to extend those streaks than to end them (and conversely for teams with lengthy losing sequences). We need, therefore, a way to separate the effects of a team's relative ability on streak behaviour from the effects of the streak alone. In other words, we need a method for estimating a team's pre-game victory probability.
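The selection problem described above can be seen in a toy simulation. In the sketch below every team has a fixed win probability (drawn, as an assumption for illustration, uniformly between 20% and 80%) and there is no streak effect at all, yet the raw win rate still rises with the length of the current winning streak, simply because stronger teams are over-represented among those on long streaks.

```python
import random


def win_rate_after_streak(n_teams=2000, n_games=25, max_streak=5, seed=1):
    """Raw win rate in the next game, grouped by current winning streak,
    in a world with NO streak effect (each team's win probability is fixed)."""
    rng = random.Random(seed)
    wins = [0] * (max_streak + 1)
    games = [0] * (max_streak + 1)
    for _team in range(n_teams):
        p = rng.uniform(0.2, 0.8)  # hypothetical fixed ability for this team
        streak = 0
        for _game in range(n_games):
            bucket = min(streak, max_streak)  # last bucket means "max_streak or more"
            won = rng.random() < p
            games[bucket] += 1
            wins[bucket] += int(won)
            streak = streak + 1 if won else 0
    return [w / g for w, g in zip(wins, games)]


rates = win_rate_after_streak()
for length, rate in enumerate(rates):
    print(f"streak {length}: win rate {rate:.3f}")
```

Any apparent "momentum" in these raw rates is entirely an artefact of differing ability, which is why the analysis needs to control for pre-game victory probability.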

For this purpose, for recent seasons we could use bookmaker odds, but I'd like to extend the analysis back to periods for which historical odds data doesn't exist, so I'll instead be using the teams' pre-game MoSSBODS 2.0 Ratings. These Ratings, whilst not without their flaws, seem to do an acceptable job of explaining previous VFL/AFL results, as the error analyses in the blog just linked reveal.

In the first analysis I'll be using the entire VFL/AFL history and fitting a random forest to the data so that any non-linearities in the relationships can be effectively modelled. Specifically, the model I'll be fitting is a regression with a team's actual winning margin as the target variable and with the following regressors:

• Expected Winning Margin based on MoSSBODS - this is the projected margin of victory derived from the teams' pre-game MoSSBODS 2.0 Ratings and the estimated Venue Adjustment, converted from Scoring Shots into Points as described in the blog linked to above. This variable controls for the changes in a team's victory probability from game to game
• Era - a categorical variable taking on one of 12 values depending on the Year in which the game was played. The Eras are 1897-1910, 1911-1920, 1921-1930, ... , 2001-2010, 2011-2015. This variable controls for any effects that are broad and temporal in nature.
• Own Winning Streak - the number of consecutive victories prior to the current game, stretching across seasons if necessary, and broken by a loss or draw
• Opponent Winning Streak - defined identically, but for the opposition team
• Own Losing Streak - the number of consecutive losses prior to the current game, stretching across seasons if necessary, and broken by a win or draw
• Opponent Losing Streak - defined identically, but for the opposition team

Note that, for every game, a random selection was made to determine which team would be designated "Own" and which designated "Opponent".
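The streak regressors defined above can be derived from each team's chronological list of results. The sketch below shows one way of doing that. The "W"/"L"/"D" encoding and the function name are assumptions made for the example, and because streaks stretch across seasons the input is simply a team's full results history in order.

```python
def streaks_before_each_game(results):
    """Return (winning_streak, losing_streak, result) for each game, where the
    streaks are those the team held BEFORE that game was played.

    results: a sequence of "W", "L" or "D" in chronological order.
    """
    win_streak = lose_streak = 0
    rows = []
    for result in results:
        rows.append((win_streak, lose_streak, result))
        if result == "W":
            win_streak, lose_streak = win_streak + 1, 0
        elif result == "L":
            win_streak, lose_streak = 0, lose_streak + 1
        else:  # a draw breaks both kinds of streak
            win_streak = lose_streak = 0
    return rows


for row in streaks_before_each_game("WWLDLLW"):
    print(row)
```

Running the same function over the opponent's history gives the Opponent Winning Streak and Opponent Losing Streak values for the game.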

Before we review the model results, let's look firstly at the streak data in the form of a cumulative distribution function (CDF). First up we'll summarise winning streaks, where we see that about 50% of teams enter a game with a streak of zero (as we'd expect) and that about 90% of teams enter a game with a streak of 3 wins or fewer, and 98% with a streak of 7 wins or fewer.

Next, let's review losing streaks, where we again find that about 50% of teams enter a game with a zero-length losing streak, a little less than 90% with a streak of 3 losses or fewer, and 96% with a streak of 7 losses or fewer.

Further calculations reveal that the average winning streak across VFL/AFL history is 1.24 games, while the average losing streak is 1.51 games.
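For readers who want to build the same kind of summary from their own data, the sketch below computes an empirical CDF and a mean from a list of pre-game streak lengths. The sample values are made up for illustration and are not the VFL/AFL data.

```python
from collections import Counter


def streak_cdf(streaks):
    """Map each observed streak length to the proportion of cases at or below it."""
    counts = Counter(streaks)
    total = len(streaks)
    running = 0
    cdf = {}
    for length in sorted(counts):
        running += counts[length]
        cdf[length] = running / total
    return cdf


sample = [0, 0, 0, 0, 1, 1, 2, 3, 0, 5]  # made-up pre-game streak lengths
cdf = streak_cdf(sample)
mean_streak = sum(sample) / len(sample)
print(cdf, mean_streak)
```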

Once we've fitted the random forest to the data, one way of exploring the relationships it encapsulates is via a partial dependence plot, which provides an estimate of the marginal effect on the target variable (here, the actual game margin) of changing one of the regressors. The x-axis on these plots spans the range of the regressor, and the y-axis tracks the marginal effect measured in points scored.
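The calculation behind a partial dependence curve is simple enough to sketch directly: for each candidate value of one regressor, force every row in the data to take that value, predict, and average. The code below does this for any prediction function. The `toy_predict` function and the three data rows are hypothetical stand-ins for a fitted model and the real dataset, not the random forest described here.

```python
def partial_dependence(predict, rows, feature, grid):
    """Average prediction over the data with `feature` forced to each grid value."""
    curve = []
    for value in grid:
        total = 0.0
        for row in rows:
            modified = dict(row)       # copy so the original data is untouched
            modified[feature] = value
            total += predict(modified)
        curve.append((value, total / len(rows)))
    return curve


def toy_predict(row):
    """Hypothetical model: the expected margin plus 1 point per win in the
    current streak, capped at 3 wins."""
    return row["expected_margin"] + 1.0 * min(row["own_win_streak"], 3)


rows = [
    {"expected_margin": -12.0, "own_win_streak": 0},
    {"expected_margin": 5.0, "own_win_streak": 2},
    {"expected_margin": 20.0, "own_win_streak": 6},
]
curve = partial_dependence(toy_predict, rows, "own_win_streak", range(0, 6))
for value, avg_prediction in curve:
    print(value, round(avg_prediction, 2))
```

What matters when reading such a curve is how the average prediction changes along the x-axis rather than its absolute level.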

The partial dependence plot for the model appears below and reveals that:

• For most of its range, as we'd hope, there's a linear relationship between the Expected Margin and the Actual Margin. In other words, we find that, as MoSSBODS' expected margin for the game increases so too does the actual margin (ideally, if the model is well-calibrated, on a point-for-point basis).
• In some Eras Actual Margins are, on average, slightly higher than we'd expect (most notably in the current Era where they are over 1.5 points per game higher), while in other Eras they are slightly lower.
• For the most-common Own Losing Streak lengths - that is, those of length 4 or less - longer losing streaks imply lower scoring in the current game relative to the average team. For example, teams entering a game on a 4-game losing streak score, on average, 4 points less than an average team. Note that teams with a 0-game or a 1-game losing streak tend to score about equally. The second consecutive loss has a relatively large, detrimental effect on scoring in the following game, the third consecutive loss only a small effect, and the fourth a relatively large effect again. This pattern might partly reflect the tendency for successive games in the competition schedule to alternate between home and away.
• For the most-common Opponent Losing Streak lengths - again, those of length 4 or less - longer losing streaks imply higher scoring in the current game. Broadly speaking, the relationship here is the mirror image of that for Own Losing Streak lengths.
• For the most-common Opponent Winning Streak lengths, longer winning streaks imply lower scoring, on average. When facing an opponent that did not win its last game, a team can expect to score about 2 points more than on average, and this expectation drops by about 1 point for every additional game in the opponent's winning streak.
• For the most-common Own Winning Streak lengths, longer winning streaks imply higher scoring, on average. Teams that did not win their last game perform slightly less well than an average team, while teams that have a winning streak of just 1 game perform only slightly better than an average team. Extending the winning streak to 2 games has quite a large effect, adding about 2 points to a team's expected score in the next game.

There are a couple of other features of this chart worth noting. Firstly, the notches at the bottom of the plots for the continuous regressors reflect deciles for that regressor. In other words, the leftmost notch is at the value below which only 10% of cases fall, the next notch at the value below which only 20% of cases fall, and so on. Because streaks are whole numbers and many observations have the same value, we don't see 9 notches in each plot, but the main thing to recognise is that the rightmost notch reflects the value above which only 10% of cases lie.
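The collapsing of decile notches for whole-number data is easy to demonstrate. The sketch below uses a simple nearest-rank definition of a decile (an assumption on my part, as plotting software may interpolate differently) and a made-up sample of 100 streak lengths.

```python
def deciles_nearest_rank(values):
    """Return the 9 decile cut points using the nearest-rank method."""
    ordered = sorted(values)
    n = len(ordered)
    cuts = []
    for i in range(1, 10):
        rank = (i * n + 9) // 10      # ceil(i * n / 10) using integers only
        cuts.append(ordered[rank - 1])
    return cuts


# Made-up sample of 100 streak lengths, heavily concentrated at 0 and 1
sample = [0] * 50 + [1] * 25 + [2] * 12 + [3] * 6 + [4] * 4 + [5] * 3
cuts = deciles_nearest_rank(sample)
notches = sorted(set(cuts))
print(cuts)
print(notches)
```

Because several deciles fall on the same whole number, far fewer than 9 distinct notches would appear on the plot.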

Secondly, note that the ordering of the regressors in the chart reflects the relative importance of each. The raw node purity variable importance values are shown in the chart below, which suggests that the Expected Margin has by far the greatest importance, that Era is next most important, and that the four other regressors are of roughly equal and somewhat lesser importance.

So, this all-time model suggests that a team's losing and winning streaks are, to a point, self-reinforcing, in that teams with winning streaks of 2 to 4 games tend to do better than average teams and teams with losing streaks of 2 to 4 games tend to do worse than average teams. Outside that range the relationship is less straightforward, though, with slightly longer losing streaks having less of a depressing effect on future scoring (though still a negative effect) and slightly longer winning streaks having less of a positive effect on future scoring (though still a positive effect). There is, then, some evidence for a mildly self-limiting effect of extended streaks, but only in the sense described here: the impact on future scoring tends towards zero. Another way of putting this is that, as the streak extends, the team's probability of victory returns closer to what we'd expect it to be based solely on its ability relative to its opponent.

The influence of opponents' winning and losing streaks appears to be more linear in nature, with longer losing streaks portending higher scoring and longer winning streaks portending the opposite.

MODELLING RECENT HISTORY

In this final section I'll fit the same model and perform the same analysis, but only for games in the period since 1985.

Across this period, average losing streaks (1.31 vs 1.51) and average winning streaks (1.20 vs 1.24) have been shorter, with over 90% of winning streaks now of length 3 games or fewer and just under 90% of losing streaks in this same range.

The partial dependence plot for the 1985 to 2015 dataset is very similar in shape and scale for the Expected Margin, Opponent Losing and Opponent Winning Streak variables. The plot for the Own Losing Streak variable is also broadly similar in shape although suggesting a larger effect on scoring for streaks of moderate length.

Apparently more different, however, is the partial relationship between Own Winning Streak and Actual Margin, though a careful comparison reveals that the main difference is the absence of a more pronounced negative effect of very long winning streaks, which are, in any case, rare.

It's also worth noting that the Own Winning Streak variable remains the least important one, albeit only narrowly.

SUMMARY

Regardless of whether we adopt an all-time or a 1985-to-2015-only perspective, the broad conclusions are the same: a team's winning and losing streaks are mostly self-reinforcing at the shorter lengths most commonly encountered, and become self-limiting (or, perhaps more correctly, less self-reinforcing) as they reach atypically long lengths.

Opponents' winning and losing streak lengths also have an influence, with longer winning sequences dampening a team's scoring and longer losing sequences having the opposite effect.

In all cases, however, the effects of streak lengths are small, mostly altering a team's scoring prospects by no more than a few points. To give that effect size some perspective, a team moving from a +0 to a +3 point expected victory margin (assuming Normally distributed margins with a standard deviation of 36 points) increases its victory probability from 50% to only about 53%.
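The conversion from expected margin to victory probability in that last sentence can be checked with a few lines of Python, using the same assumption of Normally distributed margins with a standard deviation of 36 points. The function name is my own.

```python
from statistics import NormalDist


def win_probability(expected_margin, sd=36.0):
    """P(actual margin > 0) when margins are Normal(expected_margin, sd)."""
    return 1 - NormalDist(mu=expected_margin, sigma=sd).cdf(0)


for margin in (0, 3):
    print(margin, round(win_probability(margin), 3))
```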