Recently, in this piece on close games and blowouts, I noted somewhat in passing the decline in overall team scoring. It's a topic that's receiving not a little attention within the football community at the moment, fuelled partly by some recent low-scoring games, in particular the Dees v Lions encounter.
Taking the longer view, this decline in overall scoring has not been accompanied by a blowout in victory margins; instead, both average winning scores and average losing scores have fallen.
While mulling this over I wondered if there might be a link between scoring metrics and the likelihood of underdogs prevailing. Specifically, my hypothesis was that lower winning scores might be associated with higher rates of underdog victory because, in such games, underdogs would need to have (or to have had) fewer scoring "incidents" fall unexpectedly in their favour to tilt the result from a loss to a win.
Put very simplistically, a coin biased 60:40 in favour of heads is more likely to produce a majority of tails if you conduct your experiment over fewer tosses rather than more.
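To put some rough numbers on the coin analogy, here is a minimal sketch that computes the exact binomial probability of a tails majority for a 60:40 coin. The function name and the toss counts are my own illustrative choices, and odd numbers of tosses are used so that ties can't occur.

```python
from math import comb

def prob_tails_majority(n_tosses: int, p_tails: float = 0.4) -> float:
    """Probability of strictly more tails than heads in n_tosses independent tosses."""
    min_tails = n_tosses // 2 + 1  # smallest count that is a strict majority
    return sum(
        comb(n_tosses, k) * p_tails**k * (1 - p_tails) ** (n_tosses - k)
        for k in range(min_tails, n_tosses + 1)
    )

# Illustrative toss counts only: the chance of a tails majority shrinks as n grows
for n in (5, 15, 45):
    print(n, round(prob_tails_majority(n), 3))
```

The analogy with football is loose, of course, since scoring incidents are neither independent nor of equal value, but the direction of the effect is the point.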
To test the hypothesis we'll review the data from all games during the period commencing in Round 1 of 2006 and ending in Round 16 of 2015, using the TAB head-to-head market data that I've collected for that period to determine which team was the underdog and to what degree.
For this first analysis we'll tabulate the winning rate of underdogs for groups of games defined by:
- the underdog team's levels of underdoggedness (we've discussed this before - if favourites enjoy favouritism, what do underdogs endure?)
- the winning score in the game
The class boundaries used for the table were defined in such a way that each column sums to about 180 games and each row to about 450 games. No cell has fewer than 29 games and none has more than 75 games.
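For anyone wanting to build a similar table from their own data, the tabulation might be sketched as below. This is only an assumption-laden illustration: the bin edges are placeholders loosely based on figures mentioned in this post rather than the actual class boundaries, the function name is hypothetical, and the sample games are invented purely to show the mechanics.

```python
from bisect import bisect_right
from collections import defaultdict

# Placeholder bin edges (assumptions, not the class boundaries used for the table)
PROB_EDGES = [0.20, 0.30, 0.40]   # underdog implied probability
SCORE_EDGES = [80, 103, 127]      # winning team score

def tabulate(games):
    """games: iterable of (underdog_prob, winning_score, underdog_won) tuples.

    Returns {(prob_bin, score_bin): (underdog_win_rate, game_count)}.
    """
    counts = defaultdict(lambda: [0, 0])  # cell -> [underdog wins, games]
    for prob, score, won in games:
        cell = (bisect_right(PROB_EDGES, prob), bisect_right(SCORE_EDGES, score))
        counts[cell][0] += int(won)
        counts[cell][1] += 1
    return {cell: (wins / n, n) for cell, (wins, n) in counts.items()}

# Invented records, for illustration only
sample_games = [
    (0.15, 72, True),
    (0.15, 75, False),
    (0.15, 110, False),
    (0.35, 95, True),
    (0.45, 130, False),
]
table = tabulate(sample_games)
```

Here cell (0, 0) holds underdogs priced at under 20% in games where the winning score was under 80 points.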
The first cell, for example, informs us that underdogs whose pre-game prices in the head-to-head market implied they were less than 20% chances of winning, won 31% of the time when the winning score was under 80 points. In very low-scoring games, therefore, these underdogs won at a considerably higher rate than their pre-game odds would have indicated.
As we move down the rows a general narrative emerges: underdogs fare better the lower the winning score; for any given range of winning scores, less-unfancied underdogs win more often than more-unfancied underdogs.
Very roughly speaking, it looks as though:
- underdogs with pre-game chances of less than 20% win more often than this pre-game assessment indicates in games where the winning team score is under 80 points
- underdogs with pre-game chances of 20% to 30% win more often than this pre-game assessment indicates in games where the winning team score is under 103 points
- underdogs with pre-game chances of 30% to 40% win more often than this pre-game assessment indicates in games where the winning team score is under 127 points
- underdogs with pre-game chances of greater than 40% win more often than this pre-game assessment indicates in games where the winning team score is under 140 points
This data, then, seems to support the contention that underdog success is associated with lower winning team scores. Note that we can't make definitive claims about the direction of any causal link solely on the basis of this data. We can't say, for example, that rank underdogs should seek to make their games low-scoring in order to bolster their chances of winning, as it might just be that low scores for such teams are a by-product rather than a cause of their infrequent success. Still, as Randall Munroe notes in the alt-text for XKCD cartoon #552:
"Correlation doesn't imply causation, but it does waggle its eyebrows suggestively and gesture furtively while mouthing 'look over there'."
A STATISTICAL MODEL (OR TWO)
Though the data in the earlier table seems compelling, absent a formal statistical approach we can't be entirely certain that the relationships we've spotted couldn't be merely the result of chance.
We can test whether this might be the case, formally, by building a statistical model, specifically a binary logit model of the form:
ln(Pr(Underdog Wins)/(1-Pr(Underdog Wins))) = a + b x Underdog Probability + c x Winning Score + d x Underdog Probability x Winning Score
Including the interaction term between Underdog Probability and Winning Score allows for the possibility that the underdog's log odds might not respond purely additively to Underdog Probability and to Winning Score; that is, that the effect of one might depend on the level of the other.
The fitted model has the following coefficients (with p-values in brackets):
- a = 2.805 (0.001)
- b = -6.087 (0.01)
- c = -0.051 (Tiny)
- d = 0.110 (Tiny)
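To get a feel for what these coefficients imply, here is a minimal sketch that plugs them into the logit formula and converts the log odds back to a probability. It assumes Underdog Probability is expressed as a proportion between 0 and 1 and Winning Score in points, which appears consistent with the figures in this post but is my reading rather than something stated outright. The function name and example inputs are illustrative, and the coefficients are the rounded values listed above, so outputs are approximate.

```python
from math import exp

# Rounded coefficients as reported in the list above
A, B, C, D = 2.805, -6.087, -0.051, 0.110

def underdog_win_prob(u: float, w: float) -> float:
    """Fitted probability that the underdog wins.

    u: pre-game underdog probability as a proportion (assumed scale, 0 to 1)
    w: winning team score in points
    """
    log_odds = A + B * u + C * w + D * u * w
    return 1.0 / (1.0 + exp(-log_odds))  # inverse logit

# Illustrative inputs: a 15% underdog in a low-scoring and a high-scoring game
print(round(underdog_win_prob(0.15, 75), 3))
print(round(underdog_win_prob(0.15, 120), 3))
```

For a 15% underdog in a game won with 75 points the fitted probability comes out at roughly one in three, broadly in line with the 31% in the first cell of the table.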
A little maths (designating the log odds by V, the Underdog Probability by U, and the Winning Score by W) gives us that:
- dV/dU = b + d x W = -6.09 + 0.11 x W, which is positive for all W greater than about 55
- dV/dW = c + d x U = -0.051 + 0.11 x U, which is negative for all U less than about 46%
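The two break-even points quoted above can be checked with a few lines of arithmetic. This sketch simply sets each partial derivative to zero using the rounded coefficients; the variable and function names are my own, and U is again assumed to be a proportion between 0 and 1.

```python
# Rounded coefficients from the fitted model
B, C, D = -6.087, -0.051, 0.110

def dV_dU(w: float) -> float:
    """Change in log odds per unit change in Underdog Probability, at Winning Score w."""
    return B + D * w

def dV_dW(u: float) -> float:
    """Change in log odds per extra point of Winning Score, at Underdog Probability u."""
    return C + D * u

# Solve b + d*W = 0 and c + d*U = 0 for the break-even values
w_threshold = -B / D  # winning score above which dV/dU is positive
u_threshold = -C / D  # underdog probability below which dV/dW is negative

print(round(w_threshold, 1), round(u_threshold, 3))
```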
The first result tells us that, as we'd expect, underdogs' victory chances tend to increase as their pre-game implicit probability increases (except for those rare games where the winning score is below 55 - there were only 10 such games in the sample). The second tells us that underdogs' victory chances tend to decrease as Winning Scores increase, except for underdogs that are nearly equal-favourites; it therefore confirms our original hypothesis except for a smallish subset of underdogs.
As a final check and as a way to more flexibly cater for any non-linearities in the data relationships, yet still disentangle the univariate effects, I fitted a randomForest to the data and then created partial dependence plots for the "Underdog Wins" class. These reveal that the underlying main effects are as we found above: a positive relationship between underdog winning rates and their pre-game probability, and a negative relationship between underdog winning rates and winning scores.
SUMMARY AND CONCLUSION
It seems fairly clear from the data that underdogs generally do better in games where the winning score is lower, though further thinking and analysis are required to tease out the direction in which causality runs.
If, however, it's true that lower winning scores are a cause of greater underdog success, then it might be that the trend to lower winning scores reflects the successful implementation of strategies by weaker teams to limit the scoring opportunities of their opponents (rather than strategies designed to expand their own opportunities) and, in so doing, bolster their chances of victory.
So, if you love seeing underdogs win, especially those with the slimmest of pre-game chances, you might need to put up with lower-scoring contests.