Assessing Team Abilities: Scoring Shots or Final Score?

In the previous post here on the Statistical Analyses blog we revisited the topic of Scoring Shot conversion and found that it appears to be unpredictable at a team level across entire seasons. That result, coupled with an earlier one in which we found conversion rates to be unpredictable for a team in a particular game (and with some conversations I've had on Twitter since the more recent analysis), makes it hard to reject the null hypothesis that team conversion rates are generated in a manner indistinguishable from a random variable.

Today I want to draw out what this implies for assessing the relative ability of teams based on results.

To do this, I'll be returning to the model of team scoring that I developed and fitted empirically back in August 2014, which is summarised in the set of equations at right.

Essentially, it says that teams enter contests with their relative abilities summarised by the number of scoring shots that they and their opponents are expected to produce. The actual numbers of scoring shots registered in the game are then random variables centred on these expected values (and negatively correlated, such that, should one team register more scoring shots than expected, the other is likely to register fewer). Those scoring shots are then converted into goals with a probability centred on 53% but varying from shot to shot around this average.
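As a rough sketch of this kind of scoring process - not the fitted model itself, since the actual parameter values appear in the equations image - a single game might be simulated as follows. The standard deviation and correlation values here are illustrative placeholders, and I'm using the standard AFL scoring of 6 points per goal and 1 per behind.

```python
import numpy as np

rng = np.random.default_rng(42)

def simulate_game(exp_shots_home, exp_shots_away,
                  shot_sd=4.0, shot_corr=-0.3, p_goal=0.53):
    """Simulate one game under a toy version of the scoring model.

    shot_sd and shot_corr are illustrative assumptions, not the
    fitted parameter values from the original August 2014 model.
    """
    # Correlated scoring-shot counts centred on the expected values
    mean = [exp_shots_home, exp_shots_away]
    cov = [[shot_sd ** 2, shot_corr * shot_sd ** 2],
           [shot_corr * shot_sd ** 2, shot_sd ** 2]]
    shots = np.maximum(rng.multivariate_normal(mean, cov).round(), 0).astype(int)

    # Each scoring shot becomes a goal (6 points) with probability ~53%,
    # otherwise a behind (1 point)
    goals = rng.binomial(shots, p_goal)
    scores = 6 * goals + (shots - goals)
    return shots, scores
```

Any continuous distribution truncated at zero would serve the same illustrative purpose here; the essential features are the negative correlation between the teams' shot counts and the per-shot conversion step.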

I'm going to use this model to generate scores for teams of varying relative abilities and ask a simple question: what's more likely, that the better team registers more points or that the better team registers more scoring shots?

Generating simulations using this scoring model requires only two inputs: the expected number of scoring shots for each of the teams. (There are other parameters in the underlying model, but I'm going to use their fitted values, which are the values you see in the equations above. This means I'm implicitly assuming the home team is the favourite throughout the simulations that follow, but the results are unlikely to be sensitive to this assumption.)

We'll consider a range of scoring shot inputs for the Favourite or "better" team, defined as the team expected to generate more scoring shots, and for the Underdog.

The results for 10,000 replicates of each pair of {Favourite expected scoring shots, Underdog expected scoring shots} are summarised in the heat map below. Each entry provides the difference between the probability that the Favourite registers more scoring shots and the probability that the Favourite wins (in both cases with draws counted as half-wins).
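A single cell of such a heat map could be estimated along these lines - again a sketch using assumed rather than fitted parameter values, with draws counted as half-wins as in the analysis above:

```python
import numpy as np

rng = np.random.default_rng(0)

def prob_differential(exp_fav, exp_dog, n_reps=10_000,
                      shot_sd=4.0, shot_corr=-0.3, p_goal=0.53):
    """Estimate P(Favourite registers more scoring shots) minus
    P(Favourite registers more points), with draws as half-wins.
    shot_sd and shot_corr are illustrative assumptions."""
    mean = [exp_fav, exp_dog]
    cov = [[shot_sd ** 2, shot_corr * shot_sd ** 2],
           [shot_corr * shot_sd ** 2, shot_sd ** 2]]
    # n_reps simulated games: correlated shot counts, then conversion
    shots = np.maximum(
        rng.multivariate_normal(mean, cov, size=n_reps).round(), 0).astype(int)
    goals = rng.binomial(shots, p_goal)
    scores = 6 * goals + (shots - goals)

    def win_rate(a, b):
        # Draws count as half a win
        return np.mean((a > b) + 0.5 * (a == b))

    return win_rate(shots[:, 0], shots[:, 1]) - win_rate(scores[:, 0], scores[:, 1])
```

Looping this function over a grid of {Favourite, Underdog} expected scoring shot pairs would produce the entries of the heat map.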

So, for example, the +0.1% figure in the upper left results from the fact that, across the 10,000 replicates with the Favourite having 40 expected scoring shots to the Underdog's 15, the Favourite won 99.8% of the time but registered more scoring shots 99.95% of the time.

The key finding from this analysis is that every cell in the chart above is positive, which means that, regardless of the difference in ability between the Favourite and the Underdog (measured in terms of expected scoring shots), the Favourite is always more likely to register more scoring shots than the Underdog than it is to register more points than the Underdog.

Put another way, in general we get a better idea of the relative abilities of teams by looking at whether or not they generate more scoring shots than their opponents than we do by looking at whether or not they outscore them. This, I think, is one of the reasons why MoSSBODS - which uses only scoring shots and not scores - appears to be an effective Team Rating System.

The chart also reveals that the probability differential is largest (in percentage point terms) when the Favourite is expected to register about 5 to 10 more scoring shots than the Underdog. When the difference is much larger than this, the Favourite's probability of outscoring the Underdog becomes so close to 100% that little room remains for a scoring shot approach to more reliably reflect the teams' relative abilities.

When the difference in expected scoring shot production is much smaller, the variability in the scoring shot differential becomes larger, making it a less reliable indicator of the teams' relative abilities - though still a more reliable indicator than the final score. 
The simple conclusion from this analysis is that, assuming you accept the empirical scoring model is a reasonable description of the real-world process that generates AFL results, more information is revealed about team abilities by their relative scoring shot performance than by whether or not they won or lost.

That's not to say there have never been games where the team registering more scoring shots has, palpably, been the less able team on the day; only that, over the long term, better teams will more often register more scoring shots than their weaker opponents than they will actually defeat them on the scoreboard. The difference is not huge - generally about 2 to 4 percentage points - but it is universally positive.