November 04, 2016

Strength of Schedule Assessments: A Quick Numerical Comparison

November 04, 2016/ Tony Corke

With FMI today posting its assessment of the 2017 AFL draw, we now have (at least) the following comparable analyses:

October 27, 2016

The 2017 AFL Draw: Difficulty and Distortion Dissected

October 27, 2016/ Tony Corke

I've seen it written that the best blog posts are self-contained. But as this is the third year in a row where I've used essentially the same methodology for analysing the AFL draw for the upcoming season, I'm not going to repeat the methodological details here. Instead, I'll politely refer you to this post from last year, and, probably more relevantly, this one from the year before if you're curious about that kind of thing. Call me lazy - but at least this year you're getting the blog post in October rather than in November or December.

September 29, 2016

Classifying Grand Finals (A Reprise)

September 29, 2016/ Tony Corke

(This piece originally appeared in the Guardian, and revisits the topic of defining a typology for Grand Finals, which I first looked at in 2009 where I came up with a similar solution, and again in 2014 where I used a fuzzy clustering approach.)

For fans, even casual ones, AFL Grand Finals are special, and each etches its own unique, defining legacy on the collective football memory.

September 08, 2016

What Makes Finals Different from Games in the Home and Away Season?

September 08, 2016/ Tony Corke

This week we’ll be entering what promises to be one of the most closely-contested Finals series in recent years. If you believe the bookmakers’ assessments, each of the top four teams has at least a 15% chance of snaring the Flag, and Adelaide and West Coast both have chances of around half that.

September 02, 2016

The Finalists of 2016: A Recent Historical Perspective

September 02, 2016/ Tony Corke

With a week to go before the Finals commence, what better way to spend some of that time than reviewing this year's crop of finalists and comparing them to those of recent years?

July 16, 2016

What Proportion of Close Games Should the Better Team Win?

July 16, 2016/ Tony Corke

This week there's been a lot of talk about Hawthorn and their ability to "win the close ones", one narrative being that they are somehow able to do this more often than they "should" given that they're 5 and 0 in games finishing with a margin of under a goal this season.

July 15, 2016

Team Ratings and Conversion Rates

July 15, 2016/ Tony Corke

A number of blog posts here in the Statistical Analysis portion of the MoS website have reviewed the rates at which teams have converted Scoring Shots into goals - a metric I refer to as the "Conversion Rate".

In this post from 2014 for example, which is probably the post most similar in intent to today's, I used Beta regression to model team conversion rates:

as a function of venue, and the participating teams' pre-game bookmaker odds, venue experience, MARS Ratings, and recent conversion performance.
as a function of which teams were playing

Both models explained about 2.5 - 3% of the variability in team conversion rates, but the general absence of statistically significant coefficients in the first model meant that only tentative conclusions could be drawn from it. And, whilst some teams had statistically significant coefficients in the second model, its ongoing usefulness was dependent on an assumption that these team-by-team effects would persist across a reasonable portion of the future. We know, however, that teams go through phases of above- and below-average conversion rates, so that assumption seems dubious.

Other analyses have revealed that stronger teams generally convert at higher rates when playing weaker teams, so it's curious that the first model in that 2014 post did not have statistically significant coefficients on the MARS Ratings variable.

Maybe MoSSBODS, which provides separate offensive and defensive ratings, might help.

THE MODEL

For today's analysis we will again be employing a Beta regression (though this time with a logit link and not fitting phi as a function of the covariates), applying it to all games from the period from Round 1 of 2000 to Round 16 of 2016.

We'll use as regressors:

A team's pre-game Offensive and Defensive MoSSBODS Ratings
Their opponent's pre-game Offensive and Defensive MoSSBODS Ratings
The game venue
The (local) time of day when the game started
The month in which the game was played
The attendance at the game

(Note that the attendance and time-of-day data has been sourced from the extraordinary www.afltables.com site.)
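
For readers curious about what a Beta regression of this kind is maximising, here is a minimal sketch of the log-likelihood under the mean-precision parameterisation, with a logit link for the mean and a single constant precision (phi). It is an illustration only, not the code used for this post: the function names, the toy conversion rates and the intercept-only design matrix are all hypothetical, and the real model has many more regressors.

```python
import math

def inv_logit(eta):
    """Map a linear predictor to a mean in (0, 1) via the logit link."""
    return 1.0 / (1.0 + math.exp(-eta))

def beta_log_likelihood(y, X, coefs, phi):
    """Log-likelihood of a Beta regression with logit link and constant phi.

    y     : observed conversion rates, each strictly between 0 and 1
    X     : one row of regressor values per observation (1.0 first for the intercept)
    coefs : one coefficient per column of X
    phi   : precision parameter, not modelled as a function of the covariates
    """
    total = 0.0
    for yi, row in zip(y, X):
        mu = inv_logit(sum(c * x for c, x in zip(coefs, row)))
        shape_a = mu * phi          # first Beta shape parameter
        shape_b = (1.0 - mu) * phi  # second Beta shape parameter
        total += (math.lgamma(phi) - math.lgamma(shape_a) - math.lgamma(shape_b)
                  + (shape_a - 1.0) * math.log(yi)
                  + (shape_b - 1.0) * math.log(1.0 - yi))
    return total

# Hypothetical toy data: three team-game conversion rates, intercept-only model.
toy_rates = [0.55, 0.50, 0.60]
toy_design = [[1.0], [1.0], [1.0]]
print(beta_log_likelihood(toy_rates, toy_design, [0.2], 20.0))
```

Fitting the model amounts to searching for the coefficients and phi that make this quantity as large as possible, which in practice would be left to a statistical package.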

Now, in recent conversations I've been having on Twitter and elsewhere people have been positing that:

better teams will, on average, create better scoring shot opportunities and so will convert at higher rates than weaker teams. In particular, teams with stronger attacks playing teams with weaker defences should show heightened rates of conversion.
dew and/or wet weather will generally depress scoring, partly because it will be harder to create better scoring opportunities in the first place, and also because any opportunity will be harder to convert than it would be from the same part of the ground were the weather more conducive to long and accurate kicking.

What's appealing about including MoSSBODS ratings as regressors is that they allow us to explicitly consider the first argument above. If that contention is true, we'd expect to see a positive and significant coefficient on a team's own Offensive rating and a negative and significant coefficient on a team's opponent's Defensive rating.

On the second argument, whilst I don't have direct weather data for every game and so cannot reflect the presence or absence of rain, I can proxy for the likelihood of dew in the regression by including the variables related to the time of day that the game started and the month in which it was played.

Looking at the remaining regressors, venue is included based on earlier analyses that suggested conversion rates varied significantly around the all-ground average for some venues, and attendance is included to test the hypothesis that teams may respond positively or negatively in their conversion behaviour in the presence of larger- or smaller-than-average crowds.

THE RESULTS

Details of the fitted model appear below.

The logit formulation makes coefficient interpretation slightly tricky. We need firstly to recognise that estimates are relative to a notional "reference game", which for the model as formulated is a game played at the MCG, starting before 4:30pm and played in April.

The intercept coefficient of the model tells us that such a game, played between two teams with MoSSBODS Offensive and Defensive ratings of 0 (ie 'average' teams) would be expected to produce Conversion rates of 53.1% for both teams. We calculate that as 1/(1+exp(-0.126)).

(Strictly, we should include some value for Attendance in this calculation, but the coefficient is so small that it makes no practical difference in our estimate whether we do or don't.)
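
The intercept calculation above is just the inverse of the logit link. A small sketch, using the 0.126 intercept quoted in the text (the function and variable names are my own, and attendance is ignored as discussed):

```python
import math

def inv_logit(eta):
    """Convert a value on the logit scale into a rate between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-eta))

INTERCEPT = 0.126  # intercept coefficient quoted in the text

# Expected conversion rate in the 'reference game' between two average teams.
reference_rate = inv_logit(INTERCEPT)
print(f"{reference_rate:.1%}")  # prints 53.1%
```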

Next, let's consider the four coefficients reflecting MoSSBODS ratings variables. We find, as hypothesised, that the coefficient for a team's own Offensive rating is positive and significant, and that for their opponent's Defensive rating is negative and significant.

Their size means that, for example, a team with a +1 Scoring Shot (SS) Offensive rating and a 0 SS Defensive rating playing a team with 0 SS Defensive and Offensive ratings would be expected to convert at 53.3%, which is just 0.2 percentage points higher than the rate in the 'reference game'. This is calculated as 1/(1+exp(-(0.126+0.008))).

Strong Offensive teams will have ratings of +5 SS or even higher, in which case the estimated conversion rate would rise to just over 54%, calculated as 1/(1+exp(-(0.126+5*0.008))).

Similarly, a team facing an opponent with a +1 Scoring Shot (SS) Defensive rating and a 0 SS Offensive rating, itself having 0 SS Defensive and Offensive ratings, would be expected to convert at 52.8%, which is about 0.3 percentage points lower than the rate for the 'reference game'.

The positive and statistically significant coefficient on a team's opponent's Offensive rating is a curious result. It suggests that teams convert at a higher rate themselves when facing an opposition with a stronger Offence as compared to one with a weaker Offence. That opponent would, of course, be expected to convert at a higher-than-average rate itself, all other things being equal, so perhaps it's the case that teams themselves strive to create better scoring shot opportunities when faced with an Offensively more capable team, looking to convert less promising near-goal opportunities into better ones before taking a shot at goal.

In any case, the coefficient is only 0.004, about half the size of the coefficient on a team's own Offensive rating, and about one-third the size of that on the team's opponent's Defensive Rating, so the magnitude of the effect is relatively small.
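
To see how the ratings coefficients combine, here is a rough sketch for a game in the reference conditions (MCG, before 4:30pm, April). The intercept (0.126), own-Offence coefficient (0.008) and opponent-Offence coefficient (0.004) are the values quoted in the text. The opponent-Defence coefficient of -0.012 is an assumption, inferred from the statement that 0.004 is about one-third of its size, and the coefficient on a team's own Defensive rating is not quoted, so that term is left out.

```python
import math

INTERCEPT = 0.126     # quoted in the text
OWN_OFFENCE = 0.008   # quoted in the text
OPP_OFFENCE = 0.004   # quoted in the text
OPP_DEFENCE = -0.012  # assumption: inferred from "about one-third the size"

def predicted_rate(own_off=0.0, opp_off=0.0, opp_def=0.0):
    """Expected conversion rate in the reference conditions, ratings in Scoring Shots."""
    eta = (INTERCEPT
           + OWN_OFFENCE * own_off
           + OPP_OFFENCE * opp_off
           + OPP_DEFENCE * opp_def)
    return 1.0 / (1.0 + math.exp(-eta))

print(f"{predicted_rate(own_off=1):.1%}")  # prints 53.3%
print(f"{predicted_rate(opp_def=1):.1%}")  # prints 52.8%
print(f"{predicted_rate(opp_off=1):.1%}")  # prints 53.2%
```

Because the effects are additive on the logit scale rather than on the percentage scale, the change in conversion rate per rating point is not exactly constant, though near 53% it is close to it.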

To the venue-based variables then, where we see that three grounds have statistically significant coefficients. In absolute terms, Cazaly's Stadium's is largest, and negative, and we would expect a game played there between two 'average' teams, starting before 4:30pm in April to result in conversion rates of around 46%.

Docklands has the largest positive coefficient and there we would expect a game played between the same two teams at the same time to yield conversion rates of around 56%.
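
Working backwards, the quoted conversion rates give a feel for the size of the venue coefficients on the logit scale. The sketch below backs out approximate values from the rounded 46% and 56% figures and the 0.126 intercept. These are implied approximations only, not the fitted coefficients themselves, and the function names are my own.

```python
import math

INTERCEPT = 0.126  # quoted in the text

def logit(p):
    """Convert a rate between 0 and 1 to the logit scale."""
    return math.log(p / (1.0 - p))

def implied_venue_coefficient(rate):
    """Approximate venue effect implied by a quoted rate for two average teams."""
    return logit(rate) - INTERCEPT

cazalys = implied_venue_coefficient(0.46)    # roughly -0.29
docklands = implied_venue_coefficient(0.56)  # roughly +0.12
print(round(cazalys, 2), round(docklands, 2))
```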

The coefficients on the Time of Day variables very much support the hypothesis that games starting later tend to have lower conversion rates. For example, a game starting between 4:30pm and 7:30pm played between 'average' teams at the MCG would be expected to produce conversion rates of just over 52%. A later-starting game would be expected to produce a fractionally lower conversion rate.

Month, it transpires, is also strongly associated with variability in conversion rates, with games played in any of the months May to August expected to produce higher conversion rates than those played in April. A game between 'average' teams, at the MCG, starting before 4:30pm and taking place in any of those months would be expected to produce conversion rates of around 54%, which is almost 1 percentage point higher than would be expected for the same game in April. The Month variable then does not seem to be proxying for poorer weather.

Relatively few games in the sample were played in March (150) so, for the most part, April games were the first few games of the season. As such, the higher rates of conversion in other months might simply reflect an overall improvement in the quality and conversion of scoring shot opportunities once teams have settled into the new season.

Lastly, it turns out that attendance levels have virtually no effect on team conversion rates.

SUMMARY

It's important to interpret all of these results in the context of the model's pseudo R-squared, which is, again, around 2.5%. That means the vast majority of the variability in teams' conversion rates is unexplained by anything in the model (and, I would contend, potentially unexplainable pre-game). Any conversion rate forecasts from the model will therefore have very large error bounds. That's the nature of a measure as lumpy and variable as Conversion Rate, which can move by tens of percentage points in a single game on the basis of a few behinds becoming goals or vice versa.
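
To put some rough numbers on that lumpiness, here is a small sketch. It takes Scoring Shots to be goals plus behinds, uses a hypothetical team with 25 Scoring Shots, and, purely for illustration, treats each shot as an independent attempt converted at the reference rate of 53.1%. That independence assumption is mine and is certainly a simplification.

```python
import math

def conversion_rate(goals, behinds):
    """Proportion of Scoring Shots (goals plus behinds) that were goals."""
    return goals / (goals + behinds)

def binomial_sd(p, shots):
    """Standard deviation of a single-game conversion rate, assuming independent shots."""
    return math.sqrt(p * (1.0 - p) / shots)

# Hypothetical game: 13.12 versus 16.9 from the same 25 Scoring Shots.
print(f"{conversion_rate(13, 12):.0%}")  # prints 52%
print(f"{conversion_rate(16, 9):.0%}")   # prints 64%

# Game-to-game spread for an average team with 25 Scoring Shots.
print(f"{binomial_sd(0.531, 25):.1%}")   # prints 10.0%
```

Three behinds becoming goals moves the rate by 12 percentage points, and a spread of about 10 percentage points from chance alone dwarfs the effects of a point or less estimated by the model.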

That said, we have detected some fairly clear "signals" and can reasonably claim that conversion rates are:

Positively associated with a team's Offensive rating
Negatively associated with a team's opponent's Defensive rating
Positively associated with a team's opponent's Offensive rating
Higher (compared to the MCG) at Docklands, and lower at Cazaly's Stadium and Carrara
Lower for games starting at 4:30pm or later compared to games starting before then
Higher (relative to April) for games played between May and August
Unrelated to attendance

Taken across a large enough sample of games, it's clear that these effects do become manifest, and that they are large enough, despite the vast sea of randomness they are diluted in, to produce detectable differences.

Next year I might see if they're large enough to improve MoSSBODS score projections because, ultimately, what matters most is whether the associations we find prove to be predictively useful.

July 07, 2016

Improving MoSSBODS' Team and Total Predictions

July 07, 2016/ Tony Corke

It's rare - I think unprecedented - for me to make changes to any of the Fund algorithms during the course of a season

June 19, 2016

Goal-Kicking Accuracy After Wins and Losses : A Footnote

June 19, 2016/ Tony Corke

In the previous blog we investigated the patterns of teams' Scoring Shot conversion rates in consecutive home-and-away games, observing that teams tend to convert at higher rates after a loss, and lower rates after a win.

June 18, 2016

Goal-Kicking Accuracy After Wins and Losses

June 18, 2016/ Tony Corke

Lately I've been thinking a lot about the predictability of teams' conversion rates - that is, the proportion of their Scoring Shots that they turn into goals.

June 01, 2016

Quantifying Imbalances in the AFL Draw Across Recent History

June 01, 2016/ Tony Corke

More and more often now, I'm being offered interesting suggestions for analyses by followers on Twitter, and today's blog is another example of this.

May 16, 2016

Who's Your Team's "Bogey Team"?

May 16, 2016/ Tony Corke

With the Tigers toppling the Swans with a goal after the siren on Saturday night, one of my Twitter followers, a Swans fan, wondered if there was a way to mathematically operationalise the notion of a "bogey team" and, more importantly, were the Tigers such a team for the Swans?

April 25, 2016

Projecting the Final Aggregate Score of an AFL Game In-Running : Part II

April 25, 2016/ Tony Corke

In the previous post we looked at some simple models for projecting the final score of an AFL game based solely on the scores at the quarter-time breaks. I said there that I'd revisit that modelling, incorporating pre-game market information, providing that I could source it in large enough volume.

April 24, 2016

Projecting the Final Aggregate Score of an AFL Game In-Running

April 24, 2016/ Tony Corke

It's quarter time, you've an Unders bet with a 180.5 threshold (ie you're betting that the final aggregate score will be 180 points or fewer) and you've just seen 40 points kicked in the 1st Quarter. How comfortable should you feel with your wager?

April 08, 2016

Another Look at Team MoSSBODS Ratings History

April 08, 2016/ Tony Corke

The MoSSBODS Team Rating System, while far from perfect, seems, based on its margin predicting performance across VFL/AFL history, to be capturing something useful about the underlying abilities of teams. Which is good, because that's what it was designed to do ...

March 28, 2016

Establishing Metrics for Margin, Total Score and Team Score Predictions

March 28, 2016/ Tony Corke

These days, I reckon I know what a good margin forecaster looks like. Any person or algorithm - and I'm still at the point where I think there's a meaningful distinction to be made there - who (that?) can consistently predict margins within 30 points of the actual result is unarguably competent. That benchmark is based on the empirical performances I've seen from others and measured for my own forecasting models across the last decade of analysing football.

March 24, 2016

A Brief History of VFL/AFL Aggregate Scores

March 24, 2016/ Tony Corke

With the move into Overs/Unders wagering this season, I've taken a special interest in the history of aggregate scores over the past few weeks. In this blog we'll review that history from a team, era and team pairing viewpoint.

March 20, 2016

Start-of-Season Team Ratings and Historical Flag Prospects

March 20, 2016/ Tony Corke

For today's blog, a simple assignment: investigate the historical relationship between team MoSSBODS Ratings at the start of the season and their subsequent ability to make the Grand Final, and to win it.

March 17, 2016

The Predictability of AFL Crowds

March 17, 2016/ Tony Corke

Many AFL fans, I reckon, would have a reasonably accurate internal model of what a good, average or poor crowd might be for a given contest.

March 13, 2016

The Sunday Effect: Accuracy and Scoring Shot Variation in AFL Contests (2000 to 2015)

March 13, 2016/ Tony Corke

Lately, while waiting for the competition to generate some meaningful new data to analyse, I've been looking at the history of VFL/AFL scoring, in particular Scoring Shot generation and Conversion Rates.
