I've railed in the past about the iniquity of the so-called "unbalanced draw", that is, the fact that teams don't get a chance to play every other team at home and away over the course of the season. Here on MAFL, railing requires empirical justification, so it's time I estimated the likely impact of this imbalance.
To do this I need to estimate victory probabilities for each of the remaining games in the home-and-away season and for all those games that are "missing" from the schedule, that is, those games that would have taken place had every team played its full complement of 32 games. For these estimates I'm going to use a simple binary logit model, which provides an estimate of the probability of a home team victory given only the MARS Ratings of the participating teams. I developed it for a presentation I recently gave at a meeting of the Sydney AnalystFirst community and again at a meeting of the Sydney Users of R Forum, and it was built using data from seasons 2000 to 2008.
The table below summarises the outputs of this model for some selected values of the Home team's and the Away team's MARS Ratings.
So, for example, a Home team rated 1,020 taking on an Away team rated 1,040 has a 46% chance of victory.
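For readers who'd like to see the shape of such a model, here's a minimal Python sketch of a binary logit that maps two Ratings to a home team victory probability. The functional form (an intercept plus one term per Rating) is an assumption about how the model might be specified, and the coefficient values are placeholders chosen purely for illustration; they are not the values fitted to the 2000 to 2008 data.

```python
import math

# PLACEHOLDER coefficients, for illustration only.
# They are NOT the values fitted to the 2000-2008 data.
INTERCEPT = 0.2           # a positive value builds in some home-ground advantage
HOME_RATING_COEF = 0.02   # higher Home team Rating -> higher home win probability
AWAY_RATING_COEF = -0.02  # higher Away team Rating -> lower home win probability


def home_win_probability(home_rating, away_rating,
                         intercept=INTERCEPT,
                         home_coef=HOME_RATING_COEF,
                         away_coef=AWAY_RATING_COEF):
    """Return the modelled probability that the home team wins.

    The linear predictor is passed through the logistic function,
    so the result always lies strictly between 0 and 1.
    """
    linear_predictor = (intercept
                        + home_coef * home_rating
                        + away_coef * away_rating)
    return 1.0 / (1.0 + math.exp(-linear_predictor))


if __name__ == "__main__":
    # Same match-up as the worked example in the text; the printed figure
    # depends entirely on the placeholder coefficients above.
    print(round(home_win_probability(1020, 1040), 3))
```

With the actual fitted coefficients substituted in, a function like this would reproduce the entries in the table above.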
To apply this model to the remaining games in the schedule and to the "missing" games, we need to decide what Rating to use for each team. It seems fairest, since we don't know when the "missing" games would have been played during the season, to use each team's average MARS Rating for the season so far.
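The two ingredients here, the list of "missing" games and the season-average Ratings, can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names are hypothetical, each scheduled game is assumed to be recorded as a (home, away) pair, and the team names and Ratings in the usage example are made up.

```python
from itertools import permutations
from statistics import mean


def missing_fixtures(teams, scheduled):
    """Return the (home, away) pairings absent from the actual schedule.

    A full home-and-away draw contains every ordered pair of distinct
    teams exactly once, so the "missing" games are whatever is left
    after removing the scheduled pairings.
    """
    full_draw = set(permutations(teams, 2))
    return sorted(full_draw - set(scheduled))


def season_average_ratings(rating_history):
    """Map each team to the mean of its Ratings for the season so far."""
    return {team: mean(ratings) for team, ratings in rating_history.items()}


if __name__ == "__main__":
    # Made-up three-team competition, for illustration only.
    teams = ["Team A", "Team B", "Team C"]
    scheduled = [("Team A", "Team B"), ("Team B", "Team A"),
                 ("Team A", "Team C"), ("Team C", "Team B")]
    print(missing_fixtures(teams, scheduled))
    print(season_average_ratings({"Team A": [1000, 1010],
                                  "Team B": [990, 1000]}))
```

Each pairing returned by the first function would then be assessed using the averages from the second.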
If we do that and then crunch all the numbers, we arrive at the following table:
The block of data on the left provides each team's current and season average MARS Ratings as well as the relevant data from the current competition ladder (my version, where teams don't get 4 points for the bye).
Next comes a block of data about each team's remaining two home-and-away games and then a summary of their "missing" games. The upper row for each team provides the names of the teams it will meet or, in the case of the "missing" games, should have met, at home; the lower row provides the same information for teams that will be or should have been met at an away venue.
Then come the key columns:
- the expected number of wins for each team over the course of the remainder of the season plus those it would have been expected to win from its "missing" games. This expectation is simply the sum of the team's victory probabilities in each game.
- the expected number of competition points that the team would have accrued had it played a full home-and-away schedule. This is the competition points it has already accumulated plus 4 times the expected number of wins from the previous column.
- the ranking of the team based on the expected number of competition points from the previous column.
- the team's current ladder position.
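The arithmetic behind the first three of these columns can be sketched as follows. This is a minimal Python illustration, not the spreadsheet I actually used: the class and function names are hypothetical, the teams and probabilities in the usage example are made up, and draws are ignored, as they are in the calculation described above.

```python
from dataclasses import dataclass
from typing import Dict, List

POINTS_PER_WIN = 4  # competition points awarded for a win


@dataclass
class TeamProjection:
    name: str
    current_points: int             # points already on the ladder
    win_probabilities: List[float]  # one per remaining or "missing" game

    @property
    def expected_wins(self) -> float:
        # Expected wins is simply the sum of the victory probabilities.
        return sum(self.win_probabilities)

    @property
    def expected_points(self) -> float:
        # Points already banked plus 4 for each expected win.
        return self.current_points + POINTS_PER_WIN * self.expected_wins


def rank_by_expected_points(teams: List[TeamProjection]) -> Dict[str, int]:
    """Return each team's rank (1 = best) on expected competition points."""
    ordered = sorted(teams, key=lambda t: t.expected_points, reverse=True)
    return {team.name: rank for rank, team in enumerate(ordered, start=1)}


if __name__ == "__main__":
    # Made-up figures, for illustration only.
    ladder = [
        TeamProjection("Team A", 56, [0.5, 0.5]),
        TeamProjection("Team B", 52, [0.9, 0.9, 0.9]),
    ]
    for team in ladder:
        print(team.name, team.expected_wins, team.expected_points)
    print(rank_by_expected_points(ladder))
```

Comparing the rank this produces with a team's current ladder position is what reveals any effect of the imbalance.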
And the conclusion? Only Carlton has anything of import to complain about, since the imbalance in the draw, according to my projections, has cost them the more-favourable fourth spot on the final ladder and allowed West Coast to snatch it. A quick look at the "missing" games for Carlton lends credence to this conclusion: four of the ten teams they fail to meet twice are the teams currently in ladder positions 14th through 17th.
Other than for the Blues, however, the imbalance in the draw is projected to have remarkably little effect. No team makes the final 8 that shouldn't and, in fact, every team in the 8 bar the Blues and the Eagles finishes in the ladder position it already holds.
Maybe I should be railing about the imbalance in the draw a little less often and less stridently ... but where's the fun in that?