Today's blog is motivated by a number of things, the first of which is alluded to in the title: the quantitative exploration of the contribution that a team's underlying class or skill makes to its success in a given game relative to its more recent, more ephemeral form. Is, for example, a top-rated team that's been a little out of form recently more or less likely to beat a less-credentialled team that's been in exceptional form?
In addition, I've been determined for some time now to create a model worthy of tipping and maybe even wagering attention that does not take as an input any bookmaker pricing data. That too is something I've addressed in this blog.
DATA AND APPROACH
Today we'll be fitting models to estimate:
- the home team's victory margin, and
- the home team's victory probability
For model-fitting I'll be using data for the period 2000 to 2013 (to the end of the Preliminary Finals). Both models will be assessed on a purely in-sample basis, but only the model fitted to margin data will be assessed across the entire time period. The model created to provide probability estimates will instead be assessed only against the data for the 2006 to 2013 period, as this will allow me to compare the model's probability assessments with those that can be inferred from TAB Bookmaker prices, for which purpose I only have trustworthy data for this shorter, 8-year timeframe.
I'll measure a team's "class" by its pre-game MARS Rating and, for the first time, I'll measure its "form" by the sum of its MARS Rating changes over the past X games. (On a slightly technical side-note, the MARS Ratings timeseries that I'll be using is the one where all teams started with a Rating of 1,000 in 1999, not the all-time Ratings. The differences are not large.) As well as including these MARS-derived variables for both teams as candidate regressors, I'll also include their Venue Experience, the Interstate Status of the clash and, mindful of recent blogs about how Finals are different from Home and Away season games, a binary variable denoting whether or not the game is a Final.
Since I've no a priori basis for determining the appropriate number of games over which to sum a team's recent MARS Rating changes - I don't know when form ends and class begins, if you like - I've decided to sum over the most recent 2, 3, 4, 5, 6, 8 and 10 games for each team, allowing those sums to span different seasons if necessary.
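To make the form-variable construction concrete, here's a minimal sketch of how those rolling sums of Rating changes might be built. The column names (`team`, `mars_rating`) and the function name are my own inventions for illustration, not MAFL's actual data layout; the only assumption carried over from the text is that "form over X games" is the sum of a team's MARS Rating changes across its X most recent completed games, excluding the current one.

```python
import pandas as pd

def add_form_columns(ratings: pd.DataFrame,
                     windows=(2, 3, 4, 5, 6, 8, 10)) -> pd.DataFrame:
    """ratings: one row per team per game, in chronological order,
    with columns 'team' and 'mars_rating' (the post-game Rating)."""
    out = ratings.copy()
    # The Rating change a team earned in each game
    out["mars_change"] = out.groupby("team")["mars_rating"].diff()
    for w in windows:
        # Sum of changes over the w most recent completed games,
        # shifted by one so the current game's result is excluded
        # (i.e. this is pre-game form). Sums span seasons naturally
        # because the series is just in chronological order.
        out[f"form_{w}"] = (
            out.groupby("team")["mars_change"]
               .transform(lambda s: s.rolling(w).sum().shift(1))
        )
    return out
```

The `shift(1)` is the important detail: without it the form variable would leak the result of the game being predicted.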
To build the models, I used the latest version of Eureqa, which allowed me to have the various regressors competing for inclusion in the least complex, best fitting models.
Following are the models that Eureqa provided.
The home team Predicted Margin model - for which I used the default Absolute Error metric in Eureqa, which essentially means that I'm building a model to minimise the mean absolute prediction error - is impressively simple.
It says that the best estimate of the home team's winning margin is 134, plus the sum of the home team's MARS Rating changes in its two most recent games, plus 12 if the game is an Interstate clash, plus about two-thirds of the home team's MARS Rating, less about 80% of the away team's.
Now a good win in the AFL against competent opposition will elevate a team's MARS Rating by about 2 or 3 Ratings Points. Two such games back-to-back will, according to this formula, add about 5 to 10 points to the home team's predicted victory margin (remembering that the Home_MARS_Rating variable will also increase as a result of these victories).
One interesting aspect of the model is that it includes only the MARS Ratings change variable based on the 2 most recent games. Any earlier results are, apparently, either not relevant to the current game or are sufficiently well-reflected in a team's current MARS Rating that they need no additional prominence. This finding is broadly consistent with the modelling work I did to create the algorithms for the Line and Head-to-Head Funds, where I found variables summarising each team's points for and against in their two most recent games to be worthy of inclusion. What's different, and appealing, about using the sum of changes in MARS Ratings instead is that it naturally adjusts for the quality of the opposition against which these Ratings changes have been achieved.
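The margin model as described above can be rendered as a one-line function. Note that the coefficients here (134, 12, two-thirds, 0.8) are the rounded values quoted in the text, so this is a back-of-envelope approximation of the fitted Eureqa model, not its exact output.

```python
def predicted_home_margin(home_mars, away_mars, home_form_2, interstate):
    """home_form_2: sum of the home team's MARS Rating changes over its
    two most recent games; interstate: 1 for an Interstate clash, else 0.
    Coefficients are the rounded values quoted in the text."""
    return (134
            + home_form_2
            + 12 * interstate
            + (2 / 3) * home_mars
            - 0.8 * away_mars)
```

For two evenly-Rated teams (1,000 apiece) with neutral form in a non-Interstate clash, this gives 134 + 666.7 - 800, or about 0.7 points: essentially a toss-up, with home ground advantage showing up mainly through the Interstate term.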
For the second model, where now I'm endeavouring to create well-calibrated victory probabilities for the home team, Eureqa saw fit (using a squared error metric) to include the same variables as in the margin model, but also added the binary variable flagging whether or not a game was a Final. This model suggests that the home team's recent form is only important in the home and away season; in Finals, class (and venue) is all that matters.
With this model as our guide, a team adding 6 Ratings Points (RPs) across its two most recent games will shorten its odds by a factor of exp(6 x (0.049 + 0.032)), which is about 1.6. So, for example, a team that would otherwise have been a 3/1 proposition would become about a 13/7 underdog instead.
This model also suggests that consideration of a team's form need extend back only to its two most recent games.
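The odds arithmetic in the example above is easy to verify. The sketch below simply reproduces that calculation; the coefficients 0.049 and 0.032 are as quoted in the text, and the multiplicative effect on odds follows from the model being logistic in form.

```python
from math import exp

# A 6 RP gain across the two most recent games multiplies the team's
# chances by exp(6 * (0.049 + 0.032)), as quoted in the text.
FORM_FACTOR = exp(6 * (0.049 + 0.032))   # roughly 1.6

def shortened_odds(odds_against, factor=FORM_FACTOR):
    """odds_against: decimal odds-against (e.g. 3.0 for a 3/1 shot).
    Shortening by 'factor' divides the odds-against."""
    return odds_against / factor

# A 3/1 proposition shortens to roughly 13/7 (about 1.85/1)
new_odds = shortened_odds(3.0)
```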
MEASURING THE FIT
I've not created a holdout sample with which to estimate the true generalisability of either of these models, so some caution should be exercised in relation to the results I'm about to present. Eureqa's inbuilt splitting of the datasets provided to it into test and training samples, coupled with my selection of models in the mid-range in terms of the complexity of those offered by Eureqa, should, however, have averted the most egregious forms of overfitting.
First, let's consider the Season-by-Season performance of the model created to predict game margins. Its mean absolute prediction error (MAPE) across the 14 seasons is 29.5 points per game, an undeniably strong performance, and its 2013 MAPE of 27.0 points per game would put it near the top of the MAFL Leaderboard - and that, bear in mind, with absolutely no bookmaker input.
Next consider the Season-by-Season performance of the model created to predict game probabilities. The figures shown here are log probability scores (actually 1 + log probability scores, the same metric as I use on the MAFL Leaderboard). For comparative purposes I've included the scores recorded by inferring the home team probability from the TAB Bookmaker's head-to-head prices adopting a Risk-Equalising approach to unwinding the overround in those prices (see the first diagram in this blog for the technical details of its calculation). As you can see, the model created by Eureqa compares most favourably to the TAB Bookmaker's performance.
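For readers unfamiliar with the two metrics just mentioned, here's a minimal sketch under my reading of the MAFL conventions: the Risk-Equalising approach removes the overround by subtracting an equal amount from each team's implied probability, and the log probability score is 1 plus the base-2 log of the probability assigned to the eventual winner. The function names are mine; see the linked blog for the authoritative derivation.

```python
from math import log2

def risk_equalising_prob(home_price, away_price):
    """Home team probability inferred from head-to-head prices,
    with overround unwound equally (additively) across both teams."""
    overround = 1 / home_price + 1 / away_price - 1
    return 1 / home_price - overround / 2

def log_prob_score(home_prob, home_won):
    """1 + log2 of the probability assigned to the actual winner.
    A coin-flip forecast (p = 0.5) scores exactly zero."""
    p = home_prob if home_won else 1 - home_prob
    return 1 + log2(p)
```

So, for example, home/away prices of $1.50/$2.50 carry a 6.7% overround and imply a Risk-Equalising home probability of about 63.3%, rather than the naive 66.7%.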
Eureqa, though rarely superior to the TAB Bookmaker in terms of log probability across an entire season, is mostly near enough to make home team only wagering profitable. (We've seen in previous posts that a punter need not be a superior probability assessor in comparison to the Bookmaker he or she faces in order to record positive returns to wagering; he or she needs only be not "too" inferior.)
The three columns on the right of the table, which I've calculated only for seasons 2006 and later, show the Eureqa model returning positive ROIs in 5 of the 8 seasons, and returning a +6% ROI across the entire period.
Again I'd stress that these results are achieved with a fitted model with no holdout, but they do offer some hope for a model with no bookmaker input.
We've had evidence before that MARS Ratings require a period of calibration in each season before they reach acceptable levels of accuracy for the purposes of wagering. So, here's the performance data for those same models on a Round-by-Round basis.
As foreshadowed we do see signs of weaker performance in the early rounds of the season for both models.
The model fitted to game margins records its second-worst performance of any round in Round 4, and records above-average MAPEs for most of the first 10 rounds of the season.
Interestingly, this model is also quite weak from about Round 19 onwards to the end of the home and away season, which is when, I'm increasingly coming to believe, teams' relative levels of motivation are at least as important as their relative class and recent form.
Turning, lastly, to the Eureqa model designed to provide home team victory probability assessments, we see that the model turns in its worst performances in a log probability scoring sense in Rounds 1 and 4, and one of its worst ROIs in Round 6. By refraining from wagering in the first six rounds of the season, the ROI from wagering based on this model's outputs more than doubles, to almost 13%.
The model also records some hefty losses in the dying rounds of the home and away season, which might also indicate that wagering in that part of the season requires a different approach, incorporating aspects of a team's finals aspirations as well as measures of its class and form.