Picking Winners - A Deeper Dive

Last blog I identified a banker's dozen of algorithms that I thought were worthy of further consideration for Fund honours next season.

Experience has taught me that troubling pathologies often lurk behind the attractive veneer of models with impressive historical ROIs. One form of that pathology is exhibited by models whose returns come mostly from a handful of bets, one or two of them especially fortuitous. Another manifests as a 'bet large, bet often' approach that would subject any human on the business end of such wagering to the punting equivalent of a ride on The Big Dipper that's just as likely to end with you 100 metres above the ground as 200 metres below it. The question to be answered in this blog, then, is: do any of the 11 algorithms I've identified this time show any such characteristics?

Here's the summary of each algorithm's performance using Model variant 3d. Its target variable is the (probability of) victory of the favourite, and its explanatory variables are: the MARS Rating difference between the favourite and the underdog; a binary variable that is 1 if the home team is also the higher MARS-ranked team; and the implicit probability of the favourite, Prob, transformed using the logit transformation, ln(Prob/(1-Prob)).


Row 1 of this table provides data on the performance of the Flexible Discriminant Analysis algorithm. Across 50 replications of this algorithm on the data for 2006-2009, 2 optimal variants emerged, one of which would have wagered on only 5 games if used in 2010, and the other of which would have wagered on only 33 games. The first variant made wagers averaging 3% of the Fund, produced an ROI of 48%, and backed winners at an average price of $1.85. The second made wagers averaging about 12% of the Fund, produced an ROI of 35%, and backed winners at an average price of $1.98 - in short, a more active, bolder and less profitable variant than the first.

The 'No' in the last column denotes that this algorithm, though profitable across the period 2006 to 2009 considered as a whole, was not profitable in each of those four seasons taken separately. Registering a negative in this column is not grounds for the outright dismissal of an algorithm - after all, the ROI optimisation was performed on the 4 seasons taken as a whole, not on each individual season - but a positive result here would certainly tick another metaphorical box.

The penultimate, Magic Number column is one on which I want to linger for a few paragraphs. As I've defined it, the Magic Number for an algorithm's results is the number of games whose outcomes, if reversed, would have dragged the algorithm's performance into the red.

The formula for this Magic Number is derived as follows:

Consider a Fund that makes W wagers of average size S, from which it generates a Return on Investment of ROI.

Now ROI is defined as Net Return / Total Outlay, which we can rearrange to give

Net Return = ROI x Total Outlay = ROI x W x S.

If we reverse the result of k winning wagers, each of size S and each placed at a price of $P, then the Net Return changes by -k x S x (P-1) - k x S (i.e. we forgo the profit we made on the k successful wagers and also lose the amounts we wagered on them), which simplifies to -k x S x P.

To drag the original Net Return below zero we need the size of this change to exceed the original Net Return, which means we need

k x S x P > ROI x W x S

or, rearranging (and recognising that S and P must both be positive, an important technicality to address when you're playing with inequalities, a consideration that I recall being a centrepiece of one or two maths lessons way back in Year 10), we need

k > (W x ROI) / P (in practice, we take the next higher integer value for k).

In words, the number of results that we need to reverse to turn an algorithm's profit into a loss is given by the number of wagers that the algorithm made divided by the price of the wagers whose result we're reversing, multiplied by the ROI that the algorithm produced.
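As a minimal sketch of that calculation, the Python below applies the inequality to the two FDA variants, using the wager counts, ROIs and average winning prices quoted earlier. The function name magic_number is purely illustrative, and the sketch assumes, as the derivation does, that every reversed wager is of average size and was won at the average winning price.

```python
import math

def magic_number(wagers, roi, price):
    """Smallest whole number of reversed winning wagers that turns a profit into a loss.

    Assumes all wagers are the same size and every reversed wager was won
    at the same price (the simplification used in the derivation).
    """
    threshold = wagers * roi / price
    # k must strictly exceed the threshold, so step up to the next integer.
    return math.floor(threshold) + 1

# Inputs are the FDA figures quoted in the text: wagers, ROI, average winning price.
fda_first = magic_number(wagers=5, roi=0.48, price=1.85)
fda_second = magic_number(wagers=33, roi=0.35, price=1.98)
print(fda_first, fda_second)  # 2 6
```

When the threshold is itself a whole number, reversing exactly that many results leaves the Net Return at zero rather than below it, which is why the sketch always steps up to the next integer.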

The larger k is, the greater the number of such reversals that would have been required to plunge the algorithm from profit into loss which, ceteris paribus, is a 'good thing'.

So, an algorithm's ROI is 'better' - less likely to be the result of random good luck, if you like - if it:

  • is higher
  • is the result of a larger rather than smaller number of wagers
  • was derived from wagering on teams at shorter average prices
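Those three properties can be sanity-checked by brute force. The sketch below builds a hypothetical ledger of equal-sized wagers, all won or lost at a single price, then flips winners into losers one at a time until the Net Return turns negative. The win and loss counts are invented for illustration and are not any algorithm's actual record.

```python
def reversals_to_loss(wins, losses, price, stake=1.0):
    """Count the winning wagers that must be flipped before Net Return drops below zero."""
    net = wins * stake * (price - 1) - losses * stake
    flipped = 0
    while net >= 0 and flipped < wins:
        net -= stake * price  # forgo the profit and lose the stake as well
        flipped += 1
    return flipped if net < 0 else None  # None: still not in the red with every winner flipped

# Hypothetical ledgers (wins, losses, price), one unit staked per wager.
baseline = reversals_to_loss(33, 27, 2.00)       # 60 wagers, 10% ROI
higher_roi = reversals_to_loss(36, 24, 2.00)     # 60 wagers, 20% ROI
more_wagers = reversals_to_loss(66, 54, 2.00)    # 120 wagers, 10% ROI
shorter_price = reversals_to_loss(44, 16, 1.50)  # 60 wagers, 10% ROI
print(baseline, higher_roi, more_wagers, shorter_price)  # 4 7 7 5
```

Each variation on the baseline ledger needs more reversals than the baseline itself does, in line with the three bullet points.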

Considering all the information I've laid down here about how the Flexible Discriminant Analysis (FDA) algorithm would have made money in 2010 - not least that, for one variant, changing only a couple of results would have turned its profit into a loss - the FDA algorithm is now not one that I'd consider putting in charge of my or anyone else's money.

(Bear in mind that a user of the FDA algorithm would not get to choose which variant he or she would be subjected to in any given season. The variant that emerged as the chosen one in each replicate depended on the particular data that was selected in cross-validating and tuning the algorithm. That selection is a random one and, a priori, there would be no basis on which to select a "better" sample - only once you knew the results would you know which variant you'd rather have.)

After FDA, the next three lines show the results for five other algorithms (two pairs returned exactly the same results) all of which I'd label as conservative wagerers as they bet on only about 60-70 games a season and bet only about 2.5-3% per wager. Each algorithm produced an 18% ROI for 2010.

Penalised Discriminant Analysis' (PDA's) Magic Number of 7 - which represents almost 10% of the total number of wagers it made - is large enough to provide some assurance about the reproducibility of its performance, though the 'No' in the final column acts as something of a counterweight to that feeling. For now, I'd be willing to keep PDA and LDA on the consideration list.

Next comes the data for the Quadratic Discriminant Analysis (QDA) algorithm. In 2010, its 12% ROI came from the combination of a 34 and 6 record betting on favourites at an average price of $1.12 and a 36 and 35 record betting on underdogs at an average price of $2.50. Its Magic Number of 11 is attractive, though its inability to generate a profit for every in-sample season - it made an 8% loss in 2006 - gives me some pause. Wagering one-sixth of its Fund each time, and that only on average, is clearly unacceptable, though of course this could be managed by using fractional Kelly rather than full Kelly wagering. So, QDA's a maybe too.
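To illustrate what fractional Kelly would do to bet sizes, here is a rough sketch using the standard Kelly formula for a wager at decimal price P with win probability p, namely f = (p x P - 1) / (P - 1). The function name, the kelly_multiplier parameter and the 50% win probability are assumptions made for the example; they are not taken from the QDA algorithm's output.

```python
def kelly_stake(prob_win, price, kelly_multiplier=1.0):
    """Fraction of the Fund to wager at a decimal price, scaled by kelly_multiplier."""
    edge = prob_win * price - 1.0  # expected profit per unit staked
    if edge <= 0:
        return 0.0  # no positive expectation, so no wager
    return kelly_multiplier * edge / (price - 1.0)

# Hypothetical wager: an assumed 50% chance of winning at a price of $2.50.
full = kelly_stake(0.5, 2.50)                        # full Kelly
half = kelly_stake(0.5, 2.50, kelly_multiplier=0.5)  # half Kelly
print(round(full, 3), round(half, 3))  # 0.167 0.083
```

Halving the multiplier halves every wager, which gives up some expected growth in exchange for a gentler ride.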

I'd rule out GAM Spline, mainly because the worst of its three variants has a Magic Number of just 4 and, as I noted parenthetically above, we don't get to choose the variant we wind up with.

GAM, GAM Loess and Conditional Inference Tree Forest I'd also rule out on the basis of small Magic Numbers. 

So that leaves PDA, LDA and QDA - all members of the Discriminant Analysis family - as contenders for now.