The Problem With Handling Not Out Scores In Cricket

In cricket, one measure of a batsman's worth is the number of runs the player scores per innings, which is known as the player's average. Calculating this statistic is straightforward for any batsman who has obligingly and unfailingly managed to be dismissed before the completion of the team's innings: it's a simple division of total runs scored by number of innings batted. It becomes problematic, though, as soon as a batsman has registered a "not out".

Treating a not out innings as equivalent to a completed innings seems unfair. In effect such treatment is equivalent to assuming that, had the team's innings continued, the batsman would have scored no further runs. That's a possible outcome, but surely not the most likely one (Chris Martin aside).

Tradition has it that, instead, a player's average will be calculated by dividing the total number of runs the player has scored by the number of completed innings he or she has had. Runs scored in innings where a player remains not out are therefore considered something of a bonus, and a player considered to be playing cautiously near the end of his or her team's innings in an effort to remain not out is sometimes accused of "playing for his (or her) average".
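To make the two conventions concrete, here's a minimal Python sketch. The function names and the four sample innings are made up for illustration; each innings is a (runs, was_dismissed) pair.

```python
def runs_per_innings(innings):
    """Total runs divided by ALL innings, treating not outs as completed."""
    total_runs = sum(runs for runs, _ in innings)
    return total_runs / len(innings)


def traditional_average(innings):
    """Total runs divided by completed (dismissed) innings only."""
    total_runs = sum(runs for runs, _ in innings)
    dismissals = sum(1 for _, was_dismissed in innings if was_dismissed)
    if dismissals == 0:
        return None  # the average is undefined until the first dismissal
    return total_runs / dismissals


# Hypothetical scores: 45, 12, 30 not out, 3
sample_innings = [(45, True), (12, True), (30, False), (3, True)]

print(runs_per_innings(sample_innings))     # 22.5
print(traditional_average(sample_innings))  # 30.0
```

The 30 not out adds to the numerator in both cases but only adds to the denominator in the first, which is why the traditional figure is higher.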

With a little maths we can answer a simple question: after a not out innings, how many extra runs would a player have needed to score before being dismissed in order to have the same average at the end of the innings?

First, let's define some terms: 

  • R0 is the number of runs the player had scored prior to the current innings
  • S is the not out score in the current innings
  • I is the number of completed innings for this player prior to the current innings

Using these terms, prior to the current innings the player's average would have been R0/I and, after the current innings, it became (R0+S)/I.

We want to know the value of E, the extra runs that he or she would have needed to score before being dismissed, so that (R0+S+E)/(I+1) = (R0+S)/I.

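Multiplying both sides by I(I+1) gives I(R0+S+E) = (I+1)(R0+S), which simplifies to I x E = R0+S. Solving for E therefore yields E = (R0+S)/I = R0/I + S/I, which is the player's average before the current innings plus S/I. Relative to a player's average, S/I will generally be small, so we can say that: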

The current method of incorporating a not out score in a player's average is equivalent to assuming that, had the innings continued, he or she would have added to the not out score roughly as many runs as his or her average prior to the innings before being dismissed.

So, for example, if a player with an average of 30 remains 27 not out at the end of the team's innings, as far as her average is concerned it's as if she batted on and scored 57 - the 27 she already had plus her average.
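The algebra can be checked numerically with a short Python sketch. The function name is mine, and the 300 runs from 10 completed innings are assumed figures chosen to give an average of 30; the example above doesn't say how many innings the player has had.

```python
from fractions import Fraction


def extra_runs_for_same_average(prior_runs, completed_innings, not_out_score):
    """Return E such that (R0 + S + E) / (I + 1) == (R0 + S) / I."""
    if completed_innings <= 0:
        raise ValueError("need at least one completed innings")
    # E = (R0 + S) / I, kept as an exact fraction to avoid rounding noise
    return Fraction(prior_runs + not_out_score, completed_innings)


# Assumed figures: 300 runs from 10 completed innings, then 27 not out
R0, I, S = 300, 10, 27
E = extra_runs_for_same_average(R0, I, S)

average_with_not_out = Fraction(R0 + S, I)
average_if_dismissed_later = (R0 + S + E) / (I + 1)

print(float(E))                                            # 32.7
print(average_with_not_out == average_if_dismissed_later)  # True
```

With only 10 completed innings the S/I term is 2.7 runs, so the implied final score is 59.7 rather than 57. The more completed innings a player has, the smaller S/I becomes and the closer the implied score gets to the not out score plus the prior average.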

How reasonable is this assumption in practice?

One obvious criticism is that players tend to have a period of additional vulnerability early in their innings and that the likelihood of their being dismissed diminishes with the length of their innings. If that's the case then effectively crediting them with their average if they remain 0 not out at the completion of their team's innings seems generous, and doing the same if they're 100 not out seems unfair.

Put another way, what this suggests is that the number of extra runs a player scores before being dismissed is a function of the runs he or she has already scored.

We can test this hypothesis by calculating conditional averages for the actual scores of some players. Here's Ricky Ponting's (as at 14 Feb 2011). 

Looking firstly at the numbers for Tests, we see that, in those innings where he is eventually dismissed, once Ricky has made 0 runs, on average he scores 46.3 runs more before he is dismissed. If we add the runs he scores in not out innings (and divide by the number of completed and not out innings), then this average increases to 47.7 runs.
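For anyone wanting to reproduce this kind of conditional average, here's a rough Python sketch of the calculation as I've described it. The function name and the sample scores are hypothetical and are not Ricky's actual data; each innings is a (runs, was_dismissed) pair.

```python
def conditional_average(innings, reached, include_not_outs=False):
    """Mean additional runs scored in innings where the score got to `reached`.

    By default only innings ending in a dismissal are counted. With
    include_not_outs=True, not out innings contribute both their extra
    runs and a count of one to the denominator.
    """
    qualifying = [
        runs - reached
        for runs, was_dismissed in innings
        if runs >= reached and (was_dismissed or include_not_outs)
    ]
    if not qualifying:
        return None  # no innings got this far
    return sum(qualifying) / len(qualifying)


# Hypothetical scores: 0, 12, 45, 30 not out, 103, 8
sample_innings = [(0, True), (12, True), (45, True),
                  (30, False), (103, True), (8, True)]

for reached in (0, 10, 30):
    dismissed_only = conditional_average(sample_innings, reached)
    with_not_outs = conditional_average(sample_innings, reached, include_not_outs=True)
    print(reached, dismissed_only, with_not_outs)
```

Note that a not out innings can only understate the extra runs the batsman might have gone on to make, so the second figure is best read as a lower bound for those innings.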

Perhaps more interestingly, once he gets to 30 he's good for another 54 or so runs.

One thing you'll notice here is that there's no evidence of Ricky scoring a larger number of additional runs before being dismissed once his score reaches about 10. This suggests that the current treatment of not out innings might not have much of a distortionary effect on player averages - well, on Ricky's at least.

Actually, to make that argument it's not enough for the number of additional runs scored to be roughly constant; we need it to be roughly constant at around Ricky's average.

And, for Tests, it is, give or take a run or two.

For ODIs, however, the treatment of not outs is almost certainly distortionary. Even choosing the most generous example, where Ricky has already scored 10, and including all of his not out innings, he is still only good for, on average, an extra 38 runs, not the 43 runs that he's effectively being credited with.

The problem, I think, stems from the inherently truncated nature of the limited overs format, which means that, if anything, a player is more likely to be dismissed as his innings progresses and the balls remaining dwindle - a factor that can be seen to some extent in Ricky's statistics for scores of 40 and over.

At this point you might well be wondering if these conclusions apply only to Ricky Ponting. Well, here's the same analysis for Sachin Tendulkar.

 

For both Tests and ODIs, the overall pattern of Sachin's data is similar to that for Ricky. There's certainly no evidence for the hypothesis that he's likely to score more additional runs the more he's already scored in either form of the game, once he's reached about 20.

Again we need to compare these numbers to his career statistics.

Here too we find for Sachin in Tests that his conditional average scores are just a few runs shy of his traditional average, but for ODIs the difference is much greater.

It would be an interesting exercise to complete this analysis for some other players of different styles and averages - perhaps a Sehwag, a Gilchrist or a Smith - but that's a task for another weekend.

In the meantime, my tentative conclusion would be that the current treatment of not out innings does little to distort a player's Test average but probably inflates his ODI average.