When comparing different batsmen, the statistic that is invariably brought out is the batting average. It is a fair enough indicator of a batsman's ability too, for it suggests the number of runs he scores per dismissal - Brian Lara makes 53 runs per dismissal to Ramnaresh Sarwan's 40, hence Lara is clearly a superior batsman.
While the usefulness of the average is inarguable, it has its limitations. For instance, it doesn't tell us how consistent a player is: a batsman who scores 0, 200, 25 has exactly the same average - 75 - as one who makes 70, 80, 75, though it's obvious which of the two has been more consistent.
Enter a statistical tool called the standard deviation. As the name
suggests, this method indicates how much a sequence of numbers deviates
from its average.
The problem with the average is that if
one of your legs is in an oven and the other in a freezer, then on
average you are comfortable. This is a key lesson in statistics - the
average means little if the standard deviation is very high. That's why a batsman who scores 200 in one innings and is out for a duck in each of the next three still has an excellent average of 50, yet he is far from consistent: his standard deviation is very high.
You'd obviously want greater consistency from a batsman, but check this
sequence out: 16, 15, 17, 20, 22, 14, 18. Mr X is obviously extremely
consistent - the standard deviation is only 2.61 - but at an average of
17.43, he isn't doing much to help the cause of his team.
In the two run-sequences given earlier (0, 200, 25 and 70, 80, 75), for
example, the second has a standard deviation of just 4.08, while for the
first it's a whopping 88.98.
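These figures are easy to check. The short Python sketch below is not part of the original analysis. It assumes the batsman is dismissed in every innings, so the average is a plain mean, and it uses the population form of the standard deviation (dividing by the number of innings), which is the form that reproduces the numbers quoted above. The helper name `summarise` is made up for illustration.

```python
from statistics import mean, pstdev

# Run sequences quoted in the article
sequences = {
    "0, 200, 25": [0, 200, 25],
    "70, 80, 75": [70, 80, 75],
    "Mr X": [16, 15, 17, 20, 22, 14, 18],
}

def summarise(scores):
    """Return (average, population standard deviation), rounded to 2 decimals.

    Assumption: every innings ends in a dismissal, so the batting
    average equals the arithmetic mean of the scores.
    """
    return round(mean(scores), 2), round(pstdev(scores), 2)

for label, scores in sequences.items():
    avg, sd = summarise(scores)
    print(f"{label}: average {avg}, SD {sd}")
```

Swapping `pstdev` for `stdev` (the sample form, dividing by one less than the number of innings) would give somewhat larger values, so it matters which form is used when comparing figures from different sources.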
A meaningful stat, then, is one which combines the batting average - for
that is an indication of the sheer volume of runs a batsman scores each
time he bats - with a consistency index which measures how much he deviates
from his average score. For the purpose of this exercise, the batting
average has been divided by the standard deviation to arrive at an
index. This batting index is the exact inverse of another statistical term, the coefficient of variation (CV), which is defined as the ratio of the standard deviation to the mean.
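As a rough illustration of the index and its relationship to the CV, here is a minimal Python sketch. The function names `batting_index` and `coefficient_of_variation` are hypothetical, and the sketch again assumes a dismissal in every innings and the population form of the standard deviation; the article does not say how its own figures were computed.

```python
from statistics import mean, pstdev

def batting_index(scores):
    """Average divided by (population) standard deviation."""
    sd = pstdev(scores)
    if sd == 0:
        # Identical scores in every innings: the ratio is undefined
        raise ValueError("standard deviation is zero; index is undefined")
    return mean(scores) / sd

def coefficient_of_variation(scores):
    """Standard deviation divided by the mean."""
    return pstdev(scores) / mean(scores)

steady = [70, 80, 75]
erratic = [0, 200, 25]

print(round(batting_index(steady), 2))   # 18.37
print(round(batting_index(erratic), 2))  # 0.84

# The index is the reciprocal of the CV, so their product is 1
print(round(batting_index(steady) * coefficient_of_variation(steady), 6))  # 1.0
```

Both batsmen average 75, but the steady one ends up with a far higher index, which is exactly the distinction the plain average cannot make.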
The table below lists the batsmen with the most favourable batting index among players with at least 5000 Test runs, and it's interesting to see who makes the cut. On top of the ranking is Jacques Kallis, the batting machine from South Africa.
Batsman | Runs | Average | SD | Batting index (Average / SD) |
---|---|---|---|---|
Jacques Kallis | 7940 | 56.31 | 44.54 | 1.26 |
Allan Border | 11174 | 50.56 | 40.49 | 1.25 |
Ken Barrington | 6806 | 58.67 | 47.36 | 1.24 |
Jack Hobbs | 5410 | 56.95 | 46.68 | 1.22 |
Arjuna Ranatunga | 5105 | 35.70 | 29.44 | 1.21 |
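The last column can be reproduced from the two columns before it. The snippet below is only a sanity check on the published figures, with the rows copied from the table above; the helper `index_from_summary` is a made-up name.

```python
# (batsman, average, standard deviation, published index), copied from the table
rows = [
    ("Jacques Kallis", 56.31, 44.54, 1.26),
    ("Allan Border", 50.56, 40.49, 1.25),
    ("Ken Barrington", 58.67, 47.36, 1.24),
    ("Jack Hobbs", 56.95, 46.68, 1.22),
    ("Arjuna Ranatunga", 35.70, 29.44, 1.21),
]

def index_from_summary(average, sd):
    """Batting index from an already-computed average and SD, to 2 decimals."""
    return round(average / sd, 2)

for name, average, sd, published in rows:
    computed = index_from_summary(average, sd)
    print(f"{name}: computed {computed}, published {published}")
```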
Several other star players did not make the top 10: Ricky Ponting (1.13), Rahul Dravid (1.12), Adam Gilchrist and Sourav Ganguly (both 1.10). Inzamam-ul-Haq manages an index of 1.07, while Sachin Tendulkar has 1.03, both slightly better than two stalwarts from the 1980s, Sunil Gavaskar and Viv Richards (both 1.02, rounded off to the second decimal).
Let's now lower the bar to 3000 runs and look at consistency alone. How many would have guessed that Shaun Pollock has the lowest standard deviation in this group? In fact, the top six are all lower-middle-order batsmen who have consistently bailed their teams out in crises. Their averages aren't so impressive, but the standard deviations indicate just how consistently they have performed.
Batsman | Runs | Average | SD |
---|---|---|---|
Shaun Pollock | 3406 | 31.25 | 23.44 |
Rodney Marsh | 3633 | 26.52 | 25.91 |
Richard Hadlee | 3124 | 27.17 | 26.31 |
Mark Boucher | 3357 | 29.97 | 26.65 |
Ian Healy | 4356 | 27.40 | 26.69 |
Jeff Dujon | 3322 | 31.94 | 29.01 |
[All stats are from Test cricket only]
Quick fact - Everyone knows that Don Bradman had a staggering average of 99.94, but his standard deviation of nearly 87 is also the highest (most inconsistent) among all batsmen with at least 3000 runs.