The Era Problem: Why Raw Averages Lie About Batters

Joe Root’s Test average is 50.2. Younis Khan’s was 52.1. On most cricket stats sites, that’s the end of the comparison: Khan edges it. But Root has spent most of his career in the 2015–2024 window, arguably the hardest era for batting in the Cricsheet data. Khan’s peak years, 2006–2012, were substantially kinder. Adjust for that, and the picture reverses.

This is the era problem. Raw averages accumulate across very different playing conditions, opposition quality, and pitch preparation standards. A 50 in 2022 is not the same as a 50 in 2007. The difference isn’t huge (we’re not talking about the Bradman-era gap), but across the batters we care most about, it’s enough to scramble the rankings.

What the model adjusts for

Our Bayesian model estimates each batter’s latent ability — the underlying scoring rate we’d expect if every innings were played in identical conditions. To do this, we estimate three correction factors from the data:

Era correction. We treat each calendar year as a random effect in the model, estimating how easy or hard it was to score runs across all batters in that year. Harder years depress averages; easier years inflate them. A batter who played entirely in hard years has their average adjusted upward.

Opposition strength. Facing the top-ranked attack at full strength is different from facing a depleted fifth-ranked team. We weight innings by a rolling opposition rating, derived from all-innings data, to avoid rewarding batters who feasted on weak attacks.

Venue adjustment. The effect of home pitch preparation is real and measurable. Our venue terms account for the fact that Kohli’s home average partly reflects Indian pitch preparation rather than purely Kohli’s skill.

Era correction by year · How much harder was each year?

Positive values = easier year (adjust downward). Negative = harder year (adjust upward). 2012–2014 baseline = 0.

Year	Era correction
2008	+2.8
2010	+1.9
2012	0.0
2015	−1.4
2018	−3.1
2021	−4.2
2023	−3.6

Estimated from all batters with ≥20 Test innings that year. Shaded band = 80% credible interval (not shown in prototype).

The era corrections above show a clear negative trend: Test cricket got materially harder for batters between 2008 and 2021. Plausible contributors include faster pitches, more reverse swing, more extreme seam movement in English conditions, and more aggressive pace attacks in Australia and South Africa. Whether this is temporary or structural is one of our open questions, but the direction of the trend is consistent across the years shown.

A 50 average in 2021 is worth roughly 53–54 on the 2012–2014 baseline. That’s the era gap we’re correcting for.
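The arithmetic behind that figure can be sketched in a few lines. This is a simplified reading, assuming the era corrections are additive and expressed in runs of average; the names `ERA_CORRECTION` and `to_baseline` are illustrative, and the full model also applies the opposition and venue terms.

```python
# Era corrections copied from the figures above (baseline years = 0).
# Assumption: corrections are additive, in runs of average.
ERA_CORRECTION = {
    2008: 2.8,
    2010: 1.9,
    2012: 0.0,
    2015: -1.4,
    2018: -3.1,
    2021: -4.2,
    2023: -3.6,
}


def to_baseline(raw_average: float, year: int) -> float:
    """Re-express an average scored in `year` on the baseline scale.

    Positive corrections mark easier years, so they are subtracted;
    negative corrections mark harder years, so subtracting them adds runs.
    """
    return raw_average - ERA_CORRECTION[year]


for year in (2008, 2021):
    print(year, round(to_baseline(50.0, year), 1))
```

Under this reading, a 50 in 2021 lands close to the upper end of the range quoted above, while a 50 in 2008 is marked down by a little under three runs.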

What changes — and what doesn’t

The adjustments are moderate, not revolutionary. We’re not claiming Root is actually averaging 65. What we are saying is that relative rankings are distorted by era in ways that are systematic and correctable.

The biggest upward movers are batters who played deep into the harder post-2018 era: Root (+4.5 pts), Kohli (+2.8 pts), and Labuschagne (+5.1 pts on a smaller sample). Batters whose peak came in the gentler 2006–2012 window are adjusted down slightly: Ponting (−1.2 pts), Kallis (−0.9 pts).

Model output

Raw vs adjusted averages — active and recent batters

Player	Raw avg	Adj avg	Δ	Rank (adj)
Steve Smith	58.6	56.1	−2.5	#1
Joe Root	50.2	54.7	+4.5	#2
Kane Williamson	54.9	53.4	−1.5	#3
Virat Kohli	49.1	51.9	+2.8	#4
Marnus Labuschagne	48.6	53.7	+5.1	—
Babar Azam	45.2	41.2	−4.0	—

Babar’s downward adjustment reflects favourable home conditions and opposition mix early in his career. His sample size (67 innings) also widens his credible interval (CI).
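As a consistency check on the table, the Δ column and the ranking can be recomputed from the raw and adjusted columns. In this sketch the `ranked` flag simply mirrors the rows that carry a dash in the rank column; the rule that produces those dashes is not modelled here, and the function names are illustrative.

```python
# Rows copied from the table above: (player, raw avg, adjusted avg, ranked?).
# `ranked` is False where the table shows a dash in the rank column.
ROWS = [
    ("Steve Smith", 58.6, 56.1, True),
    ("Joe Root", 50.2, 54.7, True),
    ("Kane Williamson", 54.9, 53.4, True),
    ("Virat Kohli", 49.1, 51.9, True),
    ("Marnus Labuschagne", 48.6, 53.7, False),
    ("Babar Azam", 45.2, 41.2, False),
]


def deltas(rows):
    """Return {player: adjusted - raw}, rounded to one decimal place."""
    return {name: round(adj - raw, 1) for name, raw, adj, _ in rows}


def adjusted_ranking(rows):
    """Order the ranked players by adjusted average, highest first."""
    eligible = [(adj, name) for name, _, adj, ranked in rows if ranked]
    return [name for adj, name in sorted(eligible, reverse=True)]


print(deltas(ROWS))
print(adjusted_ranking(ROWS))
```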

Babar Azam is the most interesting case in the current era. His raw average of 45.2 already places him in good company. But our model adjusts it downward: he has played a disproportionate share of home innings against weaker opposition, and the venue correction bites. We’d need another 40–50 away innings to tighten his credible interval enough to be confident of his rank among the game’s elite.

What we’re honest about

Bayesian models produce calibrated uncertainty, not truth. Our posteriors are conditional on the model structure being roughly right — and there are real structural choices we’ve made that reasonable people could argue with: which year to treat as baseline, how to model pitch conditions without explicit pitch ratings, whether to pool across formats.

We flag three limitations explicitly. First, the data starts in 2001, so pre-Cricsheet careers are excluded or truncated. Second, the opposition strength estimates are themselves uncertain, and we propagate that uncertainty only imperfectly. Third, for batters with fewer than 80 innings, the credible interval is wide enough that rankings are essentially noise.

Methodology — how the model works

Model structure: We fit a hierarchical Bayesian model in which each batter’s score in each innings is modelled as a draw from a geometric distribution (the natural model for batting averages when the probability of dismissal is constant from one run to the next). The latent dismissal probability for each batter–year–venue–opposition combination is the quantity of interest.
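To see why the geometric distribution is a natural fit, consider a minimal sketch in which a batter is dismissed with a fixed probability `p` before each run. Treating not-out innings as right-censored (an assumption of this sketch; the likelihood is not spelled out above), the maximum-likelihood estimate of `p` reproduces the conventional batting average, runs divided by dismissals. The function names and the sample innings are illustrative.

```python
import math

# Each innings is (runs, dismissed). Not-out innings are right-censored:
# we only know the batter would have scored at least `runs`.
Innings = tuple[int, bool]


def log_likelihood(p: float, innings: list[Innings]) -> float:
    """Geometric log-likelihood with per-run dismissal probability p.

    Dismissed on k runs: P = (1 - p)**k * p
    Not out on k runs:   P = (1 - p)**k   (survival function)
    """
    total = 0.0
    for runs, dismissed in innings:
        total += runs * math.log(1.0 - p)
        if dismissed:
            total += math.log(p)
    return total


def mle_dismissal_prob(innings: list[Innings]) -> float:
    """Closed-form maximiser of the likelihood: D / (R + D)."""
    total_runs = sum(runs for runs, _ in innings)
    dismissals = sum(1 for _, dismissed in innings if dismissed)
    return dismissals / (total_runs + dismissals)


def implied_average(p: float) -> float:
    """Mean of the geometric distribution: expected runs per dismissal."""
    return (1.0 - p) / p


sample = [(12, True), (85, False), (0, True), (43, True)]
p_hat = mle_dismissal_prob(sample)
print(round(implied_average(p_hat), 2))  # equals runs / dismissals = 140 / 3
```

The sketch needs at least one dismissal in the sample; a batter who has never been out has no finite average under this likelihood, which is one reason a hierarchical prior helps with small samples.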

Priors: Weakly informative priors centred on the overall average across all batters. Era terms use a random-walk prior that encodes the expectation that adjacent years are similar. Venue terms use a pooled prior across grounds in the same country.
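The effect of a random-walk prior on the era terms can be illustrated as follows. This is a sketch, assuming a first-order Gaussian random walk with a fixed step scale `sigma`; the order of the walk and the treatment of the scale are not stated above, and both example sequences are hypothetical.

```python
import math


def random_walk_log_prior(era_effects: list[float], sigma: float) -> float:
    """Log-density of the year-to-year steps under N(0, sigma**2).

    The first era term is left unconstrained (flat) in this sketch,
    so only the differences between adjacent years contribute.
    """
    log_norm = -0.5 * math.log(2.0 * math.pi * sigma**2)
    total = 0.0
    for previous, current in zip(era_effects, era_effects[1:]):
        step = current - previous
        total += log_norm - 0.5 * (step / sigma) ** 2
    return total


# Two hypothetical sequences of yearly era effects with the same endpoints.
smooth = [0.0, -0.5, -1.0, -1.5, -2.0]
jagged = [0.0, -3.0, 1.0, -4.0, -2.0]

smooth_score = random_walk_log_prior(smooth, sigma=1.0)
jagged_score = random_walk_log_prior(jagged, sigma=1.0)
print(smooth_score > jagged_score)
```

A sequence that drifts gradually scores a higher prior density than one that jumps between years, which is how the prior expresses that adjacent years should look alike without forcing them to be identical.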

Inference: MCMC via Stan, 4 chains × 2000 iterations. R-hat < 1.01 for all reported parameters. Credible intervals are HDI (highest density interval), not equal-tailed.
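The difference between an HDI and an equal-tailed interval is easiest to see on skewed draws. The sketch below computes both from a list of posterior samples in plain Python. It assumes a unimodal posterior, uses the 80% level quoted for the era chart as its default, and its function names are illustrative rather than part of Stan’s interface.

```python
import math


def hdi(samples: list[float], mass: float = 0.8) -> tuple[float, float]:
    """Narrowest interval containing roughly `mass` of the samples."""
    ordered = sorted(samples)
    n = len(ordered)
    window = max(1, math.ceil(mass * n))  # number of samples inside the interval
    best_start = min(
        range(n - window + 1),
        key=lambda i: ordered[i + window - 1] - ordered[i],
    )
    return ordered[best_start], ordered[best_start + window - 1]


def equal_tailed(samples: list[float], mass: float = 0.8) -> tuple[float, float]:
    """Interval that cuts the same fraction of samples from each tail."""
    ordered = sorted(samples)
    n = len(ordered)
    tail = (1.0 - mass) / 2.0
    lower = ordered[math.floor(tail * (n - 1))]
    upper = ordered[math.ceil((1.0 - tail) * (n - 1))]
    return lower, upper


# Hypothetical right-skewed draws: most mass near zero, a long upper tail.
draws = [0.1 * i for i in range(80)] + [10.0 + i for i in range(20)]

print(hdi(draws))
print(equal_tailed(draws))
```

On these draws the HDI hugs the dense region near zero, while the equal-tailed interval is pulled well into the sparse upper tail. For a roughly symmetric posterior the two intervals would be close to each other.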

Data: Cricsheet ball-by-ball data, 2001–present. Test matches only. DNB innings excluded from denominator. Retired-hurt treated as not-out. We update the model monthly.
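The innings-handling rules above can be sketched as a small aggregation step. The record layout and the status labels (`"out"`, `"not out"`, `"retired hurt"`, `"dnb"`) are hypothetical stand-ins for this example, not Cricsheet field names.

```python
from dataclasses import dataclass


@dataclass
class InningsRecord:
    runs: int
    status: str  # hypothetical labels: "out", "not out", "retired hurt", "dnb"


def summarise(records: list[InningsRecord]) -> dict[str, int]:
    """Count innings, dismissals and runs under the rules stated above."""
    innings = dismissals = runs = 0
    for record in records:
        if record.status == "dnb":
            continue  # did not bat: excluded entirely
        innings += 1
        runs += record.runs
        if record.status == "out":
            dismissals += 1
        # "not out" and "retired hurt" count as an innings without a dismissal
    return {"innings": innings, "dismissals": dismissals, "runs": runs}


records = [
    InningsRecord(64, "out"),
    InningsRecord(0, "dnb"),
    InningsRecord(31, "retired hurt"),
    InningsRecord(17, "not out"),
    InningsRecord(5, "out"),
]
print(summarise(records))
```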