Track record

Walk-forward backtest, 1999–2025. Each region-year is predicted by a model trained only on data available before that year: no in-sample fit, no look-ahead.
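A minimal sketch of the walk-forward idea, assuming the simplest possible split rule (train on every earlier year, predict the next). The function name, the `min_train_years` parameter and the toy years are illustrative assumptions, not CropIntel's actual pipeline.

```python
from typing import Iterator


def walk_forward_splits(
    years: list[int], min_train_years: int = 1
) -> Iterator[tuple[list[int], int]]:
    """Yield (training_years, target_year) pairs with no look-ahead.

    Hypothetical helper: each target year is paired only with the
    years strictly before it, so the target never leaks into training.
    """
    ordered = sorted(years)
    for target in ordered:
        train = [y for y in ordered if y < target]
        if len(train) >= min_train_years:
            yield train, target


# Toy input, not the real data range.
for train, target in walk_forward_splits([2000, 2001, 2002, 2003]):
    print(f"train on {train[0]}..{train[-1]}, predict {target}")
```

The first year is skipped because it has no earlier data to train on; raising `min_train_years` delays the first scored year further.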

How to read this page
Tradeable = a call the model made with medium or high confidence (i.e. a call a desk would actually take). Across the 231 scored region-years, predicted and actual yield anomaly correlate at r=0.316 (p<0.0001). Year-by-year and case-study breakdowns below.
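As a rough plausibility check on the quoted significance, the sketch below uses the Fisher z approximation for a correlation coefficient. The page does not say which test produced its p-value, so this is an assumed method, and it treats the 231 region-years as independent, which neighbouring regions in the same year may not be.

```python
import math


def fisher_z_p_value(r: float, n: int) -> float:
    """Two-sided p-value for H0: correlation = 0 (Fisher z approximation).

    Assumes independent observations; this is a normal approximation,
    not necessarily the test used on this page.
    """
    z = math.atanh(r) * math.sqrt(n - 3)
    # Two-sided tail of the standard normal: P(|Z| > z) = erfc(z / sqrt(2)).
    return math.erfc(abs(z) / math.sqrt(2))


p = fisher_z_p_value(r=0.316, n=231)
print(f"approximate p-value: {p:.2e}")
```

Under those assumptions the approximation lands well below the 0.0001 threshold quoted above.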

Backtest summary

62.3%: Tradeable hit rate (38/61)
92%: "Bad year" calls (24/26)
72%: High-confidence (13/18)
46%: All 231 region-years

The discipline is the product. Forced to call every region-year, the model would be right 46% of the time; instead it stays silent unless its components agree, and on the 61 calls it actually makes that rises to 62.3%. The bad-year calls, the ones that matter most for risk, run at 92%.
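The headline percentages follow directly from the counts quoted in the summary. A short sketch of that arithmetic, with a hypothetical `hit_rate` helper that returns `None` when no calls were made (the "no tradeable" case in the year-by-year table):

```python
from typing import Optional


def hit_rate(correct: int, calls: int) -> Optional[float]:
    """Percentage of calls that were correct, or None when no calls were made."""
    if calls == 0:
        return None
    return 100.0 * correct / calls


# (correct, calls) pairs as quoted in the backtest summary.
summary = {
    "tradeable": (38, 61),
    "bad_year": (24, 26),
    "high_confidence": (13, 18),
}

rates = {name: hit_rate(c, n) for name, (c, n) in summary.items()}
for name, rate in rates.items():
    print(f"{name}: {rate:.1f}%")
```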

Predicted vs actual

Every region-year in the walk-forward backtest plotted as one point. Top-right and bottom-left quadrants are direction-correct; top-left and bottom-right are direction-wrong. Dashed diagonal is perfect prediction.
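The quadrant rule can be sketched as a sign comparison. The points below are toy values, and excluding points that sit exactly on an axis is an assumption; the page does not say how the scored points are selected.

```python
def direction_correct(predicted: float, actual: float) -> bool:
    """True when both anomalies share a sign (top-right or bottom-left quadrant)."""
    return predicted * actual > 0


def direction_hits(points: list[tuple[float, float]]) -> tuple[int, int]:
    """Return (hits, scored), skipping points with a zero anomaly (assumption)."""
    scored = [(p, a) for p, a in points if p != 0 and a != 0]
    hits = sum(direction_correct(p, a) for p, a in scored)
    return hits, len(scored)


# Toy (predicted, actual) pairs, not taken from the backtest.
toy_points = [(0.4, 0.9), (-0.7, -1.1), (0.2, -0.5), (-0.3, 0.6), (0.0, 0.3)]
hits, scored = direction_hits(toy_points)
print(f"direction correct: {hits}/{scored}")
```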

[Scatter plot. Axes: Predicted yield anomaly (ensemble) and Actual yield anomaly, each from -2 to +2. Direction correct: 112/171 (65%).]

Year-by-year

Years with 0 tradeable calls are years in which the model said “no high-conviction view” for every region: a feature, not a missed run.

Crop year | Regions | Tradeable calls | Correct | Hit rate | Avg predicted anomaly | Avg actual anomaly
2003/04 | 11 | 7 | 3 | 43% | +0.36 | +0.23
2004/05 | 11 | 0 | 0 | no tradeable | +0.14 | +0.15
2005/06 | 11 | 2 | 1 | 50% | -0.01 | +0.22
2006/07 | 11 | 0 | 0 | no tradeable | +0.06 | -0.74
2007/08 | 11 | 2 | 1 | 50% | +0.42 | +0.46
2008/09 | 11 | 0 | 0 | no tradeable | +0.06 | -0.05
2009/10 | 11 | 2 | 0 | 0% | +0.15 | +0.02
2010/11 | 11 | 3 | 1 | 33% | +0.25 | +0.28
2011/12 | 11 | 1 | 1 | 100% | +0.06 | -1.96
2012/13 | 11 | 4 | 4 | 100% | -0.46 | -0.84
2013/14 | 11 | 2 | 2 | 100% | +0.23 | +0.89
2014/15 | 11 | 3 | 3 | 100% | +0.27 | +1.70
2015/16 | 11 | 2 | 1 | 50% | +0.21 | +0.02
2016/17 | 11 | 3 | 1 | 33% | -0.21 | +0.57
2017/18 | 11 | 1 | 1 | 100% | -0.04 | -0.32
2018/19 | 11 | 6 | 2 | 33% | -0.23 | +1.52
2019/20 | 11 | 9 | 8 | 89% | -0.73 | -1.12
2020/21 | 11 | 2 | 0 | 0% | -0.08 | +0.36
2021/22 | 11 | 0 | 0 | no tradeable | +0.14 | n/a
2022/23 | 11 | 3 | 1 | 33% | -0.18 | +0.38
2023/24 | 11 | 6 | 6 | 100% | -0.33 | -0.77
2024/25 | 11 | 3 | 2 | 67% | -0.17 | -0.44
2025/26 | 11 | 0 | 0 | no tradeable | -0.00 | n/a
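The yearly rows can be reconciled against the headline 38/61 with a few sums. The sketch below copies the tradeable-call and correct counts from the table; the variable names are illustrative.

```python
# (crop_year, tradeable_calls, correct) copied from the year-by-year table.
YEARS = [
    ("2003/04", 7, 3), ("2004/05", 0, 0), ("2005/06", 2, 1), ("2006/07", 0, 0),
    ("2007/08", 2, 1), ("2008/09", 0, 0), ("2009/10", 2, 0), ("2010/11", 3, 1),
    ("2011/12", 1, 1), ("2012/13", 4, 4), ("2013/14", 2, 2), ("2014/15", 3, 3),
    ("2015/16", 2, 1), ("2016/17", 3, 1), ("2017/18", 1, 1), ("2018/19", 6, 2),
    ("2019/20", 9, 8), ("2020/21", 2, 0), ("2021/22", 0, 0), ("2022/23", 3, 1),
    ("2023/24", 6, 6), ("2024/25", 3, 2), ("2025/26", 0, 0),
]

total_calls = sum(calls for _, calls, _ in YEARS)
total_correct = sum(correct for _, _, correct in YEARS)
# Years in which the model made no tradeable call at all.
silent_years = [year for year, calls, _ in YEARS if calls == 0]

print(f"{total_correct}/{total_calls} correct, {len(silent_years)} silent years")
```

The totals match the tradeable hit rate in the summary, and two crop years (2019/20 and 2023/24) contribute 14 of the 38 correct calls.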

Case studies

2019 (crop year 2019/20): disaster year, 8 of 9 correct. The model called the wet autumn drilling and hot June flowering stress correctly across the wheat belt. UK average yield fell to 7.0 t/ha, the worst since 2012.

2023 (crop year 2023/24): wet harvest, 6 of 6 correct. Compound flowering and ripening stress flagged the below-average yield, with the strongest signal in Eastern.

2018 (crop year 2018/19): where we got it wrong, 2 of 6 correct. The model called bearish on heat stress; UK wheat actually benefited from unusually low disease pressure that year. The miss is the strongest argument for the sentiment layer, which catches farmer reports of disease pressure in real time.

Live forward calls

This is the forward-prediction log: the calls CropIntel is making now, before the harvest confirms them. Unlike the backtest above, nobody knows the outcome yet. Harvest outcomes populate against each crop year as DEFRA confirms them (typically late August). This is the artefact to watch: a public, timestamped record of calls made ahead of the event. The public forward log began on ; the 2026 harvest is its first live, independently verifiable validation milestone (the backtest above covers 2003–2025). It accrues one harvest at a time, which is precisely the part a new entrant cannot compress.

As of | Crop year | National call | Confidence | Compound stress
2026-05-26 | 2025/2026 | average | low | -0.04
2026-05-24 | 2025/2026 | Above average | medium | +0.05
2026-05-17 | 2025/2026 | Above average | high | +0.28
2026-05-10 | 2025/2026 | Above average | high | +0.33
2026-05-03 | 2025/2026 | Above average | medium | +0.42

One row per week, newest first. Today's full per-region breakdown: today's call.
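One way to represent a row of this log is an immutable record, so a published call cannot be edited after the fact. The `ForwardCall` class and its field names are hypothetical; the values are copied from the table above, and the `tradeable` rule follows the medium-or-high definition given earlier on this page.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)  # frozen: a logged call cannot be mutated later
class ForwardCall:
    as_of: date
    crop_year: str
    national_call: str
    confidence: str
    compound_stress: float

    @property
    def tradeable(self) -> bool:
        """Medium- or high-confidence calls count as tradeable."""
        return self.confidence in {"medium", "high"}


LOG = [
    ForwardCall(date(2026, 5, 26), "2025/2026", "average", "low", -0.04),
    ForwardCall(date(2026, 5, 24), "2025/2026", "Above average", "medium", 0.05),
    ForwardCall(date(2026, 5, 17), "2025/2026", "Above average", "high", 0.28),
    ForwardCall(date(2026, 5, 10), "2025/2026", "Above average", "high", 0.33),
    ForwardCall(date(2026, 5, 3), "2025/2026", "Above average", "medium", 0.42),
]

# The log is listed newest first, so each date must be later than the next one.
is_newest_first = all(a.as_of > b.as_of for a, b in zip(LOG, LOG[1:]))
tradeable_calls = [call for call in LOG if call.tradeable]
print(is_newest_first, len(tradeable_calls))
```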
