Track record
Backtest summary
The discipline is the product. Forced to call every region-year, the model would be right 46% of the time. Instead it stays silent unless its components agree, and on the 61 calls it actually makes, the hit rate rises to 62.3% (38 of 61). The bad-year calls, the ones that matter most for risk, run at 92%.
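The abstention rule can be pictured with a minimal sketch. This is an illustration, not CropIntel's implementation: the function name `gated_call`, the representation of each component as a signed anomaly score, and the unanimous-sign agreement rule are all assumptions, since this page does not say what the components are or how agreement is measured.

```python
from typing import Optional, Sequence


def gated_call(component_signals: Sequence[float]) -> Optional[str]:
    """Return a direction only when every component agrees on sign.

    Hypothetical rule: each component emits a signed score
    (positive = above average, negative = below average).
    Any disagreement, or any zero score, yields None ("no tradeable call").
    """
    if not component_signals:
        return None
    if all(s > 0 for s in component_signals):
        return "above"
    if all(s < 0 for s in component_signals):
        return "below"
    return None


# Components agree: a call is made.
print(gated_call([0.4, 0.2, 0.1]))
# Components disagree: the model stays silent.
print(gated_call([0.4, -0.2, 0.1]))
```

The trade-off is the one described above: a stricter gate produces fewer calls, and the hit rate is then measured only on the calls that pass it.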
Predicted vs actual
Every region-year in the walk-forward backtest is plotted as one point. Points in the top-right and bottom-left quadrants are direction-correct; points in the top-left and bottom-right are direction-wrong. The dashed diagonal marks perfect prediction.
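The quadrant reading of the chart amounts to a sign comparison, sketched below. The function name `direction_correct` is hypothetical, and treating a point that lies exactly on an axis (an anomaly of 0) as not correct is an assumption; the page does not state how such ties are handled.

```python
def direction_correct(predicted: float, actual: float) -> bool:
    """True when predicted and actual anomalies share a sign.

    Same sign means the point falls in the top-right or bottom-left
    quadrant. Assumption: a value of exactly 0 on either axis is
    treated as not direction-correct.
    """
    return predicted * actual > 0


# Illustrative inputs only, not taken from the backtest data.
print(direction_correct(-0.5, -1.2))  # both below average
print(direction_correct(-0.5, 0.8))   # predicted below, actual above
```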
Year-by-year
Years with 0 tradeable calls are years in which the model reported “no high-conviction view” for every region. That is a feature, not a missed run.
| Crop year | Regions | Tradeable calls | Correct | Hit rate | Avg predicted anomaly | Avg actual anomaly |
|---|---|---|---|---|---|---|
| 2003/04 | 11 | 7 | 3 | 43% | +0.36 | +0.23 |
| 2004/05 | 11 | 0 | 0 | no tradeable | +0.14 | +0.15 |
| 2005/06 | 11 | 2 | 1 | 50% | -0.01 | +0.22 |
| 2006/07 | 11 | 0 | 0 | no tradeable | +0.06 | -0.74 |
| 2007/08 | 11 | 2 | 1 | 50% | +0.42 | +0.46 |
| 2008/09 | 11 | 0 | 0 | no tradeable | +0.06 | -0.05 |
| 2009/10 | 11 | 2 | 0 | 0% | +0.15 | +0.02 |
| 2010/11 | 11 | 3 | 1 | 33% | +0.25 | +0.28 |
| 2011/12 | 11 | 1 | 1 | 100% | +0.06 | -1.96 |
| 2012/13 | 11 | 4 | 4 | 100% | -0.46 | -0.84 |
| 2013/14 | 11 | 2 | 2 | 100% | +0.23 | +0.89 |
| 2014/15 | 11 | 3 | 3 | 100% | +0.27 | +1.70 |
| 2015/16 | 11 | 2 | 1 | 50% | +0.21 | +0.02 |
| 2016/17 | 11 | 3 | 1 | 33% | -0.21 | +0.57 |
| 2017/18 | 11 | 1 | 1 | 100% | -0.04 | -0.32 |
| 2018/19 | 11 | 6 | 2 | 33% | -0.23 | +1.52 |
| 2019/20 | 11 | 9 | 8 | 89% | -0.73 | -1.12 |
| 2020/21 | 11 | 2 | 0 | 0% | -0.08 | +0.36 |
| 2021/22 | 11 | 0 | 0 | no tradeable | +0.14 | n/a |
| 2022/23 | 11 | 3 | 1 | 33% | -0.18 | +0.38 |
| 2023/24 | 11 | 6 | 6 | 100% | -0.33 | -0.77 |
| 2024/25 | 11 | 3 | 2 | 67% | -0.17 | -0.44 |
| 2025/26 | 11 | 0 | 0 | no tradeable | -0.00 | n/a |
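The headline hit rate can be reproduced from the table by pooling calls across years instead of averaging the per-year percentages. The sketch below copies the "Tradeable calls" and "Correct" columns from the rows above; the variable names are illustrative.

```python
# (tradeable calls, correct) per crop year, 2003/04 through 2025/26,
# copied from the year-by-year table.
rows = [
    (7, 3), (0, 0), (2, 1), (0, 0), (2, 1), (0, 0), (2, 0), (3, 1),
    (1, 1), (4, 4), (2, 2), (3, 3), (2, 1), (3, 1), (1, 1), (6, 2),
    (9, 8), (2, 0), (0, 0), (3, 1), (6, 6), (3, 2), (0, 0),
]

total_calls = sum(calls for calls, _ in rows)
total_correct = sum(correct for _, correct in rows)
silent_years = sum(1 for calls, _ in rows if calls == 0)

# Pooled hit rate: every call carries equal weight, so a 9-call year
# counts for more than a 1-call year.
hit_rate_pct = round(100 * total_correct / total_calls, 1)

print(total_calls, total_correct, hit_rate_pct, silent_years)
# → 61 38 62.3 5
```

Pooling matters here because several years contain only one or two calls, and a per-year average would give those years the same weight as a year with nine.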
Case studies
2019/20: disaster year, 8 of 9 correct. The model called the wet autumn drilling and hot June flowering stress correctly across the wheat belt. UK average yield fell to 7.0 t/ha, the worst since 2012.
2023/24: wet harvest, 6 of 6 correct. Compound flowering and ripening stress flagged the below-average yield, with the strongest signal in Eastern.
2018/19: where we got it wrong, 2 of 6 correct. The model called bearish on heat stress; UK wheat actually benefited from unusually low disease pressure that year. The miss is the strongest argument for the sentiment layer, which catches farmer reports of disease pressure in real time.
Methodology disclaimers
- Walk-forward only. Each year's call uses a model trained on prior years only.
- 2022 yield data is missing from the source DEFRA dataset; that year is excluded from the hit-rate calculation rather than counted as a miss.
- Hit rate is direction-only (above / below average). Magnitude correlation is reported separately (r=0.316, p<0.0001).
- Sentiment overlay currently affects displayed confidence, not the direction call itself. Future architecture iterations will integrate sentiment as a feature column once forward sentiment data accumulates.
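The first three disclaimers can be sketched in code. This is a minimal illustration under stated assumptions, not the pipeline behind the numbers above: the names `walk_forward_splits` and `direction_hit_rate`, the `min_train` parameter, and the use of `None` for a missing actual are all hypothetical.

```python
from typing import Iterator, List, Optional, Sequence, Tuple


def walk_forward_splits(
    years: Sequence[int], min_train: int = 1
) -> Iterator[Tuple[List[int], int]]:
    """Yield (training years, test year) pairs.

    Each test year is predicted by a model that has seen only
    strictly earlier years, so no future data leaks into a call.
    """
    ordered = sorted(years)
    for i in range(min_train, len(ordered)):
        yield ordered[:i], ordered[i]


def direction_hit_rate(
    calls: Sequence[Tuple[str, Optional[float]]]
) -> Tuple[int, int]:
    """Return (correct, scored) for direction-only scoring.

    Each item is (called direction, actual anomaly). A missing actual
    (None) is skipped: it is excluded, not counted as a miss.
    """
    correct = scored = 0
    for direction, actual in calls:
        if actual is None:
            continue
        scored += 1
        if (direction == "above" and actual > 0) or (
            direction == "below" and actual < 0
        ):
            correct += 1
    return correct, scored


for train, test in walk_forward_splits([2003, 2004, 2005]):
    print(train, "->", test)

# Illustrative calls only; the middle one has no confirmed outcome.
print(direction_hit_rate([("above", 0.5), ("below", None), ("below", 0.2)]))
```

Only direction is scored in this sketch, so a call of "below" against an actual of -0.01 counts the same as one against -2.0. That is why the magnitude correlation is reported as a separate figure.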
Live forward calls
This is the forward-prediction log, the call CropIntel is making now, before the harvest confirms it. Unlike the backtest above, nobody knows the outcome yet. Harvest outcomes populate against each crop year as DEFRA confirms them (typically late August). This is the artefact to watch: a public, timestamped record of calls made ahead of the event. The public forward log began on ; the 2026 harvest is its first live, independently-verifiable validation milestone (the backtest above covers 2003-2025). It accrues one harvest at a time, which is precisely the part a new entrant cannot compress.
| As of | Crop year | National call | Confidence | Compound stress |
|---|---|---|---|---|
| 2026-05-26 | 2025/26 | Average | low | -0.04 |
| 2026-05-24 | 2025/26 | Above average | medium | +0.05 |
| 2026-05-17 | 2025/26 | Above average | high | +0.28 |
| 2026-05-10 | 2025/26 | Above average | high | +0.33 |
| 2026-05-03 | 2025/26 | Above average | medium | +0.42 |
One row per week, newest first. Today's full per-region breakdown: today's call.