2018, where we got it wrong, 2 of 6 correct

Crop year 2017/18 · published 2026-05-02

The cleanest miss in the backtest
The walk-forward model averaged a −0.23 anomaly across the regions in 2017/18, a mildly bearish call driven by hot, dry conditions through mid-summer that historically signal heat stress at flowering. The actual outcome was +1.52 anomaly, comfortably above-average. 6 tradeable calls, only 2 correct (33%).

By the numbers

-0.23
Predicted anomaly (t/ha), below-average
+1.52
Actual anomaly (t/ha), above-average
2 of 6
Tradeable regions correct

Walk-forward, the model was trained only on years before this one. Every figure is a plotted point on the track-record scatter.

What the model expected

2018 had the famous "Beast from the East" cold spring followed by a record-hot, dry summer. The model's compound stress score registered the heat as a stressor, historical analogues from 1995, 2003, and 2010 showed UK wheat yields softening when June and July temperatures pushed above the 30°C threshold during flowering and grain fill.

Six regions crossed the model's high- or medium-confidence threshold on the bearish side, including most of the eastern wheat belt. Average predicted anomaly: −0.23 t/ha vs trend.

What actually happened

UK wheat had its best harvest in years. Average yield came in at 8.0 t/ha vs the ~7.5 t/ha five-year average, about +1.5 t/ha above the model's prediction.

The dry conditions helped UK wheat that year, not hurt it, because:

Why the model missed

The compound stress score is calibrated to weather features. It doesn't have a feature representing disease pressure or low-disease bonus, both of which are real phenomena that meaningfully affect UK wheat outcomes. A dry summer in the model's training set was a stressor on average; in 2018 specifically it was a benefit, because the offsetting low-disease bonus dominated.

The miss is unfixable for historical years (no historical sentiment data going back). But it directly motivates the sentiment layer, by reading real-time farmer reports of disease pressure, the system has a feature that catches exactly what the compound-stress score is blind to.

Honest takeaway

Three things worth saying directly about this miss:

  1. The model is calibrated to compound weather stress and is genuinely blind to low-disease years. Anyone using the system needs to know that. We don't claim accuracy we don't have.
  2. The sentiment layer is the architectural answer. Going forward, if farmers en masse report low disease pressure (which they did in 2018), the sentiment overlay multiplier pushes the displayed confidence below 1.0, making the call less confident even when the weather features alone say bearish.
  3. The 33% hit rate in 2018 is the kind of miss that proves the rest of the backtest is not in-sample-fit. A model that had peeked at the answers wouldn't show a miss this clean.

The economic benefit, by user type

2018 is the case that proves the value isn't a single number, it's a transparent, auditable system whose limits are stated. Here's how each user type should read a miss:

Grain merchant. This is the year the signal would have misled you, a bearish call into a bumper harvest. The honest lesson for a buyer: a single call is never a trade on its own. The published, walk-forward track record exists precisely so you can size how often this happens (it's the rare case), and the consensus filter withholds low-conviction calls rather than dressing them up.
Crop insurer. A reserving decision taken on this call alone would have over-reserved. The value to an underwriter isn't blind trust, it's an independent, auditable signal weighed alongside your own models, with its blind spots (low-disease years) documented up front.
Commodity desk. A desk that had read the methodology would have known the model is blind to the low-disease bonus and discounted a heat-stress-driven bearish call accordingly. Knowing what a signal can't see is itself worth paying for.

Illustrative, the figures show the magnitude and direction of the decision, not a guaranteed return. The model is directional and probabilistic (62.3% walk-forward). See who this is for for per-segment detail.

Who this is relevant to

CropIntel's signal, and the underlying system (available to acquire), is relevant across the UK and European arable value chain:

Representative firms by segment, illustrating breadth, not claimed clients or relationships. If your firm is on this map and the track record is interesting, start a conversation.

Related: All case studies · Methodology · 2019 (right) · 2014 bumper year (right, bullish) · 2023 (right)