
Quant

Your backtest is lying to you

2026-06-06 · 6 min read

Here's an uncomfortable truth about testing trading strategies on history: it is trivially easy to build one that looks brilliant on the past and is worthless going forward. Not because you cheated — because you tuned too hard. This is overfitting, and spotting it is most of the job.

Grading the exam with the answer key

Suppose you tweak a strategy's rules — thresholds, indicators, holding times — until its results on your historical data look great. Then you test it on that same data and declare victory. You've just graded the exam using the answer key. Of course it scored well; you optimised it for exactly those questions.

Overfitting is fitting the noise instead of the signal. Every extra parameter is another knob you can turn to hug the random wiggles of the past. Enough knobs and you can "explain" anything — including pure luck.
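To make the "enough knobs" point concrete, here is a minimal sketch in Python. Everything in it is an assumption made for illustration: the toy rule `rule_return`, its three parameters (`lag`, `threshold`, `direction`) and the simulated returns are hypothetical and not taken from any real strategy. The data is pure random noise, so there is no edge to find, yet picking the best of 100 parameter combinations still produces a winner.

```python
import random
from itertools import product

def rule_return(returns, lag, threshold, direction):
    """Mean per-period return of a toy rule (hypothetical, for illustration):
    if the return `lag` periods ago was above `threshold`, hold `direction`
    (+1 long, -1 short); otherwise hold the opposite position."""
    total = 0.0
    for t in range(lag, len(returns)):
        position = direction if returns[t - lag] > threshold else -direction
        total += position * returns[t]
    return total / (len(returns) - lag)

rng = random.Random(0)  # fixed seed so the run is repeatable
noise = [rng.gauss(0.0, 0.01) for _ in range(500)]  # pure noise: no edge to find

# Three knobs: 10 lags x 5 thresholds x 2 directions = 100 combinations.
grid = list(product(range(1, 11), (-0.01, -0.005, 0.0, 0.005, 0.01), (1, -1)))
scores = {params: rule_return(noise, *params) for params in grid}

best_params = max(scores, key=scores.get)
best_score = scores[best_params]
average_score = sum(scores.values()) / len(scores)

print(f"combinations tried: {len(grid)}")
print(f"best tuned rule:    {best_params} -> {best_score:+.5f} per period")
print(f"average rule:       {average_score:+.5f} per period")
```

Each rule appears in the grid with both directions, so the scores cancel in pairs and the average over the grid is zero by construction. The "best" rule still comes out ahead, and that gap is produced entirely by selection, not by anything in the data.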

The fix: hold data back

[Figure: In-sample versus out-of-sample. Two curves, "tuned backtest" and "what's real", plotted across an in-sample region (you tune here) and an out-of-sample region (test once tuning stops).]
Both curves look identical while you're tuning. Only the untouched period shows which was real.

The defence is discipline about data. Split your history in time: tune on an earlier in-sample period, then test once on a later out-of-sample period you never looked at while tuning. If the edge survives there, it might be real. If it evaporates, you were fitting noise. The "once" matters: if you go back and retune after seeing the out-of-sample result, that period has quietly become in-sample too. Walk-forward testing repeats the tune-then-test cycle, rolling both windows forward through time, so you're not betting everything on one lucky split.
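The split and the walk-forward scheme described above can be sketched in a few lines of Python. The function names (`chronological_split`, `walk_forward_windows`) and the window sizes are assumptions chosen for illustration; real implementations vary in details such as whether the tuning window expands or rolls.

```python
def chronological_split(series, holdout):
    """Split a time-ordered series into (in_sample, out_of_sample),
    holding back the most recent `holdout` observations. No shuffling."""
    if not 0 < holdout < len(series):
        raise ValueError("holdout must be between 1 and len(series) - 1")
    return series[:-holdout], series[-holdout:]

def walk_forward_windows(n, train_size, test_size):
    """Yield (train, test) index ranges over n observations.
    Each test window sits immediately after its train window, and both
    roll forward by test_size so that test windows never overlap."""
    start = 0
    while start + train_size + test_size <= n:
        train = range(start, start + train_size)
        test = range(start + train_size, start + train_size + test_size)
        yield train, test
        start += test_size

# Example with ten observations, labelled 0..9 in time order.
in_sample, out_of_sample = chronological_split(list(range(10)), holdout=3)
windows = list(walk_forward_windows(n=10, train_size=4, test_size=2))
for train, test in windows:
    print(f"tune on {list(train)}, then test once on {list(test)}")
```

In this sketch every test index is later than every index in its own tuning window, which is the property that keeps the test honest. The tuning for each window would happen only on `train`, with `test` evaluated a single time.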

Rule of thumb

If an edge only shows up in the exact data you tuned it on, it isn't an edge. It's a memory.

Warning signs you're fooling yourself

  • Too many parameters relative to the number of trades — you have more knobs than evidence.
  • Fragility: nudge a threshold slightly and the great result collapses. Real edges are usually robust to small changes.
  • One lucky year carrying the whole track record — strip it out and there's nothing left.
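The three warning signs above can each be turned into a quick check. The sketch below is one possible way to do it in Python, not a standard recipe: the helper names are hypothetical, and the numbers in the example are made up purely to show the mechanics.

```python
def trades_per_parameter(n_trades, n_params):
    """Rough evidence-to-knobs ratio: how many trades back each tuned parameter."""
    return n_trades / n_params

def neighbour_scores(score_fn, threshold, step):
    """Score at the tuned threshold and one step either side.
    A lone spike at the tuned value suggests fragility."""
    return {t: score_fn(t) for t in (threshold - step, threshold, threshold + step)}

def without_best_year(yearly_returns):
    """Return (best_year, total of all other years)."""
    best_year = max(yearly_returns, key=yearly_returns.get)
    rest = sum(r for year, r in yearly_returns.items() if year != best_year)
    return best_year, rest

# Made-up illustrative inputs, not real results.
ratio = trades_per_parameter(n_trades=40, n_params=8)

def spiky_score(threshold):
    # Hypothetical backtest score that is only good at exactly 20.
    return 10 if threshold == 20 else 0

nearby = neighbour_scores(spiky_score, threshold=20, step=1)

yearly_pct = {2019: -2, 2020: 45, 2021: 1, 2022: -3}  # returns in percent
best_year, rest_pct = without_best_year(yearly_pct)

print(f"trades per parameter: {ratio}")
print(f"scores around the tuned threshold: {nearby}")
print(f"best year {best_year}; all other years sum to {rest_pct}%")
```

None of these checks proves an edge is real. They only flag results that deserve more suspicion: few trades per knob, a score that vanishes one step away from the tuned value, or a track record that turns negative once a single year is removed.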

None of this makes backtesting useless — it makes it honest. The goal was never a pretty equity curve on the past. It's a strategy whose edge is still standing on data it has never seen.

