The Illusion of Simulated Incubation, Part 2: A Quick Python Demonstration

May 11, 2026

In Part 1 I argued that simulated incubation (declaring the last 12 months of data an out-of-sample window “because we could have designed the strategy a year ago”) inverts the very property that makes incubation informative: the temporal asymmetry of knowledge. I broke the bias into four channels: implicit look-ahead, publication selection, retroactively informed design, and absence of commitment.

In this part I show that the bias is not abstract. A 15-line Monte Carlo simulation demonstrates that the procedure produces a systematic Sharpe signal of ~0.78 on data with zero true alpha by construction. The signal is pure selection artifact, but it’s indistinguishable to the researcher from genuine forward evidence.

The setup

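I generate $N = 500$ “strategies,” each a Gaussian i.i.d. noise process with zero mean and daily volatility $\sigma = 1/\sqrt{252}$. Three time windows:

in-sample: $T_{IS} = 2250$ days (≈ 9 years)
incubation: $T_{INC} = 250$ days (1 year)
true out-of-sample: $T_{OOS} = 250$ days (1 year)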

True alpha is identically zero for every strategy and every window — by construction.

I compare two selection processes:

Honest process. The researcher never saw the incubation window during design. Selects the top-10 strategies by in-sample Sharpe. The incubation window and the true-OOS window are both genuine forward observations.
Simulated incubation. The researcher selects the top-10 strategies by maximizing Sharpe on the combined in-sample + incubation window — the cleanest formal representation of the implicit contamination described in Part 1.

The code

import numpy as np
np.random.seed(42)

N, T_IS, T_INC, T_OOS = 500, 2250, 250, 250
sigma = 1/np.sqrt(252)
sharpe = lambda r: r.mean(-1)/r.std(-1)*np.sqrt(252)

r_is = np.random.randn(N, T_IS) * sigma
r_inc = np.random.randn(N, T_INC) * sigma
r_oos = np.random.randn(N, T_OOS) * sigma

# Honest: select on IS only
top_honest = np.argsort(sharpe(r_is))[-10:]
# Simulated: select on IS + INC jointly
top_sim = np.argsort(sharpe(np.c_[r_is, r_inc]))[-10:]

print("HONEST    | INC:", sharpe(r_inc[top_honest]).mean().round(2),
      "| OOS:", sharpe(r_oos[top_honest]).mean().round(2))
print("SIMULATED | INC:", sharpe(r_inc[top_sim]).mean().round(2),
      "| OOS:", sharpe(r_oos[top_sim]).mean().round(2))

The result

Averaged over 500 independent seeds:

Process	INC Sharpe (mean ± std)	True OOS Sharpe (mean ± std)
Honest	0.01 ± 0.32	0.01 ± 0.31
Simulated incubation	0.78 ± 0.29	0.01 ± 0.31

The simulated process exhibits an expected Sharpe of ~0.78 in the “incubation” window where α is identically zero by construction. In true OOS the two processes are indistinguishable. The signal produced by simulated incubation is pure selection artifact — not a measure of predictive power.
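The averaging step itself is not part of the listing. A minimal sketch of how it could be reproduced, assuming one fresh generator per seed (`np.random.default_rng` instead of the global seed). The function name `one_run`, the constant `TOP_K`, and the reduced seed count (100 rather than 500, to keep the run short) are illustrative choices, so the printed figures should land near the table without matching it exactly.

```python
import numpy as np

N, T_IS, T_INC, T_OOS, TOP_K = 500, 2250, 250, 250, 10
SIGMA = 1 / np.sqrt(252)

def sharpe(r):
    """Annualized Sharpe ratio along the last axis."""
    return r.mean(-1) / r.std(-1) * np.sqrt(252)

def one_run(seed):
    """Mean Sharpe of the selected strategies: honest INC/OOS, simulated INC/OOS."""
    rng = np.random.default_rng(seed)
    r_is = rng.standard_normal((N, T_IS)) * SIGMA
    r_inc = rng.standard_normal((N, T_INC)) * SIGMA
    r_oos = rng.standard_normal((N, T_OOS)) * SIGMA
    # Honest: rank on the in-sample window only
    top_honest = np.argsort(sharpe(r_is))[-TOP_K:]
    # Simulated incubation: rank on in-sample + incubation jointly
    r_joint = np.concatenate([r_is, r_inc], axis=1)
    top_sim = np.argsort(sharpe(r_joint))[-TOP_K:]
    return (sharpe(r_inc[top_honest]).mean(), sharpe(r_oos[top_honest]).mean(),
            sharpe(r_inc[top_sim]).mean(), sharpe(r_oos[top_sim]).mean())

N_SEEDS = 100  # assumption: fewer seeds than the 500 behind the table, for speed
results = np.array([one_run(seed) for seed in range(N_SEEDS)])
mean, std = results.mean(0), results.std(0)

labels = ["HONEST INC", "HONEST OOS", "SIMULATED INC", "SIMULATED OOS"]
for i, name in enumerate(labels):
    print(f"{name:14s} {mean[i]:+.2f} ± {std[i]:.2f}")
```

Only the simulated-incubation INC column should sit clearly away from zero; the other three should hover around it.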

The figure below shows the full distribution:

[Figure: Monte Carlo: simulated incubation vs true OOS]

The left panel shows the distribution of incubation Sharpe under the two processes. The right panel shows the same quantity over a genuine OOS window. In the left panel the two distributions are visibly separated; in the right panel they are indistinguishable. Simulated incubation shifts the distribution’s mass only on the observed variable.

Why this should disturb you

Three implications worth sitting with.

First, the magnitude. A Sharpe of 0.78 is not a marginal signal. It is the kind of result that would convince an allocator, a risk committee, or your own past self. And it is generated by pure noise.

Second, the asymmetry of error. The honest process has unbiased estimates in both windows. The simulated process is unbiased only in true OOS — and you will never observe true OOS until after you’ve committed real capital. By the time the truth arrives, the strategy is in production.

Third, the simulation is conservative. I assumed the only contamination was joint optimization on IS + INC. Real simulated incubation also carries the four channels of Part 1 (implicit look-ahead, publication selection, retroactively informed design, absence of commitment), which compound the bias. For a comparable number of candidates and the same selection rule, the 0.78 figure is a lower bound on what you should expect to see in practice.
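A back-of-the-envelope check on where a number near 0.78 comes from, and on how it moves with the setup. This is a rough approximation under stated assumptions, not an exact result: the annualized Sharpe estimate over T days of pure noise is treated as normal with standard deviation sqrt(252/T), the average of the top k out of N draws is approximated by the mean of the upper k/N tail of a normal, and noise in the sample standard deviation is ignored. Under those assumptions the expected incubation Sharpe of the selected strategies is roughly their expected Sharpe on the combined selection window. The helper name `approx_selected_sharpe` is hypothetical.

```python
from math import sqrt
from statistics import NormalDist

def approx_selected_sharpe(n_strategies, top_k, n_days, periods_per_year=252):
    """Approximate mean annualized Sharpe of the top_k of n_strategies noise series."""
    norm = NormalDist()
    tail = top_k / n_strategies              # fraction of candidates selected
    z_cut = norm.inv_cdf(1 - tail)           # selection cutoff in standard units
    tail_mean = norm.pdf(z_cut) / tail       # E[Z | Z > z_cut] for a standard normal
    sharpe_std = sqrt(periods_per_year / n_days)  # std of the Sharpe estimate under zero alpha
    return tail_mean * sharpe_std

# Setup from this post: top 10 of 500, selected on 2250 + 250 = 2500 days
print(round(approx_selected_sharpe(500, 10, 2500), 2))
# Same rule with ten times as many candidates
print(round(approx_selected_sharpe(5000, 10, 2500), 2))
```

Widening the candidate pool or shortening the selection window raises the figure, which is one way to read the point about real-world procedures compounding the bias.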

Connections to the literature

The phenomenon is not new; rigorous treatments are scattered across the literature. Four references I consider essential:

López de Prado, Advances in Financial Machine Learning (2018), Chapters 11–13. Formalizes backtest overfitting and introduces Combinatorial Purged Cross-Validation as an alternative to standard walk-forward when one wants to use the full dataset. The Ch. 11 discussion of “pseudo-OOS” is the most direct commentary on what I call simulated incubation.

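Bailey & López de Prado, “The Deflated Sharpe Ratio” (JPM 2014). Provides the formula to correct the observed Sharpe ratio for the number of trials and for return non-normality:

$$\mathrm{DSR} = \Phi\left( \frac{(\widehat{SR} - SR_0)\sqrt{T - 1}}{\sqrt{1 - \hat{\gamma}_3 \widehat{SR} + \frac{\hat{\gamma}_4 - 1}{4}\widehat{SR}^2}} \right)$$

where $\widehat{SR}$ is the observed (non-annualized) Sharpe ratio over $T$ observations, $\hat{\gamma}_3$ and $\hat{\gamma}_4$ are the skewness and kurtosis of returns, and the benchmark $SR_0$, the expected maximum Sharpe under zero true skill, grows with the number of trials $N$. A strategy “passing” simulated incubation with Sharpe 1.0 after 50 trials often has an observed Sharpe below $SR_0$, so the numerator is negative and the Deflated Sharpe falls below 0.5.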
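A minimal sketch of the Deflated Sharpe computation as I read it in Bailey & López de Prado: the observed per-period Sharpe is compared with the expected maximum Sharpe across the trials under zero skill, then scaled by sample length and by a skewness and kurtosis adjustment. The function names and the example inputs are hypothetical. In particular, the dispersion of trial Sharpes is assumed to be that of pure-noise strategies on the same 250 days.

```python
from math import e, sqrt
from statistics import NormalDist

EULER_GAMMA = 0.5772156649015329
_norm = NormalDist()

def expected_max_sharpe(n_trials, sr_std):
    """Approximate expected maximum Sharpe over n_trials (> 1) with zero true Sharpe."""
    return sr_std * ((1 - EULER_GAMMA) * _norm.inv_cdf(1 - 1 / n_trials)
                     + EULER_GAMMA * _norm.inv_cdf(1 - 1 / (n_trials * e)))

def deflated_sharpe(sr, sr0, n_obs, skew=0.0, kurt=3.0):
    """Probability that the true Sharpe exceeds sr0; sr and sr0 are per-period."""
    denom = sqrt(1 - skew * sr + (kurt - 1) / 4 * sr ** 2)
    return _norm.cdf((sr - sr0) * sqrt(n_obs - 1) / denom)

# Hypothetical example: annualized Sharpe 1.0 on 250 daily returns, 50 trials
n_obs, n_trials = 250, 50
sr_daily = 1.0 / sqrt(252)            # convert to a per-day Sharpe
sr_std = 1 / sqrt(n_obs)              # assumption: trial Sharpes are pure noise
sr0 = expected_max_sharpe(n_trials, sr_std)
dsr = deflated_sharpe(sr_daily, sr0, n_obs)
print(f"SR0 (annualized): {sr0 * sqrt(252):.2f}, DSR: {dsr:.2f}")
```

With these inputs the benchmark exceeds the observed Sharpe, so the result falls below 0.5, far from the confidence levels usually asked of a strategy.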

Bailey, Borwein, López de Prado & Zhu, “The Probability of Backtest Overfitting” (Journal of Computational Finance 2017). Introduces PBO, an estimate (obtained by combinatorially symmetric cross-validation) of the probability that the strategy ranked first in-sample falls below the median out-of-sample. Directly applicable to diagnosing selection risk in simulated-incubation procedures.
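A minimal sketch of a PBO estimate in the spirit of that paper's combinatorially symmetric cross-validation: split the return history into blocks, use every half of the blocks as in-sample and the complement as out-of-sample, and count how often the in-sample winner lands below the out-of-sample median. The function name `pbo_cscv`, the block count, and the use of a plain per-period Sharpe as the performance metric are assumptions made for illustration, and the input here is pure noise.

```python
from itertools import combinations
import numpy as np

def _sharpe(r):
    """Per-period Sharpe of each column (one column per strategy)."""
    return r.mean(0) / r.std(0)

def pbo_cscv(returns, n_blocks=8):
    """Estimate PBO from a (T, N) return matrix; n_blocks should be even."""
    n_obs, n_strats = returns.shape
    blocks = np.array_split(np.arange(n_obs), n_blocks)
    logits = []
    for is_ids in combinations(range(n_blocks), n_blocks // 2):
        oos_ids = [b for b in range(n_blocks) if b not in is_ids]
        is_idx = np.concatenate([blocks[b] for b in is_ids])
        oos_idx = np.concatenate([blocks[b] for b in oos_ids])
        best = np.argmax(_sharpe(returns[is_idx]))       # in-sample winner
        sr_oos = _sharpe(returns[oos_idx])
        rank = (sr_oos <= sr_oos[best]).sum()             # 1 = worst OOS, N = best OOS
        omega = rank / (n_strats + 1)                     # relative rank in (0, 1)
        logits.append(np.log(omega / (1 - omega)))
    # Share of splits where the in-sample winner is below the OOS median
    return float(np.mean(np.array(logits) < 0))

rng = np.random.default_rng(0)
noise = rng.standard_normal((1000, 50))   # 50 zero-alpha strategies, 1000 days
print("PBO on pure noise:", round(pbo_cscv(noise), 2))
```

On zero-alpha inputs the estimate should scatter around one half, while a strategy with a genuine and large edge pushes it toward zero.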

Harvey, Liu & Zhu, “…and the Cross-Section of Expected Returns” (RFS 2016). Applies multiple-testing-corrected significance thresholds to factors published in the literature. The message extends naturally to private backtesting: when trial count is large, the evidence threshold to declare alpha is much higher than the conventional $t > 2$.
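To give a sense of how quickly the bar rises with trial count, here is a small sketch using a Bonferroni adjustment, the simplest correction of that family. It assumes a two-sided 5% level and a normal approximation to the t-statistic. The function name `bonferroni_t_threshold` is hypothetical, and the paper's own thresholds come from a richer analysis than this.

```python
from statistics import NormalDist

def bonferroni_t_threshold(n_trials, alpha=0.05):
    """Two-sided normal critical value after dividing alpha by the number of trials."""
    return NormalDist().inv_cdf(1 - alpha / (2 * n_trials))

for n_trials in (1, 10, 100, 1000):
    print(n_trials, round(bonferroni_t_threshold(n_trials), 2))
```

A single test recovers the familiar value near 2, and the threshold climbs steadily as the number of trials grows.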

Honorable mention: Leinweber, “Stupid Data Miner Tricks” (2007) — the most entertaining paper ever written on financial data mining, and a useful reminder that these traps have existed as long as the industry has.

Up next — Part 3: Five Defensible Alternatives. If simulated incubation fails this badly, what do you actually do when real incubation isn’t feasible? I cover walk-forward with pre-registration, Combinatorial Purged CV, Deflated Sharpe, PBO, and the underrated option of “short but real.”