The research platform.
Standards that rely on willpower do not survive contact with a promising result. Ours are built into the platform. The software is written so that the careful path is the one it makes easiest to take, and so that any number it produces can be traced back to the exact work that made it. A number that cannot be traced that way is treated as no result at all.
Correctness first, and correctness that lasts.
Underneath is one system rather than a drawer of scripts that happen to agree. It rests on a handful of invariants, each enforced at the level where it cannot be worked around, and each covered by a test that breaks the build the moment it slips.
Reproducible by construction
Every result is bound to the exact inputs that produced it: the configuration, the code revision, the data snapshot, the random seed, and a fingerprint of the environment. Artifacts are content-addressed, so a result cannot be edited after the fact or mistaken for another. A result without this provenance is not counted as one.
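As a rough illustration of what binding a result to its inputs can look like, the sketch below derives a content address from a provenance record plus the result bytes. The Provenance fields, the content_address helper, and the choice of SHA-256 are assumptions made for this example, not a description of the platform's actual format.

```python
import hashlib
import json
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class Provenance:
    """Hypothetical provenance record; field names are illustrative only."""
    config_hash: str
    code_revision: str
    data_snapshot: str
    seed: int
    env_fingerprint: str


def content_address(provenance: Provenance, payload: bytes) -> str:
    """Return a SHA-256 address covering both the provenance and the result bytes."""
    # Canonical JSON (sorted keys, fixed separators) so equal records hash equally.
    header = json.dumps(
        asdict(provenance), sort_keys=True, separators=(",", ":")
    ).encode("utf-8")
    digest = hashlib.sha256()
    # Length prefix keeps the header/payload boundary unambiguous.
    digest.update(len(header).to_bytes(8, "big"))
    digest.update(header)
    digest.update(payload)
    return digest.hexdigest()
```

Under this scheme, changing either the result bytes or any provenance field yields a different address, so an edited artifact can no longer be found under its old name.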
Deterministic end to end
The same inputs return the same outputs across runs and machines, down to the bit. Every source of randomness draws from one recorded seed. We test for this directly instead of assuming it, and a run that will not reproduce is filed as a defect, never as a finding.
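One way to route every source of randomness through a single recorded seed is to derive a named stream per consumer from that seed. The sketch below uses only the Python standard library; the stream and run helpers are hypothetical names, and the approach assumes a fixed Python version, since library algorithms such as shuffling may differ between versions.

```python
import hashlib
import random


def stream(root_seed: int, name: str) -> random.Random:
    """Derive an independent, named generator from the one recorded seed."""
    material = f"{root_seed}:{name}".encode("utf-8")
    derived = int.from_bytes(hashlib.sha256(material).digest()[:8], "big")
    return random.Random(derived)


def run(root_seed: int) -> str:
    """Toy 'experiment': returns a digest of everything it produced."""
    shuffle_rng = stream(root_seed, "shuffle")
    noise_rng = stream(root_seed, "noise")
    order = list(range(10))
    shuffle_rng.shuffle(order)
    noise = [noise_rng.random() for _ in range(5)]
    # repr() of a float round-trips exactly, so the digest reflects every bit.
    blob = repr((order, noise)).encode("utf-8")
    return hashlib.sha256(blob).hexdigest()


def check_reproducible(root_seed: int) -> None:
    """A direct test of determinism: two runs must produce the same digest."""
    if run(root_seed) != run(root_seed):
        raise AssertionError("run did not reproduce; treat as a defect")
```

Comparing digests rather than summary statistics keeps the check strict: any difference in any output, however small, changes the digest.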
Causal by architecture
The data layer is point-in-time: the information available at a moment is the information that existed then, and nothing from the future can reach back. A decision made on one interval can only act on the next. The no-lookahead rule is enforced in the code paths themselves, guarded by tests that fail the moment the boundary is crossed.
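The two rules in this paragraph can be sketched in a few lines. The example below is a minimal illustration under assumed names (PointInTimeSeries, positions_from_signals) and integer timestamps; it is not the platform's data layer.

```python
import bisect
from dataclasses import dataclass, field


@dataclass
class PointInTimeSeries:
    """Values indexed by the time they became known (hypothetical sketch)."""
    _times: list = field(default_factory=list)
    _values: list = field(default_factory=list)

    def record(self, known_at: int, value: float) -> None:
        if self._times and known_at < self._times[-1]:
            raise ValueError("records must arrive in time order")
        self._times.append(known_at)
        self._values.append(value)

    def as_of(self, t: int):
        """Latest value known at or before t; None if nothing was known yet."""
        i = bisect.bisect_right(self._times, t)
        return self._values[i - 1] if i else None


def positions_from_signals(signals):
    """The decision formed on interval t is held during interval t + 1."""
    if not signals:
        return []
    # Nothing is held on the first interval: no decision precedes it.
    return [0.0] + list(signals[:-1])
```

A simple no-lookahead test follows from this shape: changing a later signal must leave every earlier position unchanged.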
One evaluation engine
Strategies are scored by a single engine, the same way, every time. There is no per-strategy scoring code and no path by which a researcher can shape how their own work is judged. Execution is modelled as it happens — costs, slippage, position and risk limits, session boundaries — so nothing survives on rounded-away frictions.
Two implementations, one answer
A fast, vectorized, hardware-accelerated evaluator runs next to a slower reference whose whole purpose is to be plainly correct. Tests hold the two to bit-identical agreement on every commit. The check is keyed to a fingerprint of the hardware and libraries, so the agreement has to be re-established whenever the environment changes.
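The shape of such an agreement test can be shown with a toy example. Here a plain loop stands in for the reference and itertools.accumulate stands in for the fast path; both function names are hypothetical, and a real accelerated evaluator would be far more involved. The point is the comparison, which is made on raw bytes, not on approximate equality.

```python
import itertools
import struct


def reference_equity(returns):
    """Slow, obviously-correct cumulative sum."""
    total = 0.0
    curve = []
    for r in returns:
        total = total + r
        curve.append(total)
    return curve


def fast_equity(returns):
    """Stand-in for the fast path; same left-to-right additions as the reference."""
    return list(itertools.accumulate(returns, initial=0.0))[1:]


def bits(values):
    """Raw IEEE-754 bytes, so even 0.0 versus -0.0 counts as a difference."""
    return struct.pack(f"<{len(values)}d", *values)


def assert_bit_identical(returns):
    ref, fast = reference_equity(returns), fast_equity(returns)
    if bits(ref) != bits(fast):
        raise AssertionError("fast evaluator diverged from the reference")
```

Comparing packed bytes is stricter than comparing floats with ==, which would treat 0.0 and -0.0 as equal.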
Same inputs, same outputs, every run.
No computation is allowed to see the future.
A decision can act only on the interval after it.
One scorer, with no per-strategy exceptions.
The reported window plays no part in any selection.
A gate that cannot confirm integrity halts the run.
The hard part is measurement, not modeling.
Most strategies that look good in research fail in production for procedural reasons: a number chosen by looking at it, a boundary crossed without anyone noticing, a wide search whose breadth was never accounted for. The platform's main job is to put those mistakes out of easy reach.
Integrity gates that fail closed
Automated gates guard the boundaries that matter: data hygiene, temporal ordering, and the separation between research data and held-out data. When a gate cannot confirm that a boundary held, the run stops. Nothing continues on a warning, and there is no quick override waiting for whoever is impatient to move on.
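Failing closed can be sketched as a gate runner that proceeds only on an explicit confirmation. In the example below, GateError, run_gates, and the two sample checks are hypothetical names chosen for illustration; the essential detail is that an exception or an ambiguous answer is treated the same as a failure.

```python
class GateError(RuntimeError):
    """Raised to halt a run when a gate cannot confirm its boundary."""


def check_temporal_order(timestamps):
    """True only if timestamps are strictly increasing."""
    return all(a < b for a, b in zip(timestamps, timestamps[1:]))


def check_disjoint(research_ids, held_out_ids):
    """True only if research and held-out data share no records."""
    return set(research_ids).isdisjoint(held_out_ids)


def run_gates(gates):
    """gates is a list of (name, zero-argument check) pairs."""
    for name, check in gates:
        try:
            ok = check()
        except Exception as exc:
            # A gate that cannot be evaluated has not confirmed anything.
            raise GateError(f"gate {name!r} could not be evaluated") from exc
        # Only an explicit True passes; None or any other value halts the run.
        if ok is not True:
            raise GateError(f"gate {name!r} did not confirm its boundary")
```

A run would call run_gates before doing any work, for example with a list such as [("order", lambda: check_temporal_order(ts)), ("split", lambda: check_disjoint(a, b))], where ts, a, and b are whatever the run has loaded.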
Selection separated from reporting
Every choice that could flatter a result is made on data the final report never sees. What advances, which checkpoint is kept, which thresholds are used — all of it is decided on held-out selection data. Evaluations are pre-registered, repeated across independent seeds, and discounted for the breadth of the search behind them. A wider search clears a higher bar.
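The text does not say how the discount for search breadth is computed. As one simple stand-in, the sketch below applies a Bonferroni-style correction: the significance level is divided by the number of candidates tried, which raises the z-score a result must clear. The function name required_z and the one-sided test are assumptions for illustration.

```python
from statistics import NormalDist


def required_z(alpha: float, trials: int) -> float:
    """One-sided z-score needed when `trials` candidates were searched.

    Bonferroni-style: each candidate is tested at level alpha / trials.
    """
    if trials < 1:
        raise ValueError("trials must be at least 1")
    if not 0.0 < alpha < 1.0:
        raise ValueError("alpha must lie strictly between 0 and 1")
    return NormalDist().inv_cdf(1.0 - alpha / trials)
```

With alpha fixed, the threshold grows as the number of trials grows, which is the behaviour the paragraph describes: a wider search has to clear a higher bar.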
Walk-forward, then forward in time
The same machinery that evaluates a strategy on history runs it forward, unattended, at realistic costs. When a candidate crosses from research into live simulation, the measurement stays the same and only the calendar moves. Each step toward capital adds scrutiny.
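For readers unfamiliar with the term, walk-forward evaluation fits on one window of history and scores on the window immediately after it, then rolls both forward. The sketch below generates such windows over integer interval indices; the function name and the fixed-length rolling scheme are assumptions, and other schemes (such as an expanding training window) are equally common.

```python
def walk_forward_splits(n: int, train: int, test: int):
    """Return (train_range, test_range) pairs over intervals 0..n-1.

    Each test window starts exactly where its training window ends, and
    the whole frame advances by `test` intervals per step.
    """
    if train < 1 or test < 1:
        raise ValueError("train and test lengths must be positive")
    splits = []
    start = 0
    while start + train + test <= n:
        train_range = range(start, start + train)
        test_range = range(start + train, start + train + test)
        splits.append((train_range, test_range))
        start += test
    return splits
```

Every test interval lies strictly after every interval in its own training window, so the scoring step never sees data used to fit it.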
The platform earns us one thing, and it is the only thing we built it for: the right to believe our own results.