The replay was CPU-bound and single-core: the earlier asyncio.to_thread offload
kept the API responsive but, because of the GIL, ran on one core. Per-ticker
replay is independent, so fan it out across worker processes (which sidestep the
GIL) for real multi-core speedup.
- New `settings.backtest_workers` (default 4), capped to cpu_count-1 so a core
stays free for the web server.
- Uses a `forkserver` context (workers forked from a clean single-threaded
server — avoids the fork-with-threads deadlock); falls back to `fork`. On
spawn-only platforms (Windows) and for 1-ticker runs it uses the thread path,
so dev/tests are unaffected.
- Worker takes primitive column arrays (cheap to pickle), rebuilds bars, and
returns (candidates, plain-dict signal series) — both picklable across the
process boundary. Bars are still fetched in the event loop (ORM-safe).
- Pool creation is guarded: if the pool can't start, the job falls back to the
sequential thread path instead of failing.
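
A minimal sketch of the worker-count cap, start-method selection, and guarded pool described above, using only the standard library. `resolve_worker_count`, `pick_context`, `replay_ticker`, and `run_replay` are hypothetical names chosen for illustration, the worker body is a placeholder, and the fallback here is a plain sequential loop rather than the project's thread path.

```python
import multiprocessing as mp
import os
from concurrent.futures import ProcessPoolExecutor
from concurrent.futures.process import BrokenProcessPool


def resolve_worker_count(configured, n_tickers, cpus=None):
    """Cap the configured count at cpu_count - 1 and at the ticker count."""
    if cpus is None:
        cpus = os.cpu_count() or 1
    return max(1, min(configured, cpus - 1, n_tickers))


def pick_context():
    """Prefer forkserver, then fork; None signals a spawn-only platform."""
    available = mp.get_all_start_methods()
    for method in ("forkserver", "fork"):
        if method in available:
            return mp.get_context(method)
    return None


def replay_ticker(ticker, closes):
    """Placeholder worker: primitive arrays in, plain picklable values out."""
    return ticker, sum(closes) / len(closes)


def run_replay(columns, configured_workers=4):
    """columns maps ticker -> list of closes; returns ticker -> result."""
    workers = resolve_worker_count(configured_workers, len(columns))
    ctx = pick_context()
    if ctx is not None and workers > 1:
        try:
            with ProcessPoolExecutor(max_workers=workers, mp_context=ctx) as pool:
                futures = [pool.submit(replay_ticker, t, c) for t, c in columns.items()]
                return dict(f.result() for f in futures)
        except (OSError, BrokenProcessPool):
            pass  # pool could not start or died: use the sequential path
    return dict(replay_ticker(t, c) for t, c in columns.items())
```

When calling `run_replay` from a script, keep the call under `if __name__ == "__main__":` so the forkserver can import the module without re-running it.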
334 backend tests pass (parallel path is POSIX/server-only, so it's covered by
construction + the picklability/worker-count tests; the thread fallback is
exercised by the run_backtest smoke test).
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
The 5-year backtest confirmed the EV gate adds negative value (high threshold =
worst expectancy) and that 12-1 month momentum is the one price signal with a
plausible, right-signed cross-sectional IC (~0.05). So "qualified" now means:
clears the R:R + confidence floors AND the ticker's 12-1 momentum percentile
within the universe that week is at or above `min_momentum_percentile`.
- qualification.py: drop expected_value_r / the EV gate; add a momentum-percentile
gate (duck-typed `momentum_percentile`, only enforced when attached + threshold
set, else defers to floors). Mirrored in frontend qualification.ts.
- activation config/schema: min_expected_value -> min_momentum_percentile
(default 80 = top quintile). ActivationSettings, DashboardPage (ranks/shows
momentum instead of EV), and the BacktestPanel sweep follow.
- backtest: rank each ISO week's universe by 12-1 momentum, assign a percentile,
and qualify the top slice; the sweep now sweeps the percentile cutoff.
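
The ranking and gate can be sketched roughly as follows. This is an illustration under assumptions, not the code in qualification.py: `momentum_percentiles` and `is_qualified` are hypothetical names, the percentile convention (0 for the weakest ticker, 100 for the strongest) is one of several reasonable choices, and the floor defaults are placeholders.

```python
from types import SimpleNamespace


def momentum_percentiles(momentum_by_ticker):
    """Map each ticker to a 0-100 percentile within one week's universe."""
    ranked = sorted(momentum_by_ticker, key=momentum_by_ticker.get)
    n = len(ranked)
    if n == 1:
        return {ranked[0]: 100.0}
    return {t: 100.0 * i / (n - 1) for i, t in enumerate(ranked)}


def is_qualified(setup, min_rr=1.2, min_confidence=55, min_momentum_percentile=80):
    """Floors first; the momentum gate applies only when it can be evaluated."""
    if setup.rr < min_rr or setup.confidence < min_confidence:
        return False
    pct = getattr(setup, "momentum_percentile", None)  # duck-typed attribute
    if pct is None or min_momentum_percentile is None:
        return True  # nothing attached or no threshold: defer to the floors
    return pct >= min_momentum_percentile


week = {"A": 0.10, "B": 0.50, "C": -0.20, "D": 0.30, "E": 0.00}
pcts = momentum_percentiles(week)
setup_b = SimpleNamespace(rr=1.5, confidence=60, momentum_percentile=pcts["B"])
setup_d = SimpleNamespace(rr=1.5, confidence=60, momentum_percentile=pcts["D"])
```

With five tickers and a cutoff of 80, only `setup_b` (the strongest momentum) passes; `setup_d` sits at the 75th percentile and is rejected.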
Also offload the backtest's per-ticker compute to a worker thread so the heavy
~5y run no longer blocks the API event loop (the "backend offline" flicker).
Production setups don't carry momentum_percentile yet; wiring the scanner to
attach it (a universe momentum-rank pass) is the next step. Until then the live
gate defers to floors while the backtest measures the momentum selection. 330
backend tests pass; frontend build clean.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Two changes so the cross-sectional signal results can actually be trusted.
(a) History depth — the binding constraint. Ingestion defaulted to 365 days, so
long-lookback factors (12-month momentum, 52-week high) were only computable on a
handful of weeks at the tail, and every IC reflected a single market regime.
- New `settings.ohlcv_history_days` (default 1825 ≈ 5y); new tickers backfill this
far instead of 1 year.
- New manual "data_backfill" job (Admin → Jobs) re-fetches the full window for
every ticker, ignoring incremental resume — run once to deepen existing
1-year histories. Idempotent (upsert); resumes after rate limits.
(b) Factor-IC honesty. The IC was averaged over weekly rebalances whose 30-day
forward windows overlap, inflating the t-stat ~sqrt(6)x.
- IC now measured on NON-OVERLAPPING windows (weeks thinned to ~HORIZON apart).
- Each signal carries a `reliable` flag (>= 12 independent windows); BacktestPanel
greys out and de-stars thin signals so a lucky 9-week IC of 0.3 can't masquerade
as an edge.
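
The thinning and the `reliable` flag can be sketched as below. The function names are hypothetical, and treating the horizon as 30 calendar days is an assumption; if the horizon is counted in trading days the spacing between kept weeks would be wider.

```python
from datetime import date, timedelta

HORIZON_DAYS = 30             # forward-return window (assumed calendar days)
MIN_INDEPENDENT_WINDOWS = 12  # threshold for the reliable flag


def thin_to_non_overlapping(as_of_dates, horizon_days=HORIZON_DAYS):
    """Keep only dates at least one horizon after the previously kept date."""
    kept = []
    for d in sorted(as_of_dates):
        if not kept or (d - kept[-1]).days >= horizon_days:
            kept.append(d)
    return kept


def is_reliable(kept_dates, minimum=MIN_INDEPENDENT_WINDOWS):
    """A signal is reliable only with enough independent windows behind it."""
    return len(kept_dates) >= minimum


weekly_52 = [date(2024, 1, 1) + timedelta(weeks=i) for i in range(52)]
weekly_60 = [date(2024, 1, 1) + timedelta(weeks=i) for i in range(60)]
kept_52 = thin_to_non_overlapping(weekly_52)
kept_60 = thin_to_non_overlapping(weekly_60)
```

Under these assumptions every fifth weekly date survives, so one year of weekly rebalances yields 11 independent windows (not reliable) and 60 weeks yields 12 (reliable).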
332 backend tests pass; frontend build clean. No migration (config + job + an
added JSON field on the cached backtest report).
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
The per-setup hit-rate report can't tell whether a signal predicts returns —
only how a target/stop structure built on one performs. This adds a
cross-sectional factor-IC pass: each week the universe is ranked by a price-only
signal and graded by its rank correlation (Spearman IC) and top-minus-bottom-
quintile spread against the forward 30-day return.
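
For one week's cross-section, the two grades could be computed along these lines. This is a dependency-free sketch with hypothetical function names, not the project's signal-eval code; ties receive average ranks and the quintile size is rounded down.

```python
from statistics import mean


def ranks(values):
    """1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=values.__getitem__)
    out = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            out[order[k]] = avg
        i = j + 1
    return out


def spearman_ic(signal, forward_return):
    """Pearson correlation of the two rank vectors; None if either is constant."""
    rs, rf = ranks(signal), ranks(forward_return)
    ms, mf = mean(rs), mean(rf)
    cov = sum((a - ms) * (b - mf) for a, b in zip(rs, rf))
    var_s = sum((a - ms) ** 2 for a in rs)
    var_f = sum((b - mf) ** 2 for b in rf)
    if var_s == 0 or var_f == 0:
        return None
    return cov / (var_s * var_f) ** 0.5


def quintile_spread(signal, forward_return):
    """Mean forward return of the top signal quintile minus the bottom one."""
    order = sorted(range(len(signal)), key=signal.__getitem__)
    k = len(order) // 5
    if k == 0:
        return None  # fewer than five names: no quintiles to compare
    bottom = mean(forward_return[i] for i in order[:k])
    top = mean(forward_return[i] for i in order[-k:])
    return top - bottom
```

Averaging the weekly IC values over time then gives the headline IC per signal.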
Candidate signals (point-in-time from price; sentiment/fundamentals have no
history in the replay): 12-1/6-1/3-1 month momentum, 1-month reversal,
price-vs-200d SMA, proximity to the 52-week high (George/Hwang), and 126-day
realized volatility (low-vol anomaly).
Reuses the existing per-ticker replay loop (no new data, no second DB pass);
results land in the cached backtest_report as `signal_eval` and render as a
"Signal edge" table in BacktestPanel beside the calibration curve.
330 backend tests pass (10 new in test_signal_eval); frontend build clean.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Diagnosing "no qualified signals for 5 days": setups were generated but none
qualified. The gate required BOTH a high min_rr (2.0) AND a high
min_target_probability (60), which became contradictory after the Jun-15
probability recalibration — probability already embeds R:R via the 1/(rr+1) ruin
term, so high-R:R targets are inherently low-probability and nothing cleared both.
Gate is now expected value (R): p*rr - (1-p) from the primary target's
probability. R:R and confidence stay as floors; high-conviction / exclude-conflicts
/ min-target-probability become optional tighteners (default off). Defaults:
min_expected_value=0.15, min_rr=1.2, min_confidence=55. EV is only enforced when
computable. Migration 009 clears stored activation_* rows so the new defaults
apply. Backtest sweeps min_expected_value instead of target probability.
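
A small worked sketch of the EV gate and of why the old pair of thresholds could not both be met. `expected_value_r` and `passes_ev_gate` are hypothetical names; probabilities are taken as fractions here, which is an assumption about units.

```python
def expected_value_r(p, rr):
    """Expected value in R: win p of the time for rr, lose 1R otherwise."""
    return p * rr - (1.0 - p)


def passes_ev_gate(p, rr, min_expected_value=0.15):
    """EV is only enforced when computable; otherwise the floors decide."""
    if p is None or rr is None:
        return True
    return expected_value_r(p, rr) >= min_expected_value


# A target priced at the 1/(rr+1) baseline has zero expected value:
# at rr = 2.0 that baseline is about 0.33, far below the old 0.60 requirement.
baseline_p = 1.0 / (2.0 + 1.0)
baseline_ev = expected_value_r(baseline_p, 2.0)

# Meeting both old thresholds (rr = 2.0, p = 0.60) would imply an EV of 0.8R.
old_gate_ev = expected_value_r(0.60, 2.0)
```

The new default of 0.15 asks for a modest edge over the baseline instead of that 0.8R bar.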
Scheduling: pipelines are now cron-configurable in Admin -> Jobs. daily_pipeline
(full, default 0 7 * * *) plus a new light intraday_pipeline (OHLCV + outcome eval,
default hourly US session) that keeps prices/live-R:R current without setup churn.
Fundamentals on its own early weekly cron. Timezone configurable (default
Europe/Berlin). Moving interval->CronTrigger also fixes the restart-deferral bug
where an interval job's countdown resets on every process restart.
319 backend unit tests pass; frontend tsc clean.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Re-applies the activation gate at several min_target_probability thresholds
(60→30, other conditions fixed) over the already-replayed candidates, so the
trade-off between how many setups qualify and their expectancy is visible in one
table — the cheap "optimize" half of Phase 2. Candidates now carry meets_core +
best_prob so the sweep needs no re-replay. New sweep table in BacktestPanel with
the current threshold starred.
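
Roughly, the sweep re-filters the cached candidates once per threshold. In this sketch `meets_core` and `best_prob` follow the text, while `realized_r`, the function name, and the specific threshold steps between 60 and 30 are assumptions made for illustration.

```python
def sweep_target_probability(candidates, thresholds=(60, 50, 40, 30)):
    """One row per threshold: how many setups qualify and their mean realized R."""
    rows = []
    for threshold in thresholds:
        picked = [
            c for c in candidates
            if c["meets_core"]
            and c["best_prob"] is not None
            and c["best_prob"] >= threshold
        ]
        n = len(picked)
        expectancy = sum(c["realized_r"] for c in picked) / n if n else None
        rows.append({"threshold": threshold, "qualified": n, "expectancy_r": expectancy})
    return rows


candidates = [
    {"meets_core": True, "best_prob": 62, "realized_r": 1.0},
    {"meets_core": True, "best_prob": 45, "realized_r": -1.0},
    {"meets_core": True, "best_prob": 33, "realized_r": 2.0},
    {"meets_core": False, "best_prob": 70, "realized_r": 2.0},
]
table = sweep_target_probability(candidates)
```

Because the other conditions are folded into `meets_core` at replay time, each row costs one pass over the list rather than a new replay.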
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Replays the price-derived engine over stored OHLCV: at each weekly as-of date,
rebuild the setup from bars <= D (no lookahead) and walk the actual forward bars
for the realized outcome. Reports realized hit-rate/expectancy of qualified
setups (and all setups, by direction) plus a probability calibration curve
(predicted target prob vs realized hit rate).
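
The calibration curve amounts to bucketing replayed setups by predicted probability and comparing each bucket's mean prediction with its realized hit rate. A sketch under assumptions: the function name, the 10-point bucket width, and the output field names are illustrative, and probabilities are taken as percentages.

```python
def calibration_curve(samples, bucket_width=10):
    """samples: iterable of (predicted_prob_pct, hit) pairs from the replay."""
    buckets = {}
    for prob, hit in samples:
        # Clamp so a prediction of exactly 100 lands in the top bucket.
        lo = min(int(prob // bucket_width) * bucket_width, 100 - bucket_width)
        n, hits, prob_sum = buckets.get(lo, (0, 0, 0.0))
        buckets[lo] = (n + 1, hits + int(hit), prob_sum + prob)
    return [
        {
            "bucket": f"{lo}-{lo + bucket_width}",
            "n": n,
            "mean_predicted": prob_sum / n,
            "realized_hit_rate": 100.0 * hits / n,
        }
        for lo, (n, hits, prob_sum) in sorted(buckets.items())
    ]


curve = calibration_curve(
    [(35, True), (38, False), (62, True), (65, True), (100, False)]
)
```

A well-calibrated engine shows `realized_hit_rate` close to `mean_predicted` in every bucket with a reasonable sample count.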
Reuses pure functions throughout; extracted compute_technical_from_arrays /
compute_momentum_from_closes from scoring_service so live and backtest stay in
sync. Runs as a weekly/triggerable 'backtest' job caching the report in a
SystemSetting; GET /backtest/report serves it. Sentiment/fundamentals held
neutral (no point-in-time history) — calibrates the price/S-R/probability machinery.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>