Commit Graph

9 Commits

Author SHA1 Message Date
dennisthiessen 942a22ce65 feat: grade gate-ablation variants under the hold-to-horizon exit too
Deploy / lint (push) Successful in 6s
Deploy / test (push) Successful in 55s
Deploy / deploy (push) Successful in 33s
The ablation judged floors under the target/stop model, but the exit
sweeps point at replacing that exit with a fixed hold — under which the
R:R floor's rationale (bigger payoff at the target) may not apply. Each
ablation row now also carries hold_avg_r / hold_net_avg_r / hold_total_r
(30d hold, initial stop only), so the Phase 3 gate decision can be read
under the exit policy that would actually be used.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
2026-07-02 11:34:41 +02:00
dennisthiessen 8750aac6d9 fix: carry action/risk_level onto backtest candidates for the gate ablation
Deploy / lint (push) Successful in 6s
Deploy / test (push) Successful in 57s
Deploy / deploy (push) Successful in 2m24s
_window_setups computed them but _replay_ticker dropped them, so the
ablation's NEUTRAL/tightener checks saw None for every candidate and the
'without confidence floor' / 'without R:R floor' rows collapsed to 0
setups (impossible — removing a floor can only add setups). Regression
test now goes through the real _replay_ticker path.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
2026-07-02 08:07:27 +02:00
dennisthiessen 29b1a9a28c feat: net-of-cost backtest, gate ablation + time-exit sweeps, longer tails
Deploy / lint (push) Successful in 7s
Deploy / test (push) Successful in 57s
Deploy / deploy (push) Successful in 32s
Phase 1 of the strategy-measurement plan — report-only, no production
trading behavior changes:

- Cost haircut: every bucket/sweep now reports net_avg_r/net_total_r
  alongside gross (COST_PER_SIDE=0.1% of notional, converted to R via
  each setup's stop distance); params carry cost_per_side_pct.
- Gate ablation table: re-qualifies candidates at the current momentum
  cutoff with one floor removed per row (confidence / R:R / NEUTRAL /
  momentum-only) to show which floors earn their keep.
- Time-based exit sweep: hold 5/10/21/30 days with the initial ATR stop,
  exit at the day-N close — the classic momentum implementation, to
  disambiguate the wide-trailing result.
- TP sweep extended to +40/+50%, trailing to 25/30% so the optima are
  interior instead of starred at the sweep edge.
- BacktestPanel: Net Avg R columns everywhere, gate-ablation and
  time-exit tables, stars now mark best net avg R; stale cached reports
  still render (all new fields optional/guarded).

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
2026-07-02 07:50:37 +02:00
dennisthiessen ab9ce18809 feat: trailing-stop exit sweep in the backtest
Deploy / lint (push) Successful in 6s
Deploy / test (push) Successful in 55s
Deploy / deploy (push) Successful in 32s
Third exit model alongside target-vs-stop and the fixed take-profit. The TP sweep
showed the edge lives in the fat tail (avg R keeps rising as you let winners run),
but a fixed wide target is win-rate-brutal and gives everything back on a reversal.
A trailing stop harvests the tail while protecting gains.

Per setup the replay computes the realized R for several trail widths (3/5/7/10/
15/20%) in a single conservative pass — stop ratchets up via max(initial_stop,
peak*(1-trail)), exit on the pullback or at the horizon close, R vs the initial
risk. Aggregated into a trailing sweep (win rate = share closed in profit, avg R,
total R) over the qualified set and shown as a new table in the Backtest panel.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-30 17:33:17 +02:00
dennisthiessen c63951ca02 feat: take-profit exit sweep in the backtest (alongside target-vs-stop)
Deploy / lint (push) Successful in 7s
Deploy / test (push) Successful in 59s
Deploy / deploy (push) Successful in 34s
The target-vs-stop model counts a near-miss of a far S/R target as a full loss
and ignores the partial gains you actually bank — so it measures a different
strategy than "scalp the early pop, take +8%". Add a realistic take-profit exit
model next to it (original untouched).

Per setup the replay now also records risk%, whether the stop was hit, the
favourable excursion reachable before the stop (MFE), and the horizon-close move.
From those a fixed-take-profit sweep (4/6/8/10/12/15%) is scored in R: bank +X%
if reached before the stop, else -1R, else the horizon close. Hit rate = how
often +X% was banked (the MFE CDF), so you can pick the EV-optimal TP without
top-ticking fantasy. Shown as a new table in the Backtest panel; the IC,
calibration and momentum sweep are unchanged.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-30 16:56:32 +02:00
dennisthiessen ef523474ad replace EV activation gate with cross-sectional 12-1 momentum ranking
Deploy / lint (push) Successful in 7s
Deploy / test (push) Successful in 41s
Deploy / deploy (push) Successful in 26s
The 5-year backtest confirmed the EV gate adds negative value (high threshold =
worst expectancy) and that 12-1 month momentum is the one price signal with a
plausible, right-signed cross-sectional IC (~0.05). So "qualified" now means:
clears the R:R + confidence floors AND the ticker ranks in the top
`min_momentum_percentile` of the universe by 12-1 momentum that week.

- qualification.py: drop expected_value_r / the EV gate; add a momentum-percentile
  gate (duck-typed `momentum_percentile`, only enforced when attached + threshold
  set, else defers to floors). Mirrored in frontend qualification.ts.
- activation config/schema: min_expected_value -> min_momentum_percentile
  (default 80 = top quintile). ActivationSettings, DashboardPage (ranks/【shows】
  momentum instead of EV), and the BacktestPanel sweep follow.
- backtest: rank each ISO week's universe by 12-1 momentum, assign a percentile,
  and qualify the top slice; the sweep now sweeps the percentile cutoff.

Also offload the backtest's per-ticker compute to a worker thread so the heavy
~5y run no longer blocks the API event loop (the "backend offline" flicker).

Production setups don't carry momentum_percentile yet — wiring the scanner to
attach it (a universe momentum-rank step) is the next step; until then the live
gate defers to floors while the backtest measures the momentum selection. 330
backend tests pass; frontend build clean.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-23 22:42:24 +02:00
dennisthiessen c34f3cb1a4 redesign activation gate to expected value + make pipelines cron-configurable
Deploy / lint (push) Successful in 9s
Deploy / test (push) Successful in 46s
Deploy / deploy (push) Successful in 28s
Diagnosing "no qualified signals for 5 days": setups were generated but none
qualified. The gate required BOTH a high min_rr (2.0) AND a high
min_target_probability (60), which became contradictory after the Jun-15
probability recalibration — probability already embeds R:R via the 1/(rr+1) ruin
term, so high-R:R targets are inherently low-probability and nothing cleared both.

Gate is now expected value (R): p*rr - (1-p) from the primary target's
probability. R:R and confidence stay as floors; high-conviction / exclude-conflicts
/ min-target-probability become optional tighteners (default off). Defaults:
min_expected_value=0.15, min_rr=1.2, min_confidence=55. EV is only enforced when
computable. Migration 009 clears stored activation_* rows so the new defaults
apply. Backtest sweeps min_expected_value instead of target probability.

Scheduling: pipelines are now cron-configurable in Admin -> Jobs. daily_pipeline
(full, default 0 7 * * *) plus a new light intraday_pipeline (OHLCV + outcome eval,
default hourly US session) that keeps prices/live-R:R current without setup churn.
Fundamentals on its own early weekly cron. Timezone configurable (default
Europe/Berlin). Moving interval->CronTrigger also fixes the restart-deferral bug
where an interval job's countdown resets on every process restart.

319 backend unit tests pass; frontend tsc clean.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-23 14:46:38 +02:00
dennisthiessen 050abc6f71 backtest: add min target-probability sweep
Deploy / lint (push) Successful in 7s
Deploy / test (push) Successful in 40s
Deploy / deploy (push) Successful in 26s
Re-applies the activation gate at several min_target_probability thresholds
(60→30, other conditions fixed) over the already-replayed candidates, so the
trade-off between how many setups qualify and their expectancy is visible in one
table — the cheap "optimize" half of Phase 2. Candidates now carry meets_core +
best_prob so the sweep needs no re-replay. New sweep table in BacktestPanel with
the current threshold starred.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-16 06:13:30 +02:00
dennisthiessen 6df67ad7ae add backtest harness (Phase 1): historical replay + hit-rate & calibration reports
Deploy / lint (push) Successful in 6s
Deploy / test (push) Successful in 35s
Deploy / deploy (push) Successful in 25s
Replays the price-derived engine over stored OHLCV: at each weekly as-of date,
rebuild the setup from bars <= D (no lookahead) and walk the actual forward bars
for the realized outcome. Reports realized hit-rate/expectancy of qualified
setups (and all setups, by direction) plus a probability calibration curve
(predicted target prob vs realized hit rate).

Reuses pure functions throughout; extracted compute_technical_from_arrays /
compute_momentum_from_closes from scoring_service so live and backtest stay in
sync. Runs as a weekly/triggerable 'backtest' job caching the report in a
SystemSetting; GET /backtest/report serves it. Sentiment/fundamentals held
neutral (no point-in-time history) — calibrates the price/S-R/probability machinery.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-15 20:14:07 +02:00