fix probability over-confidence: model target-before-stop, not just touch

Backtest (32k setups) showed the touch-only probability model was ~2x
over-confident — predicted 70% hit 39%, predicted 88% hit 46% — because it
ignored the competing stop. estimate_probability now multiplies the reach
probability (touch within horizon) by the two-barrier gambler's-ruin ratio
1/(R:R+1) = P(target before stop). A 3:1 setup is now capped at ~25% base
(0.25 times its reach probability) rather than reading ~70%, which lines up
with realized rates. Strength/alignment modulation unchanged.
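
A minimal sketch of the arithmetic described above, for checking numbers by
hand. It is not the project's estimate_probability: norm_cdf and
base_probability are hypothetical names, the 30-day horizon is taken from the
"5.48 ATR over 30d" comment in the diff, and strength/alignment modulation is
left out.

```python
import math

HORIZON_DAYS = 30  # assumption: mirrors the 30d horizon mentioned in the diff


def norm_cdf(x: float) -> float:
    """Standard normal CDF, written with the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def base_probability(distance_atr: float, rr: float,
                     horizon_days: int = HORIZON_DAYS) -> float:
    """Return reach * ruin as a fraction in [0, 1] (no modulation applied)."""
    z = distance_atr / math.sqrt(horizon_days)
    # Reflection principle: P(driftless walk touches the target within horizon).
    reach = 2.0 * (1.0 - norm_cdf(z))
    # Two-barrier gambler's ruin: P(target before stop) = 1 / (R:R + 1).
    # With no usable R:R, fall back to an even race, as the commit does.
    ruin = 1.0 / (rr + 1.0) if rr > 0 else 0.5
    return reach * ruin


if __name__ == "__main__":
    # Illustrative inputs only: (target distance in ATR multiples, R:R).
    for distance_atr, rr in [(2.0, 3.0), (2.0, 1.0), (5.0, 3.0)]:
        print(distance_atr, rr, round(base_probability(distance_atr, rr), 3))
```

Under these assumptions a target 2 ATR away has a reach probability near 70%,
so at 3:1 the base lands below 25% once the 0.25 race factor is applied.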

Recalibrates every probability and the EV ranking; the min_target_probability
gate threshold now means roughly what it says. Re-run the backtest to confirm
the calibration table flattens toward the diagonal.
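
One way the calibration check could be sketched, assuming the backtest yields
a predicted probability and a hit/miss flag per setup. calibration_table is a
hypothetical helper, not part of the repository, and the sample values below
are made up for illustration, not backtest results.

```python
from collections import defaultdict


def calibration_table(predicted, hit, n_buckets=10):
    """Group predictions into equal-width buckets.

    Returns rows of (bucket_start, count, mean_predicted, realized_rate).
    A well-calibrated model has mean_predicted close to realized_rate
    in every row, i.e. the rows sit near the diagonal.
    """
    buckets = defaultdict(list)
    for p, h in zip(predicted, hit):
        # Clamp so that p == 1.0 falls into the last bucket.
        idx = min(int(p * n_buckets), n_buckets - 1)
        buckets[idx].append((p, h))

    rows = []
    for idx in sorted(buckets):
        pairs = buckets[idx]
        count = len(pairs)
        mean_predicted = sum(p for p, _ in pairs) / count
        realized_rate = sum(1 for _, h in pairs if h) / count
        rows.append((idx / n_buckets, count, mean_predicted, realized_rate))
    return rows


if __name__ == "__main__":
    # Made-up sample: predicted fractions and whether the target was hit.
    predicted = [0.72, 0.78, 0.71, 0.75, 0.22, 0.28]
    hit = [True, False, False, False, False, True]
    for row in calibration_table(predicted, hit):
        print(row)
```

A large gap between the last two columns in a bucket is the kind of
over-confidence the commit message describes.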

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-15 20:52:09 +02:00
parent b00e482258
commit 9d2e1e74bf
2 changed files with 23 additions and 12 deletions
+19 -9
@@ -357,22 +357,32 @@ class ProbabilityEstimator:
         direction: str,
         config: dict[str, float],
     ) -> float:
-        """Probability the target is reached within the outcome horizon.
+        """Probability the target is hit BEFORE the stop, within the horizon.
-        Base = probability of price *touching* a level at the target's distance
-        within the evaluation window, under a driftless random walk (reflection
-        principle): 2·(1 − Φ(d / (ATR·√T))). Distance is in ATR multiples and T
-        is the horizon in trading days, so a far target is inherently unlikely —
-        no more 90% on a +39% move. Strength and signal alignment (drift toward
-        the target) then modulate it modestly.
+        Two factors (backtest-calibrated 2026-06-15 — the old touch-only model
+        was ~2× over-confident because it ignored the competing stop):
+          reach = P(price touches the target within T) — driftless random walk,
+                  reflection principle: 2·(1 − Φ(d / (ATR·√T))). Falls with
+                  distance, so a far target is inherently unlikely.
+          ruin  = P(target before stop | both reachable) — the two-barrier
+                  gambler's-ruin ratio stop/(target+stop) = 1/(R:R + 1). A 3:1
+                  setup wins the race ~25% of the time, not ~70%.
+        base = reach · ruin. Strength and signal alignment (drift toward target)
+        then modulate it.
         """
         strength = float(target.get("sr_strength", 50.0))
         atr_multiple = float(target.get("distance_atr_multiple", 1.0))
+        rr = float(target.get("rr_ratio", 0.0))
         expected_move_atr = math.sqrt(_TARGET_HORIZON_DAYS)  # ≈ 5.48 ATR over 30d
         z = atr_multiple / expected_move_atr if expected_move_atr > 0 else 99.0
-        touch_prob = 2.0 * (1.0 - _norm_cdf(z))  # 0..1
-        probability = touch_prob * 100.0
+        reach = 2.0 * (1.0 - _norm_cdf(z))  # 0..1, P(touch target in horizon)
+        # P(target before stop): stop distance / (target + stop) = 1/(rr+1).
+        # Without a known rr (e.g. isolated probability checks), assume an even race.
+        ruin = 1.0 / (rr + 1.0) if rr > 0 else 0.5
+        probability = reach * ruin * 100.0
         technical = float(dimension_scores.get("technical", 50.0))
         momentum = float(dimension_scores.get("momentum", 50.0))