Commit Graph

110 Commits

Author SHA1 Message Date
dennisthiessen 243e369e9a feat: robustness stats + dynamic recommendation; retire settled report sections
Deploy / lint (push) Successful in 6s
Deploy / test (push) Successful in 54s
Deploy / deploy (push) Successful in 32s
Robustness (answers 'is the edge just outliers?'):
- _bucket_stats gains median_net_r, profit_factor, and net_avg_r_ex_top5
  (expectancy with the top 5% of winners removed); shown as stat tiles.
- Portfolio sim gains per-calendar-year returns, shown in the sim table.

Dynamic recommendation ('What this backtest recommends' panel):
- _build_recommendation derives advice from the report's own numbers on
  every run — exit policy (target vs best hold, with sim CAGRs), which
  gate floors earn their keep (ablation Hold column), best momentum
  cutoff, book-vs-SPY verdict, and an outlier-dependence warning when
  the trimmed expectancy goes non-positive.

Retired (conclusions reached, tables removed from report + UI):
- Take-profit sweep (no interior optimum — fixed TP is the wrong tool
  for momentum), trailing sweep (converged to the hold-to-horizon exit),
  probability calibration (model is display-only by decision).
- _tp_primitives slimmed to _risk_and_stop_day; trailing machinery gone.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
2026-07-02 12:33:22 +02:00
dennisthiessen 0f43e755f4 feat: portfolio simulation + per-trade stats (gaps, hold time, best/worst)
Deploy / lint (push) Successful in 6s
Deploy / test (push) Successful in 55s
Deploy / deploy (push) Successful in 38s
Per-trade additions to the report:
- Gap-through-stop fills: stops now fill at the worse of the stop or the
  bar's open across every exit model (target, TP, trailing, time), so a
  loss can exceed -1R; targets never fill better than their level.
- best_r / worst_r, avg holding days, and net R per day of capital
  deployed on the summary buckets and the time-exit sweep.

Portfolio simulation (the stats a per-setup replay cannot give):
- One capital-constrained book over the qualified setups: 10k start, max
  10 concurrent positions (one per ticker, best momentum first), 1%
  fixed-fractional risk with a 20% no-leverage notional cap, entries at
  the detection close, 0.1%/side costs, daily mark-to-market.
- Two exit policies compared: S/R target race vs hold-to-horizon.
- Equity-curve stats: final equity, total return, CAGR, max drawdown,
  annualized daily Sharpe, win rate, avg P&L, best/worst trade, avg
  hold, entries skipped on a full book, and SPY price return over the
  same window (benchmark history refreshed to cover the replay span).

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
2026-07-02 11:56:29 +02:00
dennisthiessen 942a22ce65 feat: grade gate-ablation variants under the hold-to-horizon exit too
Deploy / lint (push) Successful in 6s
Deploy / test (push) Successful in 55s
Deploy / deploy (push) Successful in 33s
The ablation judged floors under the target/stop model, but the exit
sweeps point at replacing that exit with a fixed hold — under which the
R:R floor's rationale (bigger payoff at the target) may not apply. Each
ablation row now also carries hold_avg_r / hold_net_avg_r / hold_total_r
(30d hold, initial stop only), so the Phase 3 gate decision can be read
under the exit policy that would actually be used.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
2026-07-02 11:34:41 +02:00
dennisthiessen 8750aac6d9 fix: carry action/risk_level onto backtest candidates for the gate ablation
Deploy / lint (push) Successful in 6s
Deploy / test (push) Successful in 57s
Deploy / deploy (push) Successful in 2m24s
_window_setups computed them but _replay_ticker dropped them, so the
ablation's NEUTRAL/tightener checks saw None for every candidate and the
'without confidence floor' / 'without R:R floor' rows collapsed to 0
setups (impossible — removing a floor can only add setups). Regression
test now goes through the real _replay_ticker path.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
2026-07-02 08:07:27 +02:00
dennisthiessen 29b1a9a28c feat: net-of-cost backtest, gate ablation + time-exit sweeps, longer tails
Deploy / lint (push) Successful in 7s
Deploy / test (push) Successful in 57s
Deploy / deploy (push) Successful in 32s
Phase 1 of the strategy-measurement plan — report-only, no production
trading behavior changes:

- Cost haircut: every bucket/sweep now reports net_avg_r/net_total_r
  alongside gross (COST_PER_SIDE=0.1% of notional, converted to R via
  each setup's stop distance); params carry cost_per_side_pct.
- Gate ablation table: re-qualifies candidates at the current momentum
  cutoff with one floor removed per row (confidence / R:R / NEUTRAL /
  momentum-only) to show which floors earn their keep.
- Time-based exit sweep: hold 5/10/21/30 days with the initial ATR stop,
  exit at the day-N close — the classic momentum implementation, to
  disambiguate the wide-trailing result.
- TP sweep extended to +40/+50%, trailing to 25/30% so the optima are
  interior instead of starred at the sweep edge.
- BacktestPanel: Net Avg R columns everywhere, gate-ablation and
  time-exit tables, stars now mark best net avg R; stale cached reports
  still render (all new fields optional/guarded).

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
2026-07-02 07:50:37 +02:00
dennisthiessen 84ce7c5c26 docs: strategy status + maintainer guide in README; document CI/CD deploy
Adds the validated-vs-not verdict table, the iron rule for strategy
changes, ranked next experiments, a maintainer guide (invariants, file
map, verification, roadmap), and corrects the deploy docs: deploys are
automated by Gitea Actions (push to main = deploy), service is
signalplatform.service at /opt/signalplatform.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
2026-07-02 07:50:21 +02:00
dennisthiessen da0bb3367e feat: company names for tickers (Alpaca backfill + subtle display)
Deploy / lint (push) Successful in 6s
Deploy / test (push) Successful in 55s
Deploy / deploy (push) Successful in 32s
Store an optional company name on Ticker (migration 014) and backfill it from
Alpaca's asset list in a single Trading-API call for the whole universe — no
per-ticker fetch. Runs automatically at the end of universe bootstrap and via a
manual "Backfill Names" button (admin) / POST /admin/tickers/backfill-names.

The name ships on /tickers; a shared symbol→name map (useTickerNames) lets any view
show it without its own request. Displayed subtly next to the symbol — in the global
search, the ticker header, and as a small muted line under the symbol in Top Setups
and Open Trades (no extra column, truncated so it never widens the table).

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-07-01 10:50:40 +02:00
dennisthiessen a9f4686157 fix: forward sentiment-adjustment fields in the scores API response
Deploy / lint (push) Successful in 6s
Deploy / test (push) Successful in 54s
Deploy / deploy (push) Successful in 31s
get_score computed base_score / sentiment_score / sentiment_adjustment /
max_sentiment_adjustment, but the router's _map_composite_breakdown built the
response model from only the five original keys and silently dropped the rest — so
the API always returned null for them. That's why the ticker page showed neither the
"Composite = Base + Sentiment" caption nor the ± marker on the sentiment row despite
the frontend and scoring service both supporting it. Pass the fields through, with a
guard test so they can't be dropped again.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-07-01 10:04:06 +02:00
dennisthiessen 94ed3207d7 feat: show composite = base + sentiment caption under the Standing matrix
Deploy / lint (push) Successful in 6s
Deploy / test (push) Successful in 52s
Deploy / deploy (push) Successful in 30s
The ticker page renders the composite via the Standing matrix (ScoreCard runs with
showComposite=false), so the "Base X · sentiment +Y" line in the ScoreCard header
was never visible there. Add a compact caption beneath the matrix — "Composite 83 =
Base 78 + Sentiment 5.0" — shown only when sentiment actually moves the score, so
the composition of the number has a visible home where the number lives.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-07-01 09:55:42 +02:00
dennisthiessen 5442b62495 fix: decouple the sentiment weight from the base mix in the weights form
Deploy / lint (push) Successful in 6s
Deploy / test (push) Successful in 50s
Deploy / deploy (push) Successful in 31s
Sentiment is now a signed adjustment (± points on top of the base), not part of
the averaged dimensions — but the weights form still squeezed all five sliders to
sum to 100%, so dragging sentiment rebalanced the base and you couldn't set a clean
±N. Now the four base dimensions normalize among themselves (share shown as %),
and sentiment is its own "influence (± points)" control passed through raw
(slider 10 → weight 0.10 → ±10), independent of the base.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-07-01 09:43:43 +02:00
dennisthiessen f61e11adea feat: sentiment as a signed adjustment to the composite, not averaged in
Deploy / lint (push) Successful in 23s
Deploy / test (push) Successful in 54s
Deploy / deploy (push) Successful in 31s
Going from no sentiment to a bullish read used to be able to *lower* the composite:
sentiment was blended into the weighted average as an absolute level, so a bullish
75 diluted a ticker already scoring 78. That's backwards for a directional signal.

Now the non-sentiment dimensions form a re-normalized weighted-average base, and
sentiment is applied as a signed adjustment around neutral (50):

    composite = clamp(base + MAX_ADJ * (sentiment - 50) / 50)
    MAX_ADJ   = sentiment weight * 100   (default weight 0.10 → ±10)

Neutral leaves the base unchanged, bullish adds and bearish subtracts (scaled by
confidence, since a 50%-confidence call maps to 50 → no effect), and no sentiment
never penalises. Default sentiment weight 0.15 → 0.10; the weight now means "max ±
points." Composite breakdown exposes base_score/sentiment_score/sentiment_adjustment,
and the ScoreCard shows "Base 78 · sentiment +5.0" plus the per-dimension adjustment.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-07-01 09:34:37 +02:00
dennisthiessen 1566b84379 feat: trailing-stop auto-exit for paper trades + close/digest alerts
Deploy / lint (push) Successful in 6s
Deploy / test (push) Successful in 54s
Deploy / deploy (push) Successful in 33s
Applies the backtest-validated trailing stop to live paper trading, and surfaces
it transparently.

Exit (A):
- New paper-trade exit policy (paper_exit_mode=trailing, paper_trailing_pct=12),
  tunable in Admin → Paper-Trade Exit. resolve_open_trades runs a trailing stop
  (initial stop as floor, ratchets up from the peak; target ignored — the
  validated rule) and records close_reason (trailing|stop|target|manual; +migration
  013).
- list_trades enriches open trades with the live trailing-stop level + distance %.
  Open Trades panel shows the active tactic and a Trail Stop column.

Alerts (B):
- Daily digest now lists open trades with unrealized gain, trailing stop, and how
  far away it is.
- New "trade closed" alert: one summary per auto-close (trailing/target/stop, not
  manual) — direction, reason, days held, P&L abs+%/R — covering wins AND
  stop-loss losses. Deduped by trade id; toggle in Admin alerts.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-30 18:48:05 +02:00
dennisthiessen ab9ce18809 feat: trailing-stop exit sweep in the backtest
Deploy / lint (push) Successful in 6s
Deploy / test (push) Successful in 55s
Deploy / deploy (push) Successful in 32s
Third exit model alongside target-vs-stop and the fixed take-profit. The TP sweep
showed the edge lives in the fat tail (avg R keeps rising as you let winners run),
but a fixed wide target is win-rate-brutal and gives everything back on a reversal.
A trailing stop harvests the tail while protecting gains.

Per setup the replay computes the realized R for several trail widths (3/5/7/10/
15/20%) in a single conservative pass — stop ratchets up via max(initial_stop,
peak*(1-trail)), exit on the pullback or at the horizon close, R vs the initial
risk. Aggregated into a trailing sweep (win rate = share closed in profit, avg R,
total R) over the qualified set and shown as a new table in the Backtest panel.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-30 17:33:17 +02:00
dennisthiessen c5f6b07a3e feat: extend take-profit sweep into the tail + clarify it ignores the target
Deploy / lint (push) Successful in 6s
Deploy / test (push) Successful in 55s
Deploy / deploy (push) Successful in 33s
Avg R was still rising at the previous top level (+15%), so the optimum was off
the table. Extend TP_LEVELS to 20/25/30% to reveal where letting winners run
stops paying (it plateaus toward "just hold to the horizon close").

Also clarify in the panel that the take-profit model deliberately does NOT use
the setup's S/R target — it's a standalone fixed-% exit; exiting at the target is
the target-vs-stop model above. The two are complementary ends, not in conflict.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-30 17:14:54 +02:00
dennisthiessen c63951ca02 feat: take-profit exit sweep in the backtest (alongside target-vs-stop)
Deploy / lint (push) Successful in 7s
Deploy / test (push) Successful in 59s
Deploy / deploy (push) Successful in 34s
The target-vs-stop model counts a near-miss of a far S/R target as a full loss
and ignores the partial gains you actually bank — so it measures a different
strategy than "scalp the early pop, take +8%". Add a realistic take-profit exit
model next to it (original untouched).

Per setup the replay now also records risk%, whether the stop was hit, the
favourable excursion reachable before the stop (MFE), and the horizon-close move.
From those a fixed-take-profit sweep (4/6/8/10/12/15%) is scored in R: bank +X%
if reached before the stop, else -1R, else the horizon close. Hit rate = how
often +X% was banked (the MFE CDF), so you can pick the EV-optimal TP without
top-ticking fantasy. Shown as a new table in the Backtest panel; the IC,
calibration and momentum sweep are unchanged.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-30 16:56:32 +02:00
dennisthiessen 6511a1020b feat: exclude NEUTRAL setups from the activation gate (default on)
Deploy / lint (push) Successful in 7s
Deploy / test (push) Successful in 56s
Deploy / deploy (push) Successful in 34s
A NEUTRAL ("No Clear Setup") recommendation means the engine found no clear
directional trade, yet such setups could still qualify and even be crowned the
top pick purely on momentum rank (e.g. an extended momentum leader with a far,
5%-probability target). A NEUTRAL signal isn't actionable, so it shouldn't
qualify.

New `exclude_neutral` activation flag (default on): setup_qualifies drops setups
whose recommended_action is NEUTRAL. It lives in the shared gate, so it flows
through the dashboard's qualified/top-pick selection, the track record's
qualified stats, and the backtest (which computes recommended_action and gates on
meets_core). Toggleable in Admin → Settings → Activation; the frontend mirror and
activationSummary ("directional") match.

Re-run the backtest after enabling to confirm it holds/improves expectancy.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-30 15:19:07 +02:00
dennisthiessen 20a1c143f3 fix: surface empty OHLCV fetch as a warning, not success
Deploy / lint (push) Successful in 8s
Deploy / test (push) Successful in 1m25s
Deploy / deploy (push) Successful in 46s
Fetching a symbol the provider doesn't cover (e.g. RHM/Rheinmetall — Alpaca
serves US listings only) returned 0 bars but reported "complete · Successfully
ingested 0 records", which the UI showed as green success.

fetch_and_ingest now returns a distinct `no_data` status when the provider
returns nothing AND the ticker has no history (vs. "already up to date" when bars
exist). The fetch endpoint maps it to a `warning` source status, and the fetch
toast renders it as ⚠ with the provider message instead of success.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-28 19:27:41 +02:00
dennisthiessen 6c2e45377c feat: collapse track record into a live-vs-backtest check
Deploy / lint (push) Successful in 7s
Deploy / test (push) Successful in 57s
Deploy / deploy (push) Successful in 34s
The outcome section measures the same thing as the backtest with the same code
and data — its only unique value is catching when the live system drifts from
the backtest (a bug, config/data drift, or look-ahead). So reframe it as exactly
that: a one-line "Live X R vs Backtest Y R · n matured · tracking ✓ / drift ⚠"
indicator (like-for-like with the qualified toggle), with the stat cards and
By-Action/By-Confidence tables moved into a collapsed "Outcome details"
disclosure. Drop the always-empty By-Direction table.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-28 13:58:15 +02:00
dennisthiessen 7e9a6cd7ec fix: only count matured setups in the live track record
Deploy / lint (push) Successful in 8s
Deploy / test (push) Successful in 1m1s
Deploy / deploy (push) Successful in 36s
The outcome stats were dominated by quick stop-outs: near stops resolve as losses
within days while far targets take weeks, so a young sample (mostly pending,
0 expired) skewed sharply negative (e.g. 13.8% hit / -0.46R vs the backtest's
35.8% / +0.18R) — a maturation artifact, not a real result.

get_performance_stats now counts only setups whose full ~30-day window has
elapsed (_MATURITY_DAYS), so winners had as long as losers (unbiased, and
comparable to the backtest). A new `maturing` count reports the younger setups
held back. The Track Record UI relabels "Evaluated" -> "Matured", shows the
maturing count, and explains the window in the empty state + methodology note.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-28 13:41:48 +02:00
dennisthiessen 8bcbbfcfd0 fix: show benchmark job in admin; harden + split deploy workflow
Deploy / lint (push) Successful in 8s
Deploy / test (push) Successful in 48s
Deploy / deploy (push) Successful in 28s
- admin_service: register benchmark_collector in VALID_JOB_NAMES, JOB_LABELS and
  PIPELINE_MEMBERS. The Admin → Jobs list is built from these hardcoded sets, not
  the scheduler, so the job was registered but invisible/untriggerable.

- deploy.yml:
  - SSH: verify the host key (StrictHostKeyChecking=yes) now that known_hosts is
    supplied; move private-key cleanup to an `if: always()` step.
  - Add a concurrency guard so deploys serialize.
  - Health-check the service after restart (127.0.0.1:8998/api/v1/health).
  - Align CI Python to 3.12 (matches prod); pip + npm caching.
  - Clarify the Postgres service only validates migrations (tests use SQLite);
    drop the redundant DATABASE_URL from the pytest step.
  - Split the monolithic "Deploy to server" step into named steps.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-28 09:01:09 +02:00
dennisthiessen 0627787bfc fix(alembic): renumber benchmark migration 011 -> 012
Deploy / lint (push) Successful in 6s
Deploy / test (push) Successful in 46s
Deploy / deploy (push) Successful in 27s
011 collided with the existing 011_add_regime_snapshots (duplicate revision id
and a second head branching off 010), which broke `alembic upgrade head`. Chain
the benchmark_prices migration after regime_snapshots so the history is linear
again (010 -> 011 regime_snapshots -> 012 benchmark_prices, single head).

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-28 08:47:36 +02:00
dennisthiessen 30effa89b7 feat: ticker search, watchlist momentum column, alpha vs S&P 500
Deploy / lint (push) Successful in 6s
Deploy / test (push) Failing after 12s
Deploy / deploy (push) Has been skipped
Three usability fixes:

1. Global ticker search in the sidebar (TickerSearch) — typeahead over the
   tracked universe that opens a ticker's detail page without adding it to the
   watchlist. Also wired into the mobile nav.

2. Watchlist table shows the ticker's 12-1 momentum percentile (the top-pick
   selector) instead of the noisy full S/R-level list. Enriched from the setup
   already loaded in watchlist_service._enrich_entry — no extra query.

3. Alpha vs the S&P 500 on paper trades (open + closed). New benchmark_prices
   table + benchmark_service store SPY daily closes (a standalone series, not a
   Ticker, so it never enters the scanner / momentum ranking / rankings) via a
   new daily-pipeline step. paper_trade_service computes per-trade
   benchmark_return / alpha_pct / alpha_usd over each holding period; the open-
   trades table, dashboard, and closed-trades panel surface per-trade and total
   alpha. The list read path never makes a provider call.

Deploy: alembic upgrade head, then run the benchmark/daily job once to populate
SPY closes (alpha shows "—" until then).

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-28 08:44:40 +02:00
dennisthiessen 4a96f85cd9 feat: Standing matrix on the ticker page (quality x momentum verdict)
Deploy / lint (push) Successful in 6s
Deploy / test (push) Successful in 42s
Deploy / deploy (push) Successful in 26s
Replace the scattered score readouts with one hero: a quality (composite) x
momentum-percentile scatter that plots this ticker against the whole field and
reads out a verdict by quadrant — Strong Buy / Momentum / Accumulate / Pass. The
dashed divider is the activation gate's momentum percentile, so "above the line =
qualifies" is visible at a glance; peers are clickable. Reuses the regime-quadrant
visual language and is lazy-loaded so recharts stays out of the main ticker chunk.

- New StandingMatrix component (composite x momentum, field cloud, verdict).
- ScoreCard gains showComposite (default true); the ticker page now renders it
  without the composite ring (composite lives in the matrix) under "Dimensions".
- Confidence + target probability stay in the recommendation panel (the trade).

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-27 17:00:47 +02:00
dennisthiessen 146dadf06f docs: refresh README and document how it works
Deploy / lint (push) Successful in 5s
Deploy / test (push) Successful in 43s
Deploy / deploy (push) Successful in 25s
- Add "How It Works": daily load (ordered pipeline steps), intraday flow,
  other jobs, and the score -> activation gate -> top pick chain.
- Add "Key Use Cases" (find today's best long setup; track a paper trade).
- Fix stale facts: user-curated (not auto) watchlist, actual routes/pages,
  scheduler description, wrong env defaults (RR 3.0->1.5, fundamentals
  daily->weekly).
- Add missing surface: paper trading, activation gate, market regime, Telegram
  alerts, backtest; API groups (paper-trades, market/regime, jobs); FRED +
  Telegram env vars; note that pipeline timing is admin-cron, not env.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-27 16:12:59 +02:00
dennisthiessen d15acb8741 feat: top-pick and open-trade status labels on the ticker page
Deploy / lint (push) Successful in 6s
Deploy / test (push) Successful in 42s
Deploy / deploy (push) Successful in 25s
Two read-only pills in the ticker header, beside the watchlist toggle:
- "Top Pick" when the ticker is the current #1 — the same ranking the dashboard
  highlights, via a shared topPickSymbol() helper so the two stay in sync.
- "Open Trade" when an open paper trade exists on the ticker.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-27 16:04:55 +02:00
dennisthiessen 2f21c685e8 feat: always-fresh sentiment for top picks, watchlist & open trades
Tiered, uncapped sentiment scope so the names that matter are never shown
without sentiment.

- Priority (always fully refreshed): top-pick feeders — momentum leaders with a
  tradeable long setup over the R:R floor (the tickers that are, or could become
  with positive sentiment, the dashboard top pick) — plus the curated watchlist
  and open paper trades.
- Filler: top-N by composite, a discovery net, fetched after the priority set so
  a mid-run rate limit lands the important names first.
- Removed the per-run cap (sentiment_max_per_run): the relevant set is naturally
  bounded (watchlist <= 20, composite <= top_composite), so a full refresh stays
  inside the free tier. extra="ignore" keeps a stale env var from breaking startup.
- Refresh window 72h -> 120h (5 days): sentiment shifts slowly, score window is 7d.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-27 15:59:58 +02:00
dennisthiessen 65dd53baa3 feat: Telegram alert on regime quadrant change (hysteresis + cooldown)
Deploy / lint (push) Successful in 6s
Deploy / test (push) Successful in 41s
Deploy / deploy (push) Successful in 27s
Fires once when the regime monitor shifts quadrant (regime index x early
warning), so you don't have to watch the tab. Two guards against spam:

- Hysteresis: each axis only flips once the value crosses its divider by a
  margin, so a point parked on a boundary keeps its quadrant instead of
  flip-flopping day to day.
- Cooldown: a genuine change stays quiet for a few days after the last alert.

Seeds the baseline silently on first run; reuses the existing Telegram dispatch
+ AlertLog. New per-trigger toggle in Admin → Alerts (on by default).

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-26 19:05:01 +02:00
dennisthiessen e683513857 fix: smooth the quadrant trail and fade it by recency
Deploy / lint (push) Successful in 5s
Deploy / test (push) Successful in 42s
Deploy / deploy (push) Successful in 25s
The single solid trail line read like a tangle. Make older→newer legible: the
path now fades from muted slate (older) to bright blue (newer) via per-point
colors, the connecting line is faint, and the points are de-noised with a
centered moving average (today kept exact). Easier to see the direction of travel
through the quadrants.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-26 16:22:30 +02:00
dennisthiessen a07bfee6e6 feat: regime quadrant plot in place of the combined gauge
Deploy / lint (push) Successful in 6s
Deploy / test (push) Successful in 42s
Deploy / deploy (push) Successful in 25s
The combined score collapsed two distinct signals into one not-very-meaningful
number. Replace its gauge with a quadrant scatter that shows both axes directly:
x = regime index (coincident), y = early warning (breadth divergence), with a
trail of the last 60 sessions and today highlighted.

The four quadrants make the readings legible — ① hot & brittle (narrow melt-up,
shakeout risk), ② transition, ③ healthy & broad, ④ real downturn — and the trail
surfaces the actual tell: the ①→④ move (early warning rolling over as the regime
index climbs = divergence resolving downward). Combined still shows as a line in
the score-history chart. Frontend-only; reuses the history endpoint. Lazy-loaded.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-26 16:14:32 +02:00
dennisthiessen 66444af65c feat: score-history chart on the regime tab
Deploy / lint (push) Successful in 6s
Deploy / test (push) Successful in 41s
Deploy / deploy (push) Successful in 25s
Plots the index, early-warning, and combined scores over time beneath the live
gauges, with a 1M/3M/6M/All range toggle and band reference lines — so the trend
and any divergence between the scores is visible, not just today's snapshot.

- Backend: GET /regime/history + get_regime_history (the three scores per
  snapshot date from regime_snapshots).
- Frontend: recharts line chart, lazy-loaded so recharts ships in its own
  regime-tab chunk instead of nearly doubling the main bundle.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-26 15:48:42 +02:00
dennisthiessen 60def1155b fix: coverage-aware event-study headline instead of misleading median delta
Deploy / lint (push) Successful in 5s
Deploy / test (push) Successful in 39s
Deploy / deploy (push) Successful in 24s
The "warned a median -18 days later" line was the median-over-1-event trap: the
coincident baseline's 60d median is a single lucky event, while breadth warned on
7. Replace it with the honest coverage framing (7/11 vs 1/11) and flag that the
median-lead comparison is unreliable when coverage differs this much.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-26 15:36:39 +02:00
dennisthiessen 02b8df58f0 fix: populate early-warning/combined on the latest snapshot + recent history
Deploy / lint (push) Successful in 5s
Deploy / test (push) Successful in 41s
Deploy / deploy (push) Successful in 24s
The early-warning score showed n/a because it required an exact date match
between the live benchmark (Alpaca, may have today's bar) and the stored
universe breadth (DB, often a day behind), which blanked the newest snapshot —
the one the UI displays.

- Look up the divergence as-of the snapshot date (newest value within a 7-day
  lag) instead of requiring an exact match.
- Backfill early_warning + combined onto recent existing snapshots (the index
  history predates this signal) so the 7/30-day trends populate on the first run
  rather than only filling in over the coming weeks.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-26 15:31:02 +02:00
dennisthiessen 613fc756ec feat: separate live early-warning + combined score on the regime tab
Deploy / lint (push) Successful in 7s
Deploy / test (push) Successful in 40s
Deploy / deploy (push) Successful in 24s
The event study showed the breadth-divergence signal genuinely leads (warned
before 7/11 drawdowns, ~6 weeks median, where the coincident baseline almost
never did). Surface it live to observe before deciding how to embed it — kept
separate from the index, not folded into its weights.

- regime_monitor daily job now computes breadth-divergence live and attaches a
  separate early_warning score plus a combined blend (weighted mean, default
  0.6/0.4, configurable via combined_weights) to each snapshot, including the
  backfill so the 7/30-day trends populate immediately. Stored in breakdown_json
  — no schema change. Best-effort: a breadth failure can't break the index.
- get_regime_monitor returns the index, early_warning, and combined scores each
  with 7/30-day deltas.
- Regime tab shows three gauges (generalized ScoreGauge): coincident index,
  early warning, and a compact combined blend. Stale snapshots render "—".

Note: the daily regime job now also does a universe-wide breadth scan.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-26 15:23:37 +02:00
dennisthiessen 7c5fb1138d feat: sharpen the event study — more events, fair baseline, per-event view
Deploy / lint (push) Successful in 6s
Deploy / test (push) Successful in 41s
Deploy / deploy (push) Successful in 26s
The first run gave only 2 events (N=2 is anecdote, not evidence) and an unfairly
weak coincident baseline, so the +42d lead couldn't be trusted. This makes the
measurement meaningful:

- More, cleaner events: default drawdown threshold 15%→10%, and dedup switched
  from "recover to the high" to a rising-edge + cooldown (40d), so distinct
  drawdowns each register instead of merging.
- Fair comparison: each indicator now warns at its OWN 80th percentile instead of
  a shared absolute 60, removing the artifact that muted the coincident baseline.
- Per-event breakdown (date · depth · breadth lead · coincident lead) so a median
  over a tiny sample can't hide an apples-to-oranges comparison — you see whether
  both warned on the same drawdown.
- Surface precision/recall (best row) + base rate per indicator — the honest edge
  read, not just lead time.

Re-run the Event Study job to regenerate the cached report in the new shape.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-26 14:54:29 +02:00
dennisthiessen f8d62e4074 feat: show current exposure instead of lifetime stats on the overview
Deploy / lint (push) Successful in 6s
Deploy / test (push) Successful in 43s
Deploy / deploy (push) Successful in 25s
The overview's Hit Rate and Expectancy were static lifetime aggregates — they
barely move day to day and aren't actionable at a glance. Replace them with the
current state from open paper trades:

- Open Risk: total $ at risk to stops across open positions.
- Unrealized: summed unrealized R (mark-to-market), with $ P&L and win/loss count.

Computed in the frontend from the already-loaded open trades (tradePnl) — no
backend change. The detailed lifetime stats remain on Signals → Track Record.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-26 14:08:59 +02:00
dennisthiessen 824c15cf69 feat: breadth-divergence early-warning indicator + event study
Adds a leading-by-construction candidate and the harness to measure whether it
actually leads regime breaks, before any of it earns weight in the live index.

- breadth_service: % of the stored universe above its own 200-DMA + a divergence
  score (benchmark price up while breadth falls, nudged by low breadth). Genuinely
  leading because it keys on divergence, not level. Not wired into the live score.
- event_study_service: detect drawdown events on the benchmark, then measure each
  indicator's median lead time (event-centered) and precision/recall vs. the base
  rate (signal-centered). Compares breadth-divergence against the deterministic
  coincident price composite (reuses the regime price sub-scores). Price/breadth
  only — reproducible, no LLM/FRED.
- Manual "Event Study" job (Admin → Jobs), GET /regime/event-study, and an
  inline early-warning panel on the Regime tab with an honest small-sample caveat.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-26 14:08:52 +02:00
dennisthiessen ebff19940b feat: add standalone AI/Tech regime-change monitor tab
Deploy / lint (push) Successful in 7s
Deploy / test (push) Successful in 46s
Deploy / deploy (push) Successful in 27s
A new /regime tab scoring how far the AI/Tech bull regime has deteriorated
toward a re-rating as a single 0-100 index with per-signal breakdown and a
7/30-day trend. Intentionally decoupled: nothing reads its output to gate or
score trades — the daily-pipeline membership is scheduling only.

- regime_monitor_service: price sub-scores (P1-P6 via Alpaca, like
  market_regime), VIX + HY credit spreads via a small FRED helper, weighted
  aggregation over available signals (missing source -> n/a, dropped from the
  denominator), one snapshot row/day, and a ~90-day history backfill by
  replaying the already-fetched series as-of each past day.
- F1/F3 fundamentals proposed by the configured grounded LLM (reuses
  sentiment_provider_service config resolution), with a manual override + lock.
- regime_snapshots table (migration 011); endpoints on the existing market
  router; admin-editable weights/threshold; standalone /regime page.

Data needs: prices via Alpaca, VIX/credit via FRED (optional key — signals show
n/a without it). No LLM needed for history.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-26 11:51:45 +02:00
dennisthiessen 5605915d45 fix: scope sentiment collection to the gate's momentum leaders
Deploy / lint (push) Successful in 6s
Deploy / test (push) Successful in 42s
Deploy / deploy (push) Successful in 25s
Qualified setups could carry no sentiment because the sentiment job scoped
its relevant-set to watchlist + open trades + top-N composite score, while
the activation gate qualifies on 12-1 momentum percentile — a different axis.
A top-momentum ticker outside the composite top-N never got sentiment, so the
R:R scan enhanced it as neutral.

Add the gate's momentum leaders (percentile >= activation min_momentum_percentile)
to the sentiment relevant-set so scope tracks the gate. Best-effort: a momentum
or config failure falls back to the base set rather than aborting collection.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-24 12:06:52 +02:00
dennisthiessen 437ceacfc1 refactor: dedupe scheduler logging/runtime, centralize SystemSetting access, fix rankings N+1
Deploy / lint (push) Successful in 7s
Deploy / test (push) Successful in 42s
Deploy / deploy (push) Successful in 27s
Behavior-preserving cleanup (345 tests pass, ruff clean):

- scheduler: replace 62 inline logger.x(json.dumps({...})) calls with a
  _log_event helper, and collapse 11 identical _job_runtime dicts into an
  _idle_runtime() factory over _JOB_NAMES.
- settings: add app/services/settings_store.py (get_setting/get_value/get_map/
  upsert_setting) and route ~13 hand-rolled SystemSetting queries + two
  identical _settings_map helpers through it.
- scoring.get_rankings: collapse the per-ticker N+1 (3-4 queries + a commit each)
  into 2 bulk reads + a single conditional commit; drop the redundant re-fetch.
  Lazy recompute-on-read is preserved. Adds first tests for get_rankings.

Net ~ -245 lines across the touched modules.
2026-06-24 11:23:39 +02:00
dennisthiessen f48d8705de remove min_target_probability gate + add chart time-range presets
Deploy / lint (push) Successful in 5s
Deploy / test (push) Successful in 39s
Deploy / deploy (push) Successful in 24s
min_target_probability is gone: it filtered on the probability model the
calibration has repeatedly shown to be weak and overconfident, it was redundant
with the momentum gate, and as an off-by-default knob it just invited bad tuning.
Removed from the backend gate, activation config/schema, the frontend mirror
(qualifiesSetup / activationSummary), and ActivationSettings. The probability
model stays where it does real work (primary-target selection + display).

Charts: with multi-year history the all-bars default was unreadable. Added
time-range presets (1M / 3M / 6M / YTD / 1Y / 3Y / 5Y / All), defaulting to 1Y;
clicking a preset always re-applies (snaps back after a manual zoom). Y-axis
autoscale and wheel-zoom / drag-pan were already there.

339 backend tests pass; frontend build clean.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-24 09:24:35 +02:00
dennisthiessen 605f95098c momentum gate: long-only + wire the percentile onto live setups
Deploy / lint (push) Successful in 6s
Deploy / test (push) Successful in 47s
Deploy / deploy (push) Successful in 24s
Part 1 — long-only. The momentum edge is long top-momentum; the gate was
qualifying shorts on high-momentum names (fighting the trend), which showed as
the -0.13R Short(qual.) drag. While the gate is active, shorts no longer qualify
(backend qualification, backtest _momentum_qualifies, and the frontend mirror).

Part 2 — production wiring. Live setups now carry a real momentum rank, so the
dashboard, the Track Record's qualified stats, and outcome evaluation all gate on
the same value instead of deferring to floors:
- new momentum_service.compute_momentum_percentiles: 12-1 momentum per ticker,
  ranked across the universe into a {symbol: percentile} map.
- the daily R:R scan ranks the universe up front and stores each setup's
  percentile (new trade_setups.momentum_percentile column, migration 010).
- enhance_trade_setup mutates the same row, so the percentile is preserved;
  _trade_setup_to_dict + TradeSetupResponse expose it to the API.

Until a fresh scan runs, pre-existing setups have a null percentile and the gate
falls back to floors for them (longs) / excludes them (shorts) — they fill in on
the next scan. 341 backend tests pass; frontend build clean.

Needs the alembic upgrade (migration 010) on deploy.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-24 07:07:38 +02:00
dennisthiessen 7060b9a019 parallelize the backtest across worker processes (true multi-core)
Deploy / lint (push) Successful in 6s
Deploy / test (push) Successful in 38s
Deploy / deploy (push) Successful in 25s
The replay was CPU-bound and single-core: the earlier asyncio.to_thread offload
kept the API responsive but, because of the GIL, ran on one core. Per-ticker
replay is independent, so fan it out across worker processes (which sidestep the
GIL) for real multi-core speedup.

- New `settings.backtest_workers` (default 4), capped to cpu_count-1 so a core
  stays free for the web server.
- Uses a `forkserver` context (workers forked from a clean single-threaded
  server — avoids the fork-with-threads deadlock); falls back to `fork`. On
  spawn-only platforms (Windows) and for 1-ticker runs it uses the thread path,
  so dev/tests are unaffected.
- Worker takes primitive column arrays (cheap to pickle), rebuilds bars, and
  returns (candidates, plain-dict signal series) — both picklable across the
  process boundary. Bars are still fetched in the event loop (ORM-safe).
- Pool creation is guarded: if the pool can't start, the job falls back to the
  sequential thread path instead of failing.

334 backend tests pass (parallel path is POSIX/server-only, so it's covered by
construction + the picklability/worker-count tests; the thread fallback is
exercised by the run_backtest smoke test).

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-23 23:20:20 +02:00
dennisthiessen e71c07e554 fix: blank Track Record page when the cached backtest report is pre-momentum
Deploy / lint (push) Successful in 6s
Deploy / test (push) Successful in 41s
Deploy / deploy (push) Successful in 24s
The momentum-sweep table read row.min_momentum_percentile.toFixed(), but a report
cached before the EV->momentum change only has min_expected_value rows. undefined
.toFixed() threw during render and — with no error boundary — blanked the whole
Track Record tab. Guard the sweep block on the new field so a stale report just
hides the sweep; re-running the backtest repopulates it.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-23 22:47:36 +02:00
dennisthiessen ef523474ad replace EV activation gate with cross-sectional 12-1 momentum ranking
Deploy / lint (push) Successful in 7s
Deploy / test (push) Successful in 41s
Deploy / deploy (push) Successful in 26s
The 5-year backtest confirmed the EV gate adds negative value (high threshold =
worst expectancy) and that 12-1 month momentum is the one price signal with a
plausible, right-signed cross-sectional IC (~0.05). So "qualified" now means:
clears the R:R + confidence floors AND the ticker ranks in the top
`min_momentum_percentile` of the universe by 12-1 momentum that week.

- qualification.py: drop expected_value_r / the EV gate; add a momentum-percentile
  gate (duck-typed `momentum_percentile`, only enforced when attached + threshold
  set, else defers to floors). Mirrored in frontend qualification.ts.
- activation config/schema: min_expected_value -> min_momentum_percentile
  (default 80 = top quintile). ActivationSettings, DashboardPage (ranks/【shows】
  momentum instead of EV), and the BacktestPanel sweep follow.
- backtest: rank each ISO week's universe by 12-1 momentum, assign a percentile,
  and qualify the top slice; the sweep now sweeps the percentile cutoff.

Also offload the backtest's per-ticker compute to a worker thread so the heavy
~5y run no longer blocks the API event loop (the "backend offline" flicker).

Production setups don't carry momentum_percentile yet — wiring the scanner to
attach it (a universe momentum-rank step) is the next step; until then the live
gate defers to floors while the backtest measures the momentum selection. 330
backend tests pass; frontend build clean.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-23 22:42:24 +02:00
dennisthiessen 099846513b deepen OHLCV history + make the factor-IC pass honest about overlap/regime
Deploy / lint (push) Successful in 7s
Deploy / test (push) Successful in 39s
Deploy / deploy (push) Successful in 25s
Two changes so the cross-sectional signal results can actually be trusted.

(a) History depth — the binding constraint. Ingestion defaulted to 365 days, so
long-lookback factors (12-month momentum, 52-week high) were only computable on a
handful of weeks at the tail, and every IC reflected a single market regime.
- New `settings.ohlcv_history_days` (default 1825 ≈ 5y); new tickers backfill this
  far instead of 1 year.
- New manual "data_backfill" job (Admin → Jobs) re-fetches the full window for
  every ticker, ignoring incremental resume — run once to deepen existing
  1-year histories. Idempotent (upsert); resumes after rate limits.

(b) Factor-IC honesty. The IC was averaged over weekly rebalances whose 30-day
forward windows overlap, inflating the t-stat ~sqrt(6)x.
- IC now measured on NON-OVERLAPPING windows (weeks thinned to ~HORIZON apart).
- Each signal carries a `reliable` flag (>= 12 independent windows); BacktestPanel
  greys out and de-stars thin signals so a lucky 9-week IC of 0.3 can't masquerade
  as an edge.

332 backend tests pass; frontend build clean. No migration (config + job + an
added JSON field on the cached backtest report).

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-23 18:20:59 +02:00
dennisthiessen 402025692a add cross-sectional signal evaluation (factor rank-IC) to the backtest
Deploy / lint (push) Successful in 6s
Deploy / test (push) Successful in 40s
Deploy / deploy (push) Successful in 26s
The per-setup hit-rate report can't tell whether a signal predicts returns —
only how a target/stop structure built on one performs. This adds a
cross-sectional factor-IC pass: each week the universe is ranked by a price-only
signal and graded by its rank correlation (Spearman IC) and top-minus-bottom-
quintile spread against the forward 30-day return.

Candidate signals (point-in-time from price; sentiment/fundamentals have no
history in the replay): 12-1/6-1/3-1 month momentum, 1-month reversal,
price-vs-200d SMA, proximity to the 52-week high (George/Hwang), and 126-day
realized volatility (low-vol anomaly).

Reuses the existing per-ticker replay loop (no new data, no second DB pass);
results land in the cached backtest_report as `signal_eval` and render as a
"Signal edge" table in BacktestPanel beside the calibration curve.

330 backend tests pass (10 new in test_signal_eval); frontend build clean.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-23 17:58:40 +02:00
dennisthiessen c34f3cb1a4 redesign activation gate to expected value + make pipelines cron-configurable
Deploy / lint (push) Successful in 9s
Deploy / test (push) Successful in 46s
Deploy / deploy (push) Successful in 28s
Diagnosing "no qualified signals for 5 days": setups were generated but none
qualified. The gate required BOTH a high min_rr (2.0) AND a high
min_target_probability (60), which became contradictory after the Jun-15
probability recalibration — probability already embeds R:R via the 1/(rr+1) ruin
term, so high-R:R targets are inherently low-probability and nothing cleared both.

Gate is now expected value (R): p*rr - (1-p) from the primary target's
probability. R:R and confidence stay as floors; high-conviction / exclude-conflicts
/ min-target-probability become optional tighteners (default off). Defaults:
min_expected_value=0.15, min_rr=1.2, min_confidence=55. EV is only enforced when
computable. Migration 009 clears stored activation_* rows so the new defaults
apply. Backtest sweeps min_expected_value instead of target probability.

Scheduling: pipelines are now cron-configurable in Admin -> Jobs. daily_pipeline
(full, default 0 7 * * *) plus a new light intraday_pipeline (OHLCV + outcome eval,
default hourly US session) that keeps prices/live-R:R current without setup churn.
Fundamentals on its own early weekly cron. Timezone configurable (default
Europe/Berlin). Moving interval->CronTrigger also fixes the restart-deferral bug
where an interval job's countdown resets on every process restart.

319 backend unit tests pass; frontend tsc clean.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-23 14:46:38 +02:00
dennisthiessen d53b4ffb57 fix admin password reset: send new_password (was password)
Deploy / lint (push) Successful in 6s
Deploy / test (push) Successful in 36s
Deploy / deploy (push) Successful in 24s
The reset endpoint's schema expects new_password; the client sent password,
causing "body.new_password: Field required".

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-17 11:46:06 +02:00
dennisthiessen 9008865d75 run fundamentals weekly, not daily — it's quarterly-ish data
Deploy / lint (push) Successful in 5s
Deploy / test (push) Successful in 36s
Deploy / deploy (push) Successful in 24s
Pulled the fundamental collector out of the daily pipeline (where it re-fetched
near-identical numbers every day and burned free-tier API quota) and made it an
independent weekly job. P/E/market-cap drift with price but the score buckets
them coarsely; revenue growth and earnings surprise only change at quarterly
earnings. Added "weekly" to the frequency map; fundamental_fetch_frequency now
defaults to weekly (configurable).

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-17 11:23:16 +02:00
dennisthiessen e982487abd coordinate jobs: daily pipeline orchestrator runs the flow in order
Deploy / lint (push) Successful in 7s
Deploy / test (push) Successful in 39s
Deploy / deploy (push) Successful in 25s
Jobs were independent 24h timers with no ordering, so the scanner could run on
stale OHLCV, and manual runs desynced the offsets. New daily_pipeline job runs
the data→signal flow in dependency order: OHLCV → fundamentals → sentiment →
R:R scan → outcome eval (+paper close) → market regime. Each step keeps its own
enable flag and runtime status; a failing step is logged and the pipeline
continues.

The member jobs are registered PAUSED (no auto-fire) so they only run via the
pipeline — but stay manually triggerable from Admin → Jobs (shown as "runs in
daily pipeline"). Alerts (hourly), ticker universe sync, and backtest keep their
own independent cadence.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-17 10:16:41 +02:00