deepen OHLCV history + make the factor-IC pass honest about overlap/regime

Two changes so the cross-sectional signal results can actually be trusted.

(a) History depth: the binding constraint.
Ingestion defaulted to 365 days, so long-lookback factors (12-month momentum, 52-week high) were only computable on a handful of weeks at the tail, and every IC reflected a single market regime.
- New `settings.ohlcv_history_days` (default 1825 ≈ 5y); new tickers backfill this far instead of 1 year.
- New manual "data_backfill" job (Admin → Jobs) re-fetches the full window for every ticker, ignoring incremental resume. Run it once to deepen existing 1-year histories. Idempotent (upsert); resumes after rate limits.

(b) Factor-IC honesty.
The IC was averaged over weekly rebalances whose 30-day forward windows overlap, inflating the t-stat ~sqrt(6)x.
- IC is now measured on NON-OVERLAPPING windows (weeks thinned to ~HORIZON apart).
- Each signal carries a `reliable` flag (>= 12 independent windows); BacktestPanel greys out and de-stars thin signals so a lucky 9-week IC of 0.3 can't masquerade as an edge.

332 backend tests pass; frontend build clean.
No migration (config + job + an added JSON field on the cached backtest report).

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
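
For illustration only, here is a minimal TypeScript sketch of the thinning and the `reliable` check described in (b). It is not the backend code from this commit, which is not shown in the diff. The helper names `thinToNonOverlapping` and `isReliable` are hypothetical, and the sketch assumes the horizon is counted in calendar days and that thinning is a greedy, earliest-first pass.

```typescript
// Hypothetical helpers; the commit's actual implementation may differ.
const DAY_MS = 24 * 60 * 60 * 1000;

// Threshold taken from the commit message: reliable means >= 12 independent windows.
const MIN_INDEPENDENT_WINDOWS = 12;

/**
 * Keep only rebalance dates that are at least `horizonDays` apart, so that
 * the forward-return windows starting at the kept dates do not overlap.
 * Greedy pass over the dates in ascending order (an assumption).
 */
function thinToNonOverlapping(rebalanceDates: Date[], horizonDays: number): Date[] {
  const sorted = [...rebalanceDates].sort((a, b) => a.getTime() - b.getTime());
  const kept: Date[] = [];
  for (const d of sorted) {
    const last = kept[kept.length - 1];
    if (last === undefined || d.getTime() - last.getTime() >= horizonDays * DAY_MS) {
      kept.push(d);
    }
  }
  return kept;
}

/** A signal is trusted only when its IC rests on enough independent windows. */
function isReliable(independentWindows: number): boolean {
  return independentWindows >= MIN_INDEPENDENT_WINDOWS;
}

// Usage: 52 weekly rebalances (about one year of history), 30-day horizon.
const start = Date.UTC(2025, 0, 6);
const weekly = Array.from({ length: 52 }, (_, i) => new Date(start + i * 7 * DAY_MS));
const independent = thinToNonOverlapping(weekly, 30);
console.log(independent.length, isReliable(independent.length));
```

Under these assumptions, one year of weekly rebalances thins to windows 35 days apart, which falls short of the 12-window threshold. That is consistent with the message's point that history depth is the binding constraint.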
2026-06-23 18:20:59 +02:00
parent 402025692a
commit 099846513b
9 changed files with 148 additions and 38 deletions
@@ -277,11 +277,12 @@ export function BacktestPanel() {
                </p>
                <p className="mb-2 text-[11px] text-gray-500">
                  Does ranking the universe by a signal predict the forward {report.params.horizon_days}-day
-                  return? Mean IC is the rank correlation between signal and return, averaged over weekly
-                  rebalances. <span className="text-emerald-400">|IC| ≳ {IC_EDGE_THRESHOLD}</span> with a
+                  return? Mean IC is the rank correlation between signal and return, averaged over
+                  non-overlapping windows. <span className="text-emerald-400">|IC| ≳ {IC_EDGE_THRESHOLD}</span> with a
                  consistent sign (high IC&gt;0 %) is a real, if small, edge; near 0 means it sorts nothing.
                  Momentum skips the last month; <em>reversal_1m is expected negative</em> if the universe
-                  mean-reverts. Q5−Q1 is the top-minus-bottom-quintile forward return.
+                  mean-reverts. Q5−Q1 is the top-minus-bottom-quintile forward return. <span className="text-gray-600">Greyed
+                  rows have too few independent windows to trust — deepen history via the Data Backfill job.</span>
                </p>
                <div className="glass overflow-x-auto">
                  <table className="w-full text-sm">
@@ -298,9 +299,15 @@ export function BacktestPanel() {
                    </thead>
                    <tbody>
                      {report.signal_eval.map((row) => {
-                        const edge = Math.abs(row.mean_ic) >= IC_EDGE_THRESHOLD;
+                        // Only trust the edge highlight when the IC rests on enough
+                        // independent windows; thin signals are dimmed, not starred.
+                        const edge = row.reliable && Math.abs(row.mean_ic) >= IC_EDGE_THRESHOLD;
                        return (
-                          <tr key={row.signal} className={`border-b border-white/[0.04] ${edge ? 'bg-emerald-400/[0.06]' : ''}`}>
+                          <tr
+                            key={row.signal}
+                            className={`border-b border-white/[0.04] ${edge ? 'bg-emerald-400/[0.06]' : ''} ${row.reliable ? '' : 'opacity-40'}`}
+                            title={row.reliable ? undefined : `Only ${row.weeks} independent window(s) — not enough to trust`}
+                          >
                            <td className="px-4 py-2.5 font-medium text-gray-200">
                              {edge && <span className="mr-1 text-emerald-300">★</span>}
                              {SIGNAL_LABELS[row.signal] ?? row.signal}
@@ -232,6 +232,7 @@ export interface BacktestSignalEvalRow {
  ic_t_stat: number | null;
  ic_positive_pct: number;
  mean_quintile_spread: number | null;
+  reliable: boolean;
 }

 export interface BacktestReport {