signal-platform/.kiro/specs/rr-scanner-target-quality/design.md
Dennis Thiessen 181cfe6588
major update
2026-02-27 16:08:09 +01:00

# R:R Scanner Target Quality Bugfix Design
## Overview
The `scan_ticker` function in `app/services/rr_scanner_service.py` selects trade setup targets by iterating candidate S/R levels and picking the one with the highest R:R ratio. Because risk is fixed (ATR × multiplier), R:R is a monotonically increasing function of distance from entry price. This means the scanner always selects the most distant S/R level, producing unrealistic trade setups.
The fix replaces the `max(rr)` selection with a quality score that balances three factors: R:R ratio, S/R level strength (0–100), and proximity to current price. The quality score is computed as a weighted sum of normalized components, and the candidate with the highest quality score is selected as the target.
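The monotonicity claim can be checked with a few lines of arithmetic. The sketch below uses made-up long-side resistance levels and a hypothetical `rr_ratio` helper; neither is taken from `rr_scanner_service.py`.

```python
def rr_ratio(entry_price: float, target_price: float, risk: float) -> float:
    """Reward-to-risk for a long setup: distance to target over a fixed risk."""
    return (target_price - entry_price) / risk


entry_price = 100.0
risk = 3.0  # assumed ATR x multiplier, identical for every candidate
targets = [103.0, 109.0, 115.0]  # illustrative resistance levels, nearest first

ratios = [rr_ratio(entry_price, t, risk) for t in targets]

# With constant risk, R:R grows with distance, so max(rr) lands on the farthest level.
farthest = max(targets)
picked_by_max_rr = max(targets, key=lambda t: rr_ratio(entry_price, t, risk))
```

Because `risk` never changes between candidates, ranking by R:R is the same as ranking by distance from entry.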
## Glossary
- **Bug_Condition (C)**: Multiple candidate S/R levels exist in the target direction, and the current code selects the most distant one purely because it has the highest R:R ratio, ignoring strength and proximity
- **Property (P)**: The scanner should select the candidate with the highest quality score (a weighted combination of R:R ratio, strength, and proximity) rather than the highest raw R:R ratio
- **Preservation**: All behavior for single-candidate scenarios, no-candidate scenarios, R:R threshold filtering, database persistence, and `get_trade_setups` sorting must remain unchanged
- **scan_ticker**: The function in `app/services/rr_scanner_service.py` that scans a single ticker for long and short trade setups
- **SRLevel.strength**: An integer 0–100 representing how many times price has touched this level relative to total bars (computed by `sr_service._strength_from_touches`)
- **quality_score**: New scoring metric: `w_rr * norm_rr + w_strength * norm_strength + w_proximity * norm_proximity`
## Bug Details
### Fault Condition
The bug manifests when multiple S/R levels exist in the target direction (above entry for longs, below entry for shorts) and the scanner selects the most distant level because it has the highest R:R ratio, even though a closer, stronger level would be a more realistic target.
**Formal Specification:**
```
FUNCTION isBugCondition(input)
  INPUT: input of type {entry_price, risk, candidate_levels: list[{price_level, strength}]}
  OUTPUT: boolean

  candidates := [lv for lv in candidate_levels where reward(lv) / risk >= rr_threshold]
  IF len(candidates) < 2 THEN RETURN false

  max_rr_level := argmax(candidates, key=lambda lv: reward(lv) / risk)
  max_quality_level := argmax(candidates, key=lambda lv: quality_score(lv, entry_price, risk))
  RETURN max_rr_level != max_quality_level
END FUNCTION
```
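The specification above can be turned into a runnable predicate. This is a minimal sketch, not the project's code: `Level` is a hypothetical stand-in for an `SRLevel` row, `rr_threshold` is passed in explicitly because its configured value is not stated here, and the candidate levels are assumed to be already filtered to the target direction (so reward is the absolute distance from entry).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Level:
    """Hypothetical stand-in for an SRLevel row."""
    price_level: float
    strength: int  # 0-100


def quality_score(rr, strength, distance, entry_price,
                  w_rr=0.35, w_strength=0.35, w_proximity=0.30, rr_cap=10.0):
    """Weighted sum of normalized R:R, strength, and proximity (design defaults)."""
    norm_rr = min(rr / rr_cap, 1.0)
    norm_strength = strength / 100.0
    norm_proximity = 1.0 - min(distance / entry_price, 1.0)
    return w_rr * norm_rr + w_strength * norm_strength + w_proximity * norm_proximity


def is_bug_condition(entry_price, risk, levels, rr_threshold):
    """True when max-R:R selection and max-quality selection disagree."""
    def reward(lv):
        return abs(lv.price_level - entry_price)

    candidates = [lv for lv in levels if reward(lv) / risk >= rr_threshold]
    if len(candidates) < 2:
        return False
    max_rr_level = max(candidates, key=lambda lv: reward(lv) / risk)
    max_quality_level = max(
        candidates,
        key=lambda lv: quality_score(
            reward(lv) / risk, lv.strength, reward(lv), entry_price
        ),
    )
    return max_rr_level != max_quality_level
```

With entry 100, risk 3, and an assumed threshold of 1.0, the pair (103, strength 80) and (115, strength 10) satisfies the condition, while a lone candidate never does.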
### Examples
- **Long, 2 resistance levels**: Entry=100, ATR-stop=97 (risk=3). Level A: price=103, strength=80 (R:R=1.0). Level B: price=115, strength=10 (R:R=5.0). Current code picks B (highest R:R). Expected: picks A (strong, nearby, realistic).
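- **Long, 3 resistance levels**: Entry=50, risk=2. Level A: price=53, strength=90 (R:R=1.5). Level B: price=58, strength=40 (R:R=4.0). Level C: price=70, strength=5 (R:R=10.0). Current code picks C. Expected: picks A or B depending on quality weights; with the default weights (0.35 / 0.35 / 0.30, `rr_cap`=10), A scores highest at about 0.65, ahead of C (about 0.55) and B (about 0.53).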
- **Short, 2 support levels**: Entry=200, risk=5. Level A: price=192, strength=70 (R:R=1.6). Level B: price=170, strength=15 (R:R=6.0). Current code picks B. Expected: picks A.
- **Single candidate (no bug)**: Entry=100, risk=3. Only Level A: price=106, strength=50 (R:R=2.0). Both old and new code select A — no divergence.
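The expected picks in the multi-candidate examples can be reproduced by scoring each level with the default weights from this design. The sketch below is illustrative only: `quality_score` mirrors the proposed helper, while `examples`, `score_levels`, and `winners` are names introduced here for the demonstration.

```python
def quality_score(rr, strength, distance, entry_price,
                  w_rr=0.35, w_strength=0.35, w_proximity=0.30, rr_cap=10.0):
    """Weighted sum of normalized R:R, strength, and proximity (design defaults)."""
    norm_rr = min(rr / rr_cap, 1.0)
    norm_strength = strength / 100.0
    norm_proximity = 1.0 - min(distance / entry_price, 1.0)
    return w_rr * norm_rr + w_strength * norm_strength + w_proximity * norm_proximity


# (entry_price, risk, [(label, price_level, strength), ...]) copied from the examples
examples = {
    "long_2_levels": (100.0, 3.0, [("A", 103.0, 80), ("B", 115.0, 10)]),
    "long_3_levels": (50.0, 2.0, [("A", 53.0, 90), ("B", 58.0, 40), ("C", 70.0, 5)]),
    "short_2_levels": (200.0, 5.0, [("A", 192.0, 70), ("B", 170.0, 15)]),
}


def score_levels(entry_price, risk, levels):
    """Return {label: quality score} for levels on the target side of entry."""
    scores = {}
    for label, price_level, strength in levels:
        distance = abs(price_level - entry_price)
        scores[label] = quality_score(distance / risk, strength, distance, entry_price)
    return scores


winners = {}
for name, (entry_price, risk, levels) in examples.items():
    scores = score_levels(entry_price, risk, levels)
    winners[name] = max(scores, key=scores.get)
    print(name, {k: round(v, 4) for k, v in scores.items()}, "->", winners[name])
```

Under the default weights, Level A comes out on top in all three cases. Different weights could change the ranking, which is why the weights are keyword arguments rather than constants.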
## Expected Behavior
### Preservation Requirements
**Unchanged Behaviors:**
- When no S/R levels exist in the target direction, no setup is produced for that direction
- When no candidate level meets the R:R threshold, no setup is produced
- When only one S/R level exists in the target direction, it is evaluated against the R:R threshold and used if it qualifies
- `scan_all_tickers` processes each ticker independently; one failure does not stop others
- `get_trade_setups` returns results sorted by R:R ratio descending with composite score as secondary sort
- Database persistence: old setups are deleted and new ones inserted per ticker
- ATR computation, OHLCV fetching, and stop-loss calculation remain unchanged
- The TradeSetup model fields and their rounding (4 decimal places) remain unchanged
**Scope:**
All inputs where at most one candidate S/R level exists in the target direction are completely unaffected by this fix. The fix only changes the selection logic when multiple qualifying candidates exist.
## Hypothesized Root Cause
Based on the bug description, the root cause is straightforward:
1. **Selection by max R:R only**: The inner loop in `scan_ticker` tracks `best_rr` and `best_target`, selecting whichever level produces the highest `rr = reward / risk`. Since `risk` is constant (ATR-based), `rr` is proportional to distance. The code has no mechanism to factor in `SRLevel.strength` or proximity.
2. **No quality scoring exists**: The `SRLevel.strength` field (0–100) is available in the database and loaded by the query, but the selection loop never reads it. There is no quality score computation anywhere in the codebase.
3. **No proximity normalization**: Distance from entry is used only to compute reward, never as a penalty. Closer levels are always disadvantaged.
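To make the hypothesis concrete, here is a reconstruction of what a `best_rr` / `best_target` loop of this kind typically looks like for long setups. It is a guess at the shape of the unfixed code, not a copy of it: the function name, the tuple representation of levels, and the explicit `rr_threshold` parameter are all assumptions.

```python
def select_target_max_rr(entry_price, risk, levels, rr_threshold):
    """Hypothetical reconstruction of the unfixed long-side selection loop.

    levels is a list of (price_level, strength) tuples.
    """
    best_rr = 0.0
    best_target = None
    for price_level, strength in levels:
        if price_level <= entry_price:
            continue  # long setups only consider levels above entry
        rr = (price_level - entry_price) / risk
        # strength is available here but never consulted
        if rr >= rr_threshold and rr > best_rr:
            best_rr = rr
            best_target = price_level
    return best_target, best_rr
```

Called with entry 100, risk 3, and the levels (103, strength 80) and (115, strength 10), a loop of this shape returns 115 as the target, matching the behavior described in the first example.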
## Correctness Properties
Property 1: Fault Condition - Quality Score Selection Replaces Max R:R
_For any_ input where multiple candidate S/R levels exist in the target direction and meet the R:R threshold, the fixed `scan_ticker` function SHALL select the candidate with the highest quality score (weighted combination of normalized R:R, normalized strength, and normalized proximity) rather than the candidate with the highest raw R:R ratio.
**Validates: Requirements 2.1, 2.2, 2.3, 2.4**
Property 2: Preservation - Single/Zero Candidate Behavior Unchanged
_For any_ input where zero or one candidate S/R levels exist in the target direction, the fixed `scan_ticker` function SHALL produce the same result as the original function, preserving the existing filtering, persistence, and output behavior.
**Validates: Requirements 3.1, 3.2, 3.3, 3.4, 3.5**
## Fix Implementation
### Changes Required
Assuming our root cause analysis is correct:
**File**: `app/services/rr_scanner_service.py`
**Function**: `scan_ticker`
**Specific Changes**:
1. **Add `_compute_quality_score` helper function**: A new module-level function that computes the quality score for a candidate S/R level given entry price, risk, and configurable weights.
```python
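def _compute_quality_score(
    rr: float,
    strength: int,
    distance: float,
    entry_price: float,
    *,
    w_rr: float = 0.35,
    w_strength: float = 0.35,
    w_proximity: float = 0.30,
    rr_cap: float = 10.0,
) -> float:
    """Score a candidate S/R level; higher is better."""
    # R:R is capped at rr_cap so extreme ratios cannot dominate the score.
    norm_rr = min(rr / rr_cap, 1.0)
    # Strength is stored as an integer in 0-100.
    norm_strength = strength / 100.0
    # Closer levels score higher; distance is clamped at 100% of entry price.
    norm_proximity = 1.0 - min(distance / entry_price, 1.0)
    return w_rr * norm_rr + w_strength * norm_strength + w_proximity * norm_proximity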
```
- `norm_rr`: R:R capped at `rr_cap` (default 10) and divided by `rr_cap` to get a 0–1 range
- `norm_strength`: Strength divided by 100 (already a 0–100 integer)
- `norm_proximity`: `1 - min(distance / entry_price, 1.0)`, so closer levels score higher and the value stays in the 0–1 range
- Default weights: 0.35 R:R, 0.35 strength, 0.30 proximity (sum = 1.0)
2. **Replace long setup selection loop**: Instead of tracking `best_rr` / `best_target`, iterate candidates, compute quality score for each, and track `best_quality` / `best_candidate`. Still filter by `rr >= rr_threshold` before scoring. Store the selected level's R:R in the TradeSetup (not the quality score — R:R remains the reported metric).
3. **Replace short setup selection loop**: Same change as longs but for levels below entry.
4. **Pass `SRLevel` object through selection**: The loop already has access to `lv.strength` from the query. No additional DB queries needed.
5. **No changes to `get_trade_setups`**: Sorting by `rr_ratio` descending remains. The `rr_ratio` stored in TradeSetup is the actual R:R of the selected level, not the quality score.
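The selection changes in steps 2 to 4 can be sketched as a single direction-aware helper. This is one possible shape under stated assumptions, not the final implementation: `Level` stands in for `SRLevel`, `select_target` and its `direction` argument are hypothetical, and `rr_threshold` is passed explicitly.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Level:
    """Hypothetical stand-in for an SRLevel row."""
    price_level: float
    strength: int  # 0-100


def _compute_quality_score(rr, strength, distance, entry_price, *,
                           w_rr=0.35, w_strength=0.35, w_proximity=0.30,
                           rr_cap=10.0):
    norm_rr = min(rr / rr_cap, 1.0)
    norm_strength = strength / 100.0
    norm_proximity = 1.0 - min(distance / entry_price, 1.0)
    return w_rr * norm_rr + w_strength * norm_strength + w_proximity * norm_proximity


def select_target(entry_price, risk, levels, rr_threshold, direction):
    """Return (level, rr) for the best-quality qualifying level, or None.

    direction is "long" (levels above entry) or "short" (levels below entry).
    """
    best_quality = -1.0
    best_candidate = None
    for lv in levels:
        if direction == "long":
            distance = lv.price_level - entry_price
        else:
            distance = entry_price - lv.price_level
        if distance <= 0:
            continue  # level is on the wrong side of entry
        rr = distance / risk
        if rr < rr_threshold:
            continue  # threshold filter runs before scoring
        quality = _compute_quality_score(rr, lv.strength, distance, entry_price)
        if quality > best_quality:
            best_quality = quality
            best_candidate = (lv, rr)  # keep R:R, not quality, for reporting
    return best_candidate
```

Returning the level's R:R alongside the level keeps `rr_ratio` in `TradeSetup` meaningful for the existing `get_trade_setups` sort. On an exact tie in quality, this sketch keeps the first candidate encountered; the design does not specify tie-breaking.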
## Testing Strategy
### Validation Approach
The testing strategy follows a two-phase approach: first, surface counterexamples that demonstrate the bug on unfixed code, then verify the fix works correctly and preserves existing behavior.
### Exploratory Fault Condition Checking
**Goal**: Surface counterexamples that demonstrate the bug BEFORE implementing the fix. Confirm or refute the root cause analysis. If we refute, we will need to re-hypothesize.
**Test Plan**: Create mock scenarios with multiple S/R levels of varying strength and distance. Run `scan_ticker` on unfixed code and assert that the selected target is NOT the most distant level. These tests will fail on unfixed code, confirming the bug.
**Test Cases**:
1. **Long with strong-near vs weak-far**: Entry=100, risk=3. Near level (103, strength=80) vs far level (115, strength=10). Assert selected target != 115 (will fail on unfixed code)
2. **Short with strong-near vs weak-far**: Entry=200, risk=5. Near level (192, strength=70) vs far level (170, strength=15). Assert selected target != 170 (will fail on unfixed code)
3. **Three candidates with varying profiles**: Entry=50, risk=2. Three levels at different distances/strengths. Assert selection is not purely distance-based (will fail on unfixed code)
**Expected Counterexamples**:
- The unfixed code always selects the most distant level regardless of strength
- Root cause confirmed: selection loop only tracks `best_rr` which is proportional to distance
### Fix Checking
**Goal**: Verify that for all inputs where the bug condition holds, the fixed function produces the expected behavior.
**Pseudocode:**
```
FOR ALL input WHERE isBugCondition(input) DO
  result := scan_ticker_fixed(input)
  selected_level := result.target
  ASSERT selected_level == argmax(candidates, key=quality_score)
  ASSERT quality_score(selected_level) >= quality_score(any_other_candidate)
END FOR
```
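A randomized version of this check can be sketched with the standard library alone. The code below uses `random` as a lightweight stand-in for a property-based testing framework and exercises a hypothetical `select_by_quality` function rather than the real `scan_ticker`; the threshold of 1.5 and the input ranges are arbitrary assumptions. It checks every multi-level long-side input, which is a superset of the inputs where the bug condition holds.

```python
import random


def quality_score(rr, strength, distance, entry_price,
                  w_rr=0.35, w_strength=0.35, w_proximity=0.30, rr_cap=10.0):
    norm_rr = min(rr / rr_cap, 1.0)
    norm_strength = strength / 100.0
    norm_proximity = 1.0 - min(distance / entry_price, 1.0)
    return w_rr * norm_rr + w_strength * norm_strength + w_proximity * norm_proximity


def score(level, entry_price, risk):
    """Quality score of a (price_level, strength) tuple for a long setup."""
    price_level, strength = level
    distance = price_level - entry_price
    return quality_score(distance / risk, strength, distance, entry_price)


def select_by_quality(entry_price, risk, levels, rr_threshold):
    """Hypothetical stand-in for the fixed long-side selection."""
    best, best_quality = None, -1.0
    for level in levels:
        distance = level[0] - entry_price
        if distance <= 0 or distance / risk < rr_threshold:
            continue
        quality = score(level, entry_price, risk)
        if quality > best_quality:
            best, best_quality = level, quality
    return best


def check_fix(trials=500, seed=0, rr_threshold=1.5):
    """Assert the selected level has the top quality score among candidates."""
    rng = random.Random(seed)
    for _ in range(trials):
        entry_price = rng.uniform(10.0, 500.0)
        risk = rng.uniform(0.5, 10.0)
        levels = [
            (entry_price + rng.uniform(0.1, 100.0), rng.randint(0, 100))
            for _ in range(rng.randint(2, 6))
        ]
        candidates = [
            lv for lv in levels if (lv[0] - entry_price) / risk >= rr_threshold
        ]
        selected = select_by_quality(entry_price, risk, levels, rr_threshold)
        if not candidates:
            assert selected is None
            continue
        assert selected in candidates
        best = score(selected, entry_price, risk)
        assert all(best >= score(other, entry_price, risk) for other in candidates)
    return True
```

Seeding the generator keeps failures reproducible. A dedicated property-based library would add input shrinking on top of this.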
### Preservation Checking
**Goal**: Verify that for all inputs where the bug condition does NOT hold, the fixed function produces the same result as the original function.
**Pseudocode:**
```
FOR ALL input WHERE NOT isBugCondition(input) DO
  ASSERT scan_ticker_original(input) == scan_ticker_fixed(input)
END FOR
```
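The preservation property for the zero-candidate and single-candidate cases can be illustrated the same way. In this sketch, `select_max_rr` and `select_by_quality` are hypothetical stand-ins for the original and fixed long-side selection, `random` substitutes for a property-based framework, and the threshold of 1.5 is an assumed value.

```python
import random


def quality_score(rr, strength, distance, entry_price,
                  w_rr=0.35, w_strength=0.35, w_proximity=0.30, rr_cap=10.0):
    norm_rr = min(rr / rr_cap, 1.0)
    norm_strength = strength / 100.0
    norm_proximity = 1.0 - min(distance / entry_price, 1.0)
    return w_rr * norm_rr + w_strength * norm_strength + w_proximity * norm_proximity


def select_max_rr(entry_price, risk, levels, rr_threshold):
    """Hypothetical stand-in for the original selection: highest R:R wins."""
    best, best_rr = None, 0.0
    for price_level, _strength in levels:
        rr = (price_level - entry_price) / risk
        if rr >= rr_threshold and rr > best_rr:
            best, best_rr = price_level, rr
    return best


def select_by_quality(entry_price, risk, levels, rr_threshold):
    """Hypothetical stand-in for the fixed selection: highest quality wins."""
    best, best_quality = None, -1.0
    for price_level, strength in levels:
        distance = price_level - entry_price
        rr = distance / risk
        if rr < rr_threshold:
            continue
        quality = quality_score(rr, strength, distance, entry_price)
        if quality > best_quality:
            best, best_quality = price_level, quality
    return best


def check_preservation(trials=500, seed=0, rr_threshold=1.5):
    """With zero or one level, both selectors must return the same target."""
    rng = random.Random(seed)
    for _ in range(trials):
        entry_price = rng.uniform(10.0, 500.0)
        risk = rng.uniform(0.5, 10.0)
        levels = [
            (entry_price + rng.uniform(0.1, 100.0), rng.randint(0, 100))
            for _ in range(rng.randint(0, 1))  # zero or one level only
        ]
        old = select_max_rr(entry_price, risk, levels, rr_threshold)
        new = select_by_quality(entry_price, risk, levels, rr_threshold)
        assert old == new
    return True
```

The generated single level falls on either side of the threshold, so the check covers both the "qualifies and is used" and the "filtered out, no setup" outcomes.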
**Testing Approach**: Property-based testing is recommended for preservation checking because:
- It generates many test cases automatically across the input domain
- It catches edge cases that manual unit tests might miss
- It provides strong guarantees that behavior is unchanged for all non-buggy inputs
**Test Plan**: Observe behavior on UNFIXED code first for zero-candidate and single-candidate scenarios, then write property-based tests capturing that behavior.
**Test Cases**:
1. **Zero candidates preservation**: Generate random tickers with no S/R levels in target direction. Verify no setup is produced (same as original).
2. **Single candidate preservation**: Generate random tickers with exactly one qualifying S/R level. Verify same setup is produced as original.
3. **Below-threshold preservation**: Generate random tickers where all candidates have R:R below threshold. Verify no setup is produced.
4. **Database persistence preservation**: Verify old setups are deleted and new ones inserted identically.
### Unit Tests
- Test `_compute_quality_score` with known inputs and verify output matches expected formula
- Test that quality score components are properly normalized to the 0–1 range
- Test that `rr_cap` correctly caps the R:R normalization
- Test edge cases: strength=0, strength=100, distance=0, single candidate
### Property-Based Tests
- Generate random sets of S/R levels with varying strengths and distances; verify the selected target always has the highest quality score among candidates
- Generate random single-candidate scenarios; verify output matches what the original function would produce
- Generate random inputs with all candidates below R:R threshold; verify no setup is produced
### Integration Tests
- Test full `scan_ticker` flow with mocked DB containing multiple S/R levels of varying quality
- Test `scan_all_tickers` still processes each ticker independently
- Test that `get_trade_setups` returns correct sorting after fix