major update
1
.kiro/specs/rr-scanner-target-quality/.config.kiro
Normal file
@@ -0,0 +1 @@
{"specId": "997fa90b-08bc-4b72-b099-ecc0ad611b06", "workflowType": "requirements-first", "specType": "bugfix"}
39
.kiro/specs/rr-scanner-target-quality/bugfix.md
Normal file
@@ -0,0 +1,39 @@
# Bugfix Requirements Document

## Introduction

The R:R scanner's `scan_ticker` function selects trade setup targets by picking whichever S/R level yields the highest R:R ratio. Because R:R = reward / risk and risk is fixed (ATR-based stop), this always favors the most distant S/R level. The result is unrealistic trade setups targeting far-away levels that price is unlikely to reach. The scanner should instead select the highest-quality target by balancing R:R ratio with level strength and proximity to current price.

## Bug Analysis

### Current Behavior (Defect)

1.1 WHEN scanning for long setups THEN the system iterates all resistance levels above entry price and selects the one with the maximum R:R ratio, which is always the most distant level since risk is fixed

1.2 WHEN scanning for short setups THEN the system iterates all support levels below entry price and selects the one with the maximum R:R ratio, which is always the most distant level since risk is fixed

1.3 WHEN multiple S/R levels exist at varying distances with different strength values THEN the system ignores the `strength` field entirely and selects based solely on R:R magnitude

1.4 WHEN a weak, distant S/R level exists alongside a strong, nearby S/R level THEN the system selects the weak distant level because it produces a higher R:R ratio, resulting in an unrealistic trade setup

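The defect described in 1.1 to 1.4 can be reproduced in isolation. The sketch below is not the code from `scan_ticker`: `Level` and `pick_max_rr` are hypothetical names, the prices are illustrative, and the R:R threshold of 1.0 is an assumed value. It only shows that, with risk held constant, maximizing R:R is the same as maximizing distance.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Level:
    """Hypothetical stand-in for a stored S/R level."""
    price_level: float
    strength: int  # 0-100, never read by the defective selection


def pick_max_rr(entry_price, risk, levels, rr_threshold=1.0):
    """Mimic the defective long-side selection: keep the highest R:R seen."""
    best_rr, best_target = 0.0, None
    for lv in levels:
        if lv.price_level <= entry_price:
            continue  # only resistance above entry is a long target
        rr = (lv.price_level - entry_price) / risk
        if rr >= rr_threshold and rr > best_rr:
            best_rr, best_target = rr, lv
    return best_target, best_rr


near_strong = Level(price_level=103.0, strength=80)
far_weak = Level(price_level=115.0, strength=10)
target, rr = pick_max_rr(100.0, 3.0, [near_strong, far_weak])
print(target.price_level, rr)
```

The weak level at 115 is returned even though its strength is 10, because `strength` never enters the comparison.
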
### Expected Behavior (Correct)

2.1 WHEN scanning for long setups THEN the system SHALL compute a quality score for each candidate resistance level that factors in R:R ratio, S/R level strength, and proximity to entry price, and select the level with the highest quality score

2.2 WHEN scanning for short setups THEN the system SHALL compute a quality score for each candidate support level that factors in R:R ratio, S/R level strength, and proximity to entry price, and select the level with the highest quality score

2.3 WHEN multiple S/R levels exist at varying distances with different strength values THEN the system SHALL weight stronger levels higher in the quality score, favoring targets that price is more likely to reach

2.4 WHEN a weak, distant S/R level exists alongside a strong, nearby S/R level THEN the system SHALL prefer the strong nearby level unless the distant level's combined quality score (considering its lower proximity and strength factors) still exceeds the nearby level's score

### Unchanged Behavior (Regression Prevention)

3.1 WHEN no S/R levels exist above entry price for longs (or below for shorts) THEN the system SHALL CONTINUE TO produce no setup for that direction

3.2 WHEN no candidate level meets the R:R threshold THEN the system SHALL CONTINUE TO produce no setup for that direction

3.3 WHEN only one S/R level exists in the target direction THEN the system SHALL CONTINUE TO evaluate it against the R:R threshold and produce a setup if it qualifies

3.4 WHEN scanning all tickers THEN the system SHALL CONTINUE TO process each ticker independently and persist results to the database

3.5 WHEN fetching stored trade setups THEN the system SHALL CONTINUE TO return them sorted by R:R ratio descending with composite score as secondary sort
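
The ordering in 3.5 can be illustrated on in-memory records. This is a minimal sketch only: the dictionary layout, the `composite_score` key, and the ticker names are assumptions, and the secondary sort is assumed to be descending. The real `get_trade_setups` presumably sorts in the database query rather than in Python.

```python
# Hypothetical stored setups; only rr_ratio and composite_score matter here.
setups = [
    {"ticker": "AAA", "rr_ratio": 2.0, "composite_score": 55.0},
    {"ticker": "BBB", "rr_ratio": 3.5, "composite_score": 40.0},
    {"ticker": "CCC", "rr_ratio": 2.0, "composite_score": 70.0},
]

# Primary key: R:R descending. Secondary key: composite score descending.
ordered = sorted(setups, key=lambda s: (-s["rr_ratio"], -s["composite_score"]))
order = [s["ticker"] for s in ordered]
print(order)
```

Because the stored `rr_ratio` stays the actual R:R of the selected level, this ordering is unaffected by how the target was chosen.
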
209
.kiro/specs/rr-scanner-target-quality/design.md
Normal file
@@ -0,0 +1,209 @@
# R:R Scanner Target Quality Bugfix Design

## Overview

The `scan_ticker` function in `app/services/rr_scanner_service.py` selects trade setup targets by iterating candidate S/R levels and picking the one with the highest R:R ratio. Because risk is fixed (ATR × multiplier), R:R is a monotonically increasing function of distance from entry price. This means the scanner always selects the most distant S/R level, producing unrealistic trade setups.

The fix replaces the `max(rr)` selection with a quality score that balances three factors: R:R ratio, S/R level strength (0–100), and proximity to current price. The quality score is computed as a weighted sum of normalized components, and the candidate with the highest quality score is selected as the target.

## Glossary

- **Bug_Condition (C)**: Multiple candidate S/R levels exist in the target direction, and the current code selects the most distant one purely because it has the highest R:R ratio, ignoring strength and proximity
- **Property (P)**: The scanner should select the candidate with the highest quality score (a weighted combination of R:R ratio, strength, and proximity) rather than the highest raw R:R ratio
- **Preservation**: All behavior for single-candidate scenarios, no-candidate scenarios, R:R threshold filtering, database persistence, and `get_trade_setups` sorting must remain unchanged
- **scan_ticker**: The function in `app/services/rr_scanner_service.py` that scans a single ticker for long and short trade setups
- **SRLevel.strength**: An integer 0–100 representing how many times price has touched this level relative to total bars (computed by `sr_service._strength_from_touches`)
- **quality_score**: New scoring metric: `w_rr * norm_rr + w_strength * norm_strength + w_proximity * norm_proximity`

## Bug Details

### Fault Condition

The bug manifests when multiple S/R levels exist in the target direction (above entry for longs, below entry for shorts) and the scanner selects the most distant level because it has the highest R:R ratio, even though a closer, stronger level would be a more realistic target.

**Formal Specification:**
```
FUNCTION isBugCondition(input)
  INPUT: input of type {entry_price, risk, candidate_levels: list[{price_level, strength}]}
  OUTPUT: boolean

  candidates := [lv for lv in candidate_levels where reward(lv) / risk >= rr_threshold]
  IF len(candidates) < 2 THEN RETURN false

  max_rr_level := argmax(candidates, key=lambda lv: reward(lv) / risk)
  max_quality_level := argmax(candidates, key=lambda lv: quality_score(lv, entry_price, risk))

  RETURN max_rr_level != max_quality_level
END FUNCTION
```

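The pseudocode above can be turned into a runnable predicate. The following is a sketch for the long side only, under stated assumptions: levels are plain `(price_level, strength)` tuples, the threshold default of 1.0 is a placeholder, and `quality_score` uses the normalization and default weights proposed for the fix. The name `is_bug_condition` is hypothetical.

```python
def quality_score(rr, strength, distance, entry_price,
                  w_rr=0.35, w_strength=0.35, w_proximity=0.30, rr_cap=10.0):
    """Weighted sum of normalized R:R, strength, and proximity."""
    norm_rr = min(rr / rr_cap, 1.0)
    norm_strength = strength / 100.0
    norm_proximity = 1.0 - min(distance / entry_price, 1.0)
    return w_rr * norm_rr + w_strength * norm_strength + w_proximity * norm_proximity


def is_bug_condition(entry_price, risk, candidate_levels, rr_threshold=1.0):
    """True when max-R:R and max-quality selection disagree (long side)."""
    scored = []  # (rr, quality, price_level) per qualifying candidate
    for price_level, strength in candidate_levels:
        distance = price_level - entry_price
        if distance <= 0:
            continue  # not a long target
        rr = distance / risk
        if rr >= rr_threshold:
            quality = quality_score(rr, strength, distance, entry_price)
            scored.append((rr, quality, price_level))
    if len(scored) < 2:
        return False
    max_rr_level = max(scored, key=lambda c: c[0])[2]
    max_quality_level = max(scored, key=lambda c: c[1])[2]
    return max_rr_level != max_quality_level


print(is_bug_condition(100.0, 3.0, [(103.0, 80), (115.0, 10)]))
print(is_bug_condition(100.0, 3.0, [(106.0, 50)]))
```

Note that the predicate is false whenever the most distant level also happens to have the best quality score, so not every multi-candidate input is a bug case.
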
### Examples

- **Long, 2 resistance levels**: Entry=100, ATR-stop=97 (risk=3). Level A: price=103, strength=80 (R:R=1.0). Level B: price=115, strength=10 (R:R=5.0). Current code picks B (highest R:R). Expected: picks A (strong, nearby, realistic).
- **Long, 3 resistance levels**: Entry=50, risk=2. Level A: price=53, strength=90 (R:R=1.5). Level B: price=58, strength=40 (R:R=4.0). Level C: price=70, strength=5 (R:R=10.0). Current code picks C. Expected: picks A or B depending on quality weights.
- **Short, 2 support levels**: Entry=200, risk=5. Level A: price=192, strength=70 (R:R=1.6). Level B: price=170, strength=15 (R:R=6.0). Current code picks B. Expected: picks A.
- **Single candidate (no bug)**: Entry=100, risk=3. Only Level A: price=106, strength=50 (R:R=2.0). Both old and new code select A, so there is no divergence.

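To make the expected picks concrete, the multi-candidate examples can be scored numerically. This is a worked sketch, assuming the weighted-sum formula from the glossary with weights 0.35 / 0.35 / 0.30 and an R:R cap of 10; `score_levels` is a hypothetical helper, and R:R threshold filtering is deliberately left out because the threshold value is not stated here.

```python
def quality_score(rr, strength, distance, entry_price,
                  w_rr=0.35, w_strength=0.35, w_proximity=0.30, rr_cap=10.0):
    """Weighted sum of normalized R:R, strength, and proximity."""
    norm_rr = min(rr / rr_cap, 1.0)
    norm_strength = strength / 100.0
    norm_proximity = 1.0 - min(distance / entry_price, 1.0)
    return w_rr * norm_rr + w_strength * norm_strength + w_proximity * norm_proximity


def score_levels(entry_price, risk, levels):
    """Return {label: score} for (label, price_level, strength) tuples."""
    scores = {}
    for label, price_level, strength in levels:
        distance = abs(price_level - entry_price)  # works for longs and shorts
        rr = distance / risk
        scores[label] = round(quality_score(rr, strength, distance, entry_price), 4)
    return scores


long_two = score_levels(100.0, 3.0, [("A", 103.0, 80), ("B", 115.0, 10)])
long_three = score_levels(50.0, 2.0, [("A", 53.0, 90), ("B", 58.0, 40), ("C", 70.0, 5)])
short_two = score_levels(200.0, 5.0, [("A", 192.0, 70), ("B", 170.0, 15)])

for name, scores in [("long_two", long_two), ("long_three", long_three), ("short_two", short_two)]:
    print(name, scores, "->", max(scores, key=scores.get))
```

Under these assumed weights, Level A scores highest in all three cases (roughly 0.606 vs 0.465, 0.650 vs 0.532 vs 0.548, and 0.589 vs 0.518). In the three-level case, C narrowly outranks B because its capped R:R term offsets its low strength. If the configured R:R threshold is above 1.0, Level A in the first example would be filtered out before scoring.
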
## Expected Behavior

### Preservation Requirements

**Unchanged Behaviors:**
- When no S/R levels exist in the target direction, no setup is produced for that direction
- When no candidate level meets the R:R threshold, no setup is produced
- When only one S/R level exists in the target direction, it is evaluated against the R:R threshold and used if it qualifies
- `scan_all_tickers` processes each ticker independently; one failure does not stop others
- `get_trade_setups` returns results sorted by R:R ratio descending with composite score as secondary sort
- Database persistence: old setups are deleted and new ones inserted per ticker
- ATR computation, OHLCV fetching, and stop-loss calculation remain unchanged
- The TradeSetup model fields and their rounding (4 decimal places) remain unchanged

**Scope:**
All inputs where zero or one candidate S/R level exists in the target direction are completely unaffected by this fix. The fix only changes the selection logic when multiple qualifying candidates exist.

## Hypothesized Root Cause

Based on the bug description, the root cause is straightforward:

1. **Selection by max R:R only**: The inner loop in `scan_ticker` tracks `best_rr` and `best_target`, selecting whichever level produces the highest `rr = reward / risk`. Since `risk` is constant (ATR-based), `rr` is proportional to distance. The code has no mechanism to factor in `SRLevel.strength` or proximity.

2. **No quality scoring exists**: The `SRLevel.strength` field (0–100) is available in the database and loaded by the query, but the selection loop never reads it. There is no quality score computation anywhere in the codebase.

3. **No proximity normalization**: Distance from entry is used only to compute reward, never as a penalty. Closer levels are always disadvantaged.

## Correctness Properties

Property 1: Fault Condition - Quality Score Selection Replaces Max R:R

_For any_ input where multiple candidate S/R levels exist in the target direction and meet the R:R threshold, the fixed `scan_ticker` function SHALL select the candidate with the highest quality score (weighted combination of normalized R:R, normalized strength, and normalized proximity) rather than the candidate with the highest raw R:R ratio.

**Validates: Requirements 2.1, 2.2, 2.3, 2.4**

Property 2: Preservation - Single/Zero Candidate Behavior Unchanged

_For any_ input where zero or one candidate S/R level exists in the target direction, the fixed `scan_ticker` function SHALL produce the same result as the original function, preserving the existing filtering, persistence, and output behavior.

**Validates: Requirements 3.1, 3.2, 3.3, 3.4, 3.5**

## Fix Implementation

### Changes Required

Assuming our root cause analysis is correct:

**File**: `app/services/rr_scanner_service.py`

**Function**: `scan_ticker`

**Specific Changes**:

1. **Add `_compute_quality_score` helper function**: A new module-level function that computes the quality score for a candidate S/R level given entry price, risk, and configurable weights.

   ```python
   def _compute_quality_score(
       rr: float,
       strength: int,
       distance: float,
       entry_price: float,
       *,
       w_rr: float = 0.35,
       w_strength: float = 0.35,
       w_proximity: float = 0.30,
       rr_cap: float = 10.0,
   ) -> float:
       """Weighted sum of normalized R:R, strength, and proximity."""
       # R:R relative to the cap, clamped to at most 1.0
       norm_rr = min(rr / rr_cap, 1.0)
       # Strength is a 0-100 integer
       norm_strength = strength / 100.0
       # Distance as a fraction of entry price; closer levels score higher
       norm_proximity = 1.0 - min(distance / entry_price, 1.0)
       return w_rr * norm_rr + w_strength * norm_strength + w_proximity * norm_proximity
   ```

   - `norm_rr`: R:R divided by `rr_cap` (default 10) and capped at 1.0, giving a 0–1 range
   - `norm_strength`: Strength divided by 100 (strength is already a 0–100 integer)
   - `norm_proximity`: `1 - min(distance / entry_price, 1.0)`, so closer levels score higher
   - Default weights: 0.35 R:R, 0.35 strength, 0.30 proximity (sum = 1.0)

2. **Replace long setup selection loop**: Instead of tracking `best_rr` / `best_target`, iterate candidates, compute quality score for each, and track `best_quality` / `best_candidate`. Still filter by `rr >= rr_threshold` before scoring. Store the selected level's R:R in the TradeSetup (not the quality score; R:R remains the reported metric).

3. **Replace short setup selection loop**: Same change as longs but for levels below entry.

4. **Pass `SRLevel` object through selection**: The loop already has access to `lv.strength` from the query. No additional DB queries needed.

5. **No changes to `get_trade_setups`**: Sorting by `rr_ratio` descending remains. The `rr_ratio` stored in TradeSetup is the actual R:R of the selected level, not the quality score.

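Changes 2 to 4 can be sketched as one selection routine shared by both directions. This is an illustration under assumptions, not the actual `scan_ticker` body: `Level`, `select_target`, the `direction` argument, and the threshold default of 1.0 are hypothetical, and database access is omitted entirely.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Level:
    """Hypothetical stand-in for an SRLevel row."""
    price_level: float
    strength: int  # 0-100


def quality_score(rr, strength, distance, entry_price,
                  w_rr=0.35, w_strength=0.35, w_proximity=0.30, rr_cap=10.0):
    norm_rr = min(rr / rr_cap, 1.0)
    norm_strength = strength / 100.0
    norm_proximity = 1.0 - min(distance / entry_price, 1.0)
    return w_rr * norm_rr + w_strength * norm_strength + w_proximity * norm_proximity


def select_target(entry_price, risk, levels, direction, rr_threshold=1.0):
    """Return (level, rr) for the best-quality candidate, or None."""
    best_quality, best = -1.0, None
    for lv in levels:
        # Longs target levels above entry, shorts target levels below it.
        if direction == "long":
            distance = lv.price_level - entry_price
        else:
            distance = entry_price - lv.price_level
        if distance <= 0:
            continue
        rr = distance / risk
        if rr < rr_threshold:
            continue  # threshold filter is applied before scoring
        quality = quality_score(rr, lv.strength, distance, entry_price)
        if quality > best_quality:
            # Keep the actual R:R (4 decimals), not the quality score.
            best_quality, best = quality, (lv, round(rr, 4))
    return best


longs = [Level(103.0, 80), Level(115.0, 10)]
shorts = [Level(192.0, 70), Level(170.0, 15)]
print(select_target(100.0, 3.0, longs, "long"))
print(select_target(200.0, 5.0, shorts, "short"))
```

Returning the raw R:R alongside the level keeps the reported metric separate from the ranking metric, which is what lets the existing `rr_ratio` sort stay untouched.
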
## Testing Strategy

### Validation Approach

The testing strategy follows a two-phase approach: first, surface counterexamples that demonstrate the bug on unfixed code, then verify the fix works correctly and preserves existing behavior.

### Exploratory Fault Condition Checking

**Goal**: Surface counterexamples that demonstrate the bug BEFORE implementing the fix. Confirm or refute the root cause analysis. If we refute, we will need to re-hypothesize.

**Test Plan**: Create mock scenarios with multiple S/R levels of varying strength and distance. Run `scan_ticker` on unfixed code and assert that the selected target is NOT the most distant level. These tests will fail on unfixed code, confirming the bug.

**Test Cases**:
1. **Long with strong-near vs weak-far**: Entry=100, risk=3. Near level (103, strength=80) vs far level (115, strength=10). Assert selected target != 115 (will fail on unfixed code)
2. **Short with strong-near vs weak-far**: Entry=200, risk=5. Near level (192, strength=70) vs far level (170, strength=15). Assert selected target != 170 (will fail on unfixed code)
3. **Three candidates with varying profiles**: Entry=50, risk=2. Three levels at different distances/strengths. Assert selection is not purely distance-based (will fail on unfixed code)

**Expected Counterexamples**:
- The unfixed code always selects the most distant level regardless of strength
- Root cause confirmed: selection loop only tracks `best_rr` which is proportional to distance

### Fix Checking

**Goal**: Verify that for all inputs where the bug condition holds, the fixed function produces the expected behavior.

**Pseudocode:**
```
FOR ALL input WHERE isBugCondition(input) DO
  result := scan_ticker_fixed(input)
  selected_level := result.target
  ASSERT selected_level == argmax(candidates, key=quality_score)
  FOR ALL other IN candidates DO
    ASSERT quality_score(selected_level) >= quality_score(other)
  END FOR
END FOR
```

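The fix-checking property can be exercised with randomized inputs. The sketch below uses only the standard library's `random` module as a stand-in for a property-based testing framework; `select_by_quality` is a hypothetical long-side model of the fixed selection loop rather than the real `scan_ticker`, and the input ranges and threshold of 1.0 are arbitrary assumptions.

```python
import random


def quality_score(rr, strength, distance, entry_price,
                  w_rr=0.35, w_strength=0.35, w_proximity=0.30, rr_cap=10.0):
    norm_rr = min(rr / rr_cap, 1.0)
    norm_strength = strength / 100.0
    norm_proximity = 1.0 - min(distance / entry_price, 1.0)
    return w_rr * norm_rr + w_strength * norm_strength + w_proximity * norm_proximity


def select_by_quality(entry, risk, levels, rr_threshold=1.0):
    """Loop-based model of the fixed selection; levels are (price, strength)."""
    best_quality, best = -1.0, None
    for price, strength in levels:
        distance = price - entry
        rr = distance / risk
        if distance <= 0 or rr < rr_threshold:
            continue
        quality = quality_score(rr, strength, distance, entry)
        if quality > best_quality:
            best_quality, best = quality, (price, strength)
    return best


def check_fix_property(trials=500, seed=0, rr_threshold=1.0):
    """Selected level must have the maximum quality score among candidates."""
    rng = random.Random(seed)

    def score(entry, risk, lv):
        return quality_score((lv[0] - entry) / risk, lv[1], lv[0] - entry, entry)

    for _ in range(trials):
        entry = rng.uniform(5.0, 500.0)
        risk = rng.uniform(0.1, 0.05 * entry)
        levels = [(entry * (1.0 + rng.uniform(0.01, 0.5)), rng.randint(0, 100))
                  for _ in range(rng.randint(2, 6))]
        qualifying = [lv for lv in levels if (lv[0] - entry) / risk >= rr_threshold]
        chosen = select_by_quality(entry, risk, levels, rr_threshold)
        if not qualifying:
            assert chosen is None
            continue
        assert score(entry, risk, chosen) == max(score(entry, risk, lv) for lv in qualifying)
    return trials


print(check_fix_property())
```

A dedicated property-based framework would additionally shrink failing inputs to a minimal counterexample, which this stand-in does not attempt.
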
### Preservation Checking

**Goal**: Verify that for all inputs where the bug condition does NOT hold, the fixed function produces the same result as the original function.

**Pseudocode:**
```
FOR ALL input WHERE NOT isBugCondition(input) DO
  ASSERT scan_ticker_original(input) == scan_ticker_fixed(input)
END FOR
```

**Testing Approach**: Property-based testing is recommended for preservation checking because:
- It generates many test cases automatically across the input domain
- It catches edge cases that manual unit tests might miss
- It provides strong evidence that behavior is unchanged for all non-buggy inputs

**Test Plan**: Observe behavior on UNFIXED code first for zero-candidate and single-candidate scenarios, then write property-based tests capturing that behavior.

**Test Cases**:
1. **Zero candidates preservation**: Generate random tickers with no S/R levels in target direction. Verify no setup is produced (same as original).
2. **Single candidate preservation**: Generate random tickers with exactly one qualifying S/R level. Verify same setup is produced as original.
3. **Below-threshold preservation**: Generate random tickers where all candidates have R:R below threshold. Verify no setup is produced.
4. **Database persistence preservation**: Verify old setups are deleted and new ones inserted identically.

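Preservation for zero-candidate and single-candidate inputs can be checked by running an old-style and a new-style selector side by side. This is a sketch under assumptions: `select_original` and `select_fixed` are simplified long-side models, not the real functions, and the standard library's `random` module stands in for a property-based framework.

```python
import random


def quality_score(rr, strength, distance, entry_price,
                  w_rr=0.35, w_strength=0.35, w_proximity=0.30, rr_cap=10.0):
    norm_rr = min(rr / rr_cap, 1.0)
    norm_strength = strength / 100.0
    norm_proximity = 1.0 - min(distance / entry_price, 1.0)
    return w_rr * norm_rr + w_strength * norm_strength + w_proximity * norm_proximity


def qualifying(entry, risk, levels, rr_threshold):
    """Levels above entry whose R:R meets the threshold."""
    return [(p, s) for p, s in levels
            if p > entry and (p - entry) / risk >= rr_threshold]


def select_original(entry, risk, levels, rr_threshold=1.0):
    """Model of the old behavior: highest R:R wins."""
    cands = qualifying(entry, risk, levels, rr_threshold)
    return max(cands, key=lambda lv: (lv[0] - entry) / risk, default=None)


def select_fixed(entry, risk, levels, rr_threshold=1.0):
    """Model of the new behavior: highest quality score wins."""
    cands = qualifying(entry, risk, levels, rr_threshold)
    return max(
        cands,
        key=lambda lv: quality_score((lv[0] - entry) / risk, lv[1], lv[0] - entry, entry),
        default=None,
    )


def check_preservation(trials=500, seed=1):
    """With zero or one level, both selectors must return the same result."""
    rng = random.Random(seed)
    for _ in range(trials):
        entry = rng.uniform(5.0, 500.0)
        risk = rng.uniform(0.1, 0.05 * entry)
        count = rng.choice([0, 1])
        levels = [(entry * (1.0 + rng.uniform(0.001, 0.5)), rng.randint(0, 100))
                  for _ in range(count)]
        assert select_original(entry, risk, levels) == select_fixed(entry, risk, levels)
    return trials


print(check_preservation())
```

The generated single levels sometimes fall below the threshold, so the same loop also covers the below-threshold case, where both selectors return `None`.
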
### Unit Tests

- Test `_compute_quality_score` with known inputs and verify output matches expected formula
- Test that quality score components are properly normalized to 0–1 range
- Test that `rr_cap` correctly caps the R:R normalization
- Test edge cases: strength=0, strength=100, distance=0, single candidate

### Property-Based Tests

- Generate random sets of S/R levels with varying strengths and distances; verify the selected target always has the highest quality score among candidates
- Generate random single-candidate scenarios; verify output matches what the original function would produce
- Generate random inputs with all candidates below R:R threshold; verify no setup is produced

### Integration Tests

- Test full `scan_ticker` flow with mocked DB containing multiple S/R levels of varying quality
- Test `scan_all_tickers` still processes each ticker independently
- Test that `get_trade_setups` returns correct sorting after fix
35
.kiro/specs/rr-scanner-target-quality/tasks.md
Normal file
@@ -0,0 +1,35 @@
# Tasks

## 1. Add quality score helper function
- [x] 1.1 Create `_compute_quality_score(rr, strength, distance, entry_price, *, w_rr=0.35, w_strength=0.35, w_proximity=0.30, rr_cap=10.0) -> float` function in `app/services/rr_scanner_service.py` that computes a weighted sum of normalized R:R, normalized strength, and normalized proximity
- [x] 1.2 Implement normalization: `norm_rr = min(rr / rr_cap, 1.0)`, `norm_strength = strength / 100.0`, `norm_proximity = 1.0 - min(distance / entry_price, 1.0)`
- [x] 1.3 Return `w_rr * norm_rr + w_strength * norm_strength + w_proximity * norm_proximity`

## 2. Replace long setup selection logic
- [x] 2.1 In `scan_ticker`, replace the long setup loop that tracks `best_rr` / `best_target` with a loop that computes `quality_score` for each candidate via `_compute_quality_score` and tracks `best_quality` / `best_candidate_rr` / `best_candidate_target`
- [x] 2.2 Keep the `rr >= rr_threshold` filter: only candidates meeting the threshold are scored
- [x] 2.3 Store the selected candidate's actual R:R ratio (not the quality score) in `TradeSetup.rr_ratio`

## 3. Replace short setup selection logic
- [x] 3.1 Apply the same quality-score selection change to the short setup loop, mirroring the long setup changes
- [x] 3.2 Ensure distance is computed as `entry_price - lv.price_level` for short candidates

## 4. Write unit tests for `_compute_quality_score`
- [x] 4.1 Create `tests/unit/test_rr_scanner_quality_score.py` with tests for known inputs verifying the formula output
- [x] 4.2 Test edge cases: strength=0, strength=100, distance=0, rr at cap, rr above cap
- [x] 4.3 Test that all normalized components stay in 0–1 range

## 5. Write exploratory bug-condition tests (run on unfixed code to confirm bug)
- [x] 5.1 [PBT-exploration] Create `tests/unit/test_rr_scanner_bug_exploration.py` with a property test that generates multiple S/R levels with varying strengths and distances, calls `scan_ticker`, and asserts the selected target is NOT always the most distant level (expected to FAIL on unfixed code, confirming the bug)

## 6. Write fix-checking tests
- [x] 6.1 [PBT-fix] Create `tests/unit/test_rr_scanner_fix_check.py` with a property test that generates multiple candidate S/R levels meeting the R:R threshold, calls `scan_ticker` on fixed code, and asserts the selected target has the highest quality score among all candidates

## 7. Write preservation tests
- [x] 7.1 [PBT-preservation] Create `tests/unit/test_rr_scanner_preservation.py` with a property test that generates zero-candidate and single-candidate scenarios and asserts the fixed function produces the same output as the original (no setup for zero candidates, same setup for single candidate)
- [x] 7.2 Add unit test verifying that when no S/R levels exist, no setup is produced (unchanged)
- [x] 7.3 Add unit test verifying that when only one candidate meets threshold, it is selected (unchanged)
- [x] 7.4 Add unit test verifying `get_trade_setups` sorting is unchanged (R:R desc, composite desc)

## 8. Integration test
- [x] 8.1 Add integration test in `tests/unit/test_rr_scanner_integration.py` that mocks DB with multiple S/R levels of varying quality, runs `scan_ticker`, and verifies the full flow: quality-based selection, correct TradeSetup fields, database persistence