fix scheduler misfire: daily jobs silently skipped on a busy event loop
AsyncIOScheduler was constructed with no job_defaults, so APScheduler's default misfire_grace_time of 1s applied. In this single-process app the scheduler shares one event loop with the API and all other jobs, so when a daily job came due while the loop was busy (e.g. the scanner mid-run), the fire was processed more than 1s late, flagged as a misfire, and skipped. next_run still advanced 24h, so the job looked healthy even though it never ran.

Set a generous grace window (1h), coalesce missed runs into a single catch-up, and cap each job at one concurrent instance.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
+16
-2
@@ -41,8 +41,22 @@ from app.services.ticker_universe_service import bootstrap_universe
 
 logger = logging.getLogger(__name__)
 
-# Module-level scheduler instance
-scheduler = AsyncIOScheduler()
+# Module-level scheduler instance.
+#
+# job_defaults matter a lot here: this is a single-process app, so the scheduler
+# shares one event loop with the API and every other job. APScheduler's default
+# misfire_grace_time is just 1 second; if the loop is busy at the instant a
+# daily job is due (e.g. the scanner is mid-run), the fire is processed late,
+# flagged a misfire, and SILENTLY SKIPPED while next_run still advances 24h. So
+# we grant a generous grace window, coalesce missed runs into one catch-up, and
+# cap each job at a single concurrent instance.
+scheduler = AsyncIOScheduler(
+    job_defaults={
+        "coalesce": True,
+        "max_instances": 1,
+        "misfire_grace_time": 3600,  # tolerate a busy loop; a daily job up to 1h late is fine
+    }
+)
 
 # Track last successful ticker per job for rate-limit resume
 _last_successful: dict[str, str | None] = {
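
To make the failure mode concrete, here is a minimal standard-library sketch of the misfire decision described in the commit message. It is a simplified model, not APScheduler's actual code: `decide_runs` and its parameters are hypothetical names chosen for illustration, and the timestamps are made up. It only shows why a fire processed 3 seconds late is dropped under a 1-second grace window but kept under a 1-hour one.

```python
from datetime import datetime, timedelta


def decide_runs(due_times, now, misfire_grace_time, coalesce):
    """Return the scheduled run times that would still execute.

    Hypothetical helper modelling the behaviour described above:
    a due run survives only if it is processed within
    `misfire_grace_time` seconds of its scheduled time, and with
    `coalesce` the surviving runs collapse into the most recent one.
    """
    grace = timedelta(seconds=misfire_grace_time)
    # Keep only the runs that are not "too late" at processing time.
    runnable = [t for t in due_times if now - t <= grace]
    if coalesce and runnable:
        # Collapse several pending runs into a single catch-up run.
        runnable = runnable[-1:]
    return runnable


# Illustrative values: a job due at 09:00:00 is only processed at
# 09:00:03 because the event loop was busy.
due = datetime(2024, 1, 1, 9, 0, 0)
processed_at = due + timedelta(seconds=3)

skipped = decide_runs([due], processed_at, misfire_grace_time=1, coalesce=True)
kept = decide_runs([due], processed_at, misfire_grace_time=3600, coalesce=True)
```

Under these assumptions `skipped` is empty (the run is treated as a misfire) while `kept` contains the original due time, which is the behaviour the new `job_defaults` are meant to produce.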