parallelize the backtest across worker processes (true multi-core)
The replay was CPU-bound and single-core: the earlier asyncio.to_thread offload kept the API responsive but, because of the GIL, ran on one core. Per-ticker replay is independent, so fan it out across worker processes (which sidestep the GIL) for real multi-core speedup.

- New `settings.backtest_workers` (default 4), capped to cpu_count-1 so a core stays free for the web server.
- Uses a `forkserver` context (workers are forked from a clean single-threaded server, which avoids the fork-with-threads deadlock); falls back to `fork`. On spawn-only platforms (Windows) and for 1-ticker runs it uses the thread path, so dev/tests are unaffected.
- Worker takes primitive column arrays (cheap to pickle), rebuilds bars, and returns (candidates, plain-dict signal series), both picklable across the process boundary. Bars are still fetched in the event loop (ORM-safe).
- Pool creation is guarded: if the pool can't start, the job falls back to the sequential thread path instead of failing.

334 backend tests pass (parallel path is POSIX/server-only, so it's covered by construction + the picklability/worker-count tests; the thread fallback is exercised by the run_backtest smoke test).

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
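The worker-count cap and start-method choice described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the repository's code: `backtest_worker_count`, `pick_mp_context`, and `choose_replay_path` are hypothetical names, and the floor of one worker is assumed from the expectations in the tests of this commit.

```python
import multiprocessing as mp
import os


def backtest_worker_count(configured: int) -> int:
    """Clamp the configured worker count to the range [1, cpu_count - 1].

    Assumption: at least one worker is always returned, even on a
    single-core machine or with a non-positive setting.
    """
    cpu_cap = max(1, (os.cpu_count() or 1) - 1)
    return max(1, min(configured, cpu_cap))


def pick_mp_context():
    """Prefer forkserver, fall back to fork, return None on spawn-only platforms."""
    available = mp.get_all_start_methods()
    for method in ("forkserver", "fork"):
        if method in available:
            return mp.get_context(method)
    return None  # e.g. Windows, where only "spawn" is available


def choose_replay_path(configured_workers: int, n_tickers: int) -> str:
    """Return "process" only when a pool can actually help, else "thread"."""
    if backtest_worker_count(configured_workers) <= 1 or n_tickers <= 1:
        return "thread"
    if pick_mp_context() is None:
        return "thread"
    return "process"
```

With these definitions a single-ticker run, or a setting of one worker, stays on the thread path regardless of platform, which is the behaviour the commit message describes for dev and test runs.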
@@ -2,6 +2,8 @@

from __future__ import annotations

import os
import pickle
import random
from datetime import date, timedelta
from types import SimpleNamespace
@@ -156,3 +158,38 @@ def test_accumulate_signal_series_emits_weekly_pairs():
    # ...one per ISO week, with a forward return attached to each pair.
    sample = next(iter(collected["mom_12_1"].values()))
    assert all(len(pair) == 2 for pair in sample)


# ---------------------------------------------------------------------------
# Parallel-replay plumbing (process pool): plain/picklable results, worker count
# ---------------------------------------------------------------------------

def test_signal_series_is_plain_and_picklable():
    from collections import defaultdict

    closes = [100.0 * (1.003 ** k) for k in range(400)]
    series = bt._signal_series(_records(closes))
    # Must be plain dicts (no defaultdict/lambda) so it survives a process boundary.
    assert type(series) is dict
    assert all(type(weeks) is dict for weeks in series.values())
    pickle.dumps(series)  # the worker's return is pickled to the parent; must not raise
    # ...and equivalent to the in-place accumulator.
    acc = defaultdict(lambda: defaultdict(list))
    bt._accumulate_signal_series(_records(closes), acc)
    assert series == {name: dict(w) for name, w in acc.items()}


def test_worker_count_caps_to_cpu_minus_one(monkeypatch):
    monkeypatch.setattr(bt.settings, "backtest_workers", 1000)
    assert bt._backtest_worker_count() == max(1, (os.cpu_count() or 1) - 1)


def test_worker_count_one_disables(monkeypatch):
    monkeypatch.setattr(bt.settings, "backtest_workers", 1)
    assert bt._backtest_worker_count() == 1


def test_mp_context_is_none_or_posix():
    ctx = bt._mp_context()
    # None on spawn-only platforms (Windows); a safe POSIX context otherwise.
    assert ctx is None or ctx.get_start_method() in ("fork", "forkserver")
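
The picklability test above exists because a `defaultdict` whose factory is a lambda cannot be pickled, so a worker has to hand back plain dicts. A minimal sketch of that idea, together with a guarded fan-out, might look like the following. Everything here is hypothetical: `to_plain`, `replay_one`, `replay_all`, the `ret_1d` signal, and the five-bar "week" key are stand-ins, and the sequential loop stands in for the thread fallback the commit message mentions.

```python
import pickle
from collections import defaultdict
from concurrent.futures import ProcessPoolExecutor
from concurrent.futures.process import BrokenProcessPool


def to_plain(acc):
    """Convert a nested defaultdict accumulator into plain, picklable dicts."""
    return {name: dict(weeks) for name, weeks in acc.items()}


def replay_one(closes):
    """Hypothetical worker: primitive list of closes in, plain dict out."""
    acc = defaultdict(lambda: defaultdict(list))
    for i in range(1, len(closes)):
        week = i // 5  # stand-in for an ISO-week key
        acc["ret_1d"][week].append(closes[i] / closes[i - 1] - 1.0)
    return to_plain(acc)  # never return the lambda-backed defaultdict itself


def replay_all(columns, workers, ctx):
    """Fan out across processes; fall back to a sequential loop on failure.

    `ctx` is a multiprocessing context, or None when no safe start method
    is available (assumed to mean: do not use a pool at all).
    """
    if ctx is None or workers <= 1 or len(columns) <= 1:
        return [replay_one(c) for c in columns]
    try:
        with ProcessPoolExecutor(max_workers=workers, mp_context=ctx) as pool:
            return list(pool.map(replay_one, columns))
    except (OSError, BrokenProcessPool):
        # The pool could not start or died; degrade instead of failing the job.
        return [replay_one(c) for c in columns]
```

The worker is a module-level function on purpose: the pool pickles it by reference, so a nested function or lambda would fail in the same way the lambda-backed accumulator does.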