determinism — BLAS/OpenMP thread pinning
Purpose
Pin BLAS/OpenMP to a single thread so reductions sum in a fixed order — the precondition for any
bit-exact reproducibility or sensitivity claim in this chaotically-sensitive closed loop.
Role in the system
- Stateless support utility; imported once, early, before numpy/BLAS loads (the env-var path only
  bites a fresh process). Call it at the top of a run entry point (runner,
  validation/run_mission.py).
- Underpins every reproduction guarantee the rest of the codebase relies on: matched-config A/B
  (see breve_controller refactor harnesses), cross-machine parity, the fragility/sensitivity studies.
Inputs / Outputs
- In: n, the target thread count (default 1).
- Out: returns n. The side effects are the real product: it sets five *_NUM_THREADS env vars and
  installs a process-lifetime threadpoolctl limit (held alive by the module-global _limiter).
Key functions
pin_threads(n=1): set the thread env vars and the threadpoolctl limit; idempotent (utils/determinism.py:18)
Footguns
Single-thread is not optional — it fixes the reduction order
Multi-threaded BLAS sums reductions in nondeterministic order; last-bit drift is enough to change
the macroscopic trajectory in this chaotically-sensitive closed loop. Single-thread fixes the order
and is the precondition for any reproducibility claim. (utils/INSIGHTS.md[perf])
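Why the order matters can be seen without BLAS at all. The sketch below is an illustration only, using hand-picked values that are not taken from this codebase: floating-point addition is not associative, so the same three terms accumulated in two different orders give two different results.

```python
# Illustration only: hand-picked values, not data from this codebase.
terms = [1e16, 1.0, -1e16]


def sum_in_order(values, order):
    """Accumulate values left to right in the given index order."""
    total = 0.0
    for i in order:
        total += values[i]
    return total


a = sum_in_order(terms, [0, 1, 2])  # (1e16 + 1.0) - 1e16: the 1.0 is rounded away
b = sum_in_order(terms, [0, 2, 1])  # (1e16 - 1e16) + 1.0: the 1.0 survives

print(a, b)  # 0.0 1.0
```

A multi-threaded reduction combines per-thread partial sums, and the order in which they are combined can vary between runs; that is the same effect at last-bit scale.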
Two paths, two timings — set env vars BEFORE numpy loads
pin_threads works on two fronts: (1) the env vars (OMP_/OPENBLAS_/MKL_/VECLIB_/NUMEXPR_NUM_THREADS)
only take effect for a process that loads BLAS after they are set, so import this first; (2)
threadpoolctl additionally pins pools already loaded in the current process. The module-global
_limiter must stay referenced or the limit is dropped. (utils/INSIGHTS.md[convention])
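One way to make the timing requirement hard to violate is to set the variables before the interpreter starts at all. The sketch below assumes a hypothetical helper, run_pinned, which is not part of utils/determinism.py. It launches a child process with the five variables already in its environment, so any BLAS the child loads later sees them.

```python
import os
import subprocess
import sys

# The five variables named in this document.
THREAD_VARS = tuple(
    f"{lib}_NUM_THREADS" for lib in ("OMP", "OPENBLAS", "MKL", "VECLIB", "NUMEXPR")
)


def run_pinned(argv, n=1):
    """Hypothetical helper: run argv in a child whose environment already pins threads."""
    env = dict(os.environ)
    env.update({var: str(n) for var in THREAD_VARS})
    return subprocess.run(argv, env=env, capture_output=True, text=True, check=True)


# The child reports what it sees at startup, before any import could load BLAS.
child_code = "import os; print(os.environ['OMP_NUM_THREADS'])"
result = run_pinned([sys.executable, "-c", child_code], n=1)
seen_by_child = result.stdout.strip()
print(seen_by_child)
```

This covers only the env-var path; pools that are already loaded in the current process still need the threadpoolctl limit.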
Pseudocode
pin_threads(n=1):
    for var in (OMP, OPENBLAS, MKL, VECLIB, NUMEXPR)_NUM_THREADS:
        os.environ[var] = str(n)                    # effective only for a fresh process
    _limiter = threadpoolctl.threadpool_limits(n)   # pins already-loaded pools; held process-wide
    return n
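The pseudocode above can be turned into a runnable sketch along the following lines. This is an approximation under stated assumptions, not the code at utils/determinism.py:18, which may differ: the ImportError guard around threadpoolctl is an addition for this sketch so that it runs even where the package is not installed.

```python
import os

try:
    import threadpoolctl  # third-party; treated as optional in this sketch only
except ImportError:
    threadpoolctl = None

_limiter = None  # module-global so the limit object outlives the call


def pin_threads(n=1):
    """Sketch: set the five thread env vars and, if possible, limit loaded pools."""
    global _limiter
    for lib in ("OMP", "OPENBLAS", "MKL", "VECLIB", "NUMEXPR"):
        os.environ[f"{lib}_NUM_THREADS"] = str(n)  # read by BLAS/OpenMP at load time
    if threadpoolctl is not None:
        # Applies the limit to thread pools already loaded in this process.
        _limiter = threadpoolctl.threadpool_limits(limits=n)
    return n


pinned = pin_threads(1)
print(pinned, os.environ["OMP_NUM_THREADS"])
```

Calling pin_threads again with the same n rewrites the same values, which matches the idempotence noted under Key functions.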