signal_measurement — the log-to-number facade
Purpose
The one place to turn a logged run into a number. Every function delegates to the same
`SignalAnalysis`/`DataSummary` path the orchestrator sweep uses, so the value you get is by
construction the pipeline’s own definition of the metric, not a second, hand-rolled one. This is the
enforceable core of the measurement contract (`.claude/rules/MEASUREMENT.md`).
Role in the system
- A thin facade over `data_analyzer` (math primitives + `DataSummary`/`SignalAnalysis`/`LogView`) and `statistics` (correlation/effect-size/dispersion), imported as `_da` and `_stats`.
- The shims are deliberately thin: the heavy lifting lives in `data_analyzer` (the permanent home); the facade just gives validation scripts a friendly, discoverable entry point. New durable idioms land in `analysis/` and get a shim HERE, never the reverse.
- Consumed by every validator and diagnostic that needs a metric: the five `validate_*_baseline.py`, `run_scoreboard.py`, `dwell_episodes.py`, `omega_b_diagnostic.py`, the fragility/divergence studies.
- `__all__` (`validation/signal_measurement.py:47`) IS the public surface: both `from … import measure` and `import … as sm` styles read off it; everything else is implementation detail.
Inputs / Outputs
- `source`: whatever you have, be it an NPZ path, a live `Logger`, or an existing `LogView` (`_as_view` adapts all three, `:73`).
- `key`: a catalog metric name (`z_b`, `p_e`, `s_min_J`, …); its `metrics.yaml` spec decides the `Delta` (e.g. pointing → versine) and units. You do not pass the delta.
- windows: optional `after`/`before` seconds (`t >= after`, `t <= before`); `after=90` is the operational window the sweep uses.
- Out: floats (`measure`), per-step arrays (`series`, `diff_norm`), `Package` records (`regress`, `episode_contrast`, `ensemble_dispersion`, `payload_diff`), and index/run sets.
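The window bounds above are inclusive on both ends (`t >= after`, `t <= before`). A minimal sketch of that masking rule on plain lists follows; `window_mask` is a hypothetical helper written for illustration only and is not the facade's implementation, and the time stamps are made up.

```python
from typing import List, Optional, Sequence


def window_mask(t: Sequence[float],
                after: Optional[float] = None,
                before: Optional[float] = None) -> List[bool]:
    """Return True where after <= t <= before; a missing bound is open."""
    lo = float("-inf") if after is None else after
    hi = float("inf") if before is None else before
    return [lo <= ti <= hi for ti in t]


# Hypothetical time stamps in seconds; after=90 mirrors the operational window.
t = [0.0, 45.0, 90.0, 120.0, 150.0]
mask = window_mask(t, after=90)
kept = [ti for ti, keep in zip(t, mask) if keep]
print(kept)  # [90.0, 120.0, 150.0]
```

In real use, pass `after`/`before` to the facade functions and let them do the masking; the sketch only pins down what the bounds mean.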
Key functions
- `measure(source, key, reduction, *, after, before, mask)`: canonical scalar reduction; identical to the sweep (`:88`)
- `series(source, key, *, after, before)`: the per-step magnitude signal the pipeline plots/reduces (`:109`)
- `as_pointing_deg(versine)`: DISPLAY-ONLY versine → angle in degrees (`:120`)
- `view_from_npz(path)`: saved NPZ → canonical `LogView` (the one-liner) (`:68`)
- `divergence_series` / `local_growth_rate` / `first_divergence_index`: cross-run separation, growth rate, split onset (`:148`/`:165`/`:180`)
- `diff_norm(values, n, prepend)` / `step_norm(source, key, …)`: per-step jump `‖diff(x)‖`, raw-array and keyed (`:192`/`:197`)
- `contiguous_runs` / `merge_runs` / `drop_short_runs` / `longest_run_length`: episode-run spans/merge/filter/length (`:217`/`:237`/`:242`/`:232`)
- `threshold_crossings` / `first_crossing` / `quantized_stall_mask`: crossings + dwell-stall mask (`:222`/`:247`/`:227`)
- `correlate` / `regress` / `lead_lag` / `effect_size` / `episode_contrast`: keyed two-signal stats + Cliff’s delta (`:252`/`:262`/`:271`/`:281`/`:286`)
- `ensemble_dispersion` / `event_effect_size`: across-seed dispersion + seeded event-study contrast (`:295`/`:300`)
- `back_half_pkpk` / `percentile_of` / `time_to_extreme` / `near_singular_fraction`: tail amplitude, arbitrary-q, extreme timing, band fraction (`:313`/`:319`/`:324`/`:329`)
- `point_cloud_distance` / `arclength_projection`: spatial, cloud→curve distance / projection (`:336`/`:342`)
- `available_reductions()`: the reduction vocabulary (`:347`)
Footguns
Measure ONLY through this facade — never re-derive a metric
Re-computing “the z_b error” with raw numpy invents a SECOND definition of the number; that is exactly
how a chord got mislabelled as the pointing error (Jun 16). The facade routes through the same reducer
the sweep calls, so its number is the pipeline’s by construction. Parity is pinned to 1e-12 by
validation/tests/test_signal_measurement.py. (.claude/rules/MEASUREMENT.md)
`z_b`/`z_e` ARE the versine `1 − cos θ`, not a chord or an angle
The catalog’s pointing-error definition. The chord `‖z_b − z_b_des‖ = 2 sin(θ/2)` is ~12× larger and
NOT what the controller regulates; `utils.geometry.angle_between` is a different `Delta` (successive
samples). Show degrees with `as_pointing_deg` for humans, but the stored/reduced metric stays the
versine, and do not `rms()` the degrees (the nonlinear map does not commute with `rms`). (MEASUREMENT.md)
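To make the versine/chord gap concrete, here is a small standard-library sketch. It relies only on the identity `chord² = 2 · versine` (since `(2 sin(θ/2))² = 2(1 − cos θ)`); the helper names are hypothetical, the value `0.0147` is the sweep figure quoted in the examples on this page, and the two-sample list is invented to show that `rms` and the degree conversion do not commute.

```python
import math


def versine(theta: float) -> float:
    """Pointing-error metric: 1 - cos(theta), theta in radians."""
    return 1.0 - math.cos(theta)


def chord(theta: float) -> float:
    """Chord length between two unit vectors separated by theta."""
    return 2.0 * math.sin(theta / 2.0)


def versine_to_deg(v: float) -> float:
    """Display-only inverse map: versine -> angle in degrees."""
    return math.degrees(math.acos(1.0 - v))


def rms(values):
    return math.sqrt(sum(x * x for x in values) / len(values))


v = 0.0147                       # versine level quoted for the sweep
theta = math.acos(1.0 - v)       # the angle that versine corresponds to
ratio = chord(theta) / versine(theta)
print(f"chord/versine at v={v}: {ratio:.1f}x")

# rms does not commute with the nonlinear versine -> degrees map.
samples = [0.0, 0.02]            # invented versine samples
deg_of_rms = versine_to_deg(rms(samples))
rms_of_deg = rms([versine_to_deg(s) for s in samples])
print(f"deg(rms(v)) = {deg_of_rms:.2f}, rms(deg(v)) = {rms_of_deg:.2f}")
```

The ratio equals `1 / sin(θ/2)`, so it grows as the error shrinks; the "~12×" figure applies near this operating level rather than universally.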
The shims delegate: keep the math in `analysis/`, not here
Each function is a thin wrapper over `_da`/`_stats`. Adding the implementation here instead of in
`data_analyzer` inverts the contract and the next refactor loses it. Promote new idioms into
`analysis/`, shim here, add a parity test. (validation/INSIGHTS.md)
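The promote, shim, parity-test sequence can be sketched in miniature as below. Everything here is hypothetical: `_primitive_ptp` stands in for a function that would live in the analysis package, `ptp` stands in for its facade shim, and the pinned value is computed by hand for the invented input. The real parity test compares the facade against the sweep.

```python
from typing import Sequence


def _primitive_ptp(values: Sequence[float]) -> float:
    """Stand-in for the permanent implementation (peak-to-peak amplitude)."""
    return max(values) - min(values)


def ptp(values: Sequence[float]) -> float:
    """The shim: no math of its own, it only forwards to the primitive."""
    return _primitive_ptp(values)


def test_ptp_parity() -> None:
    """Parity plus a value pin, in the spirit of the 1e-12 golden test."""
    data = [1.0, 4.5, -2.0]
    assert abs(ptp(data) - _primitive_ptp(data)) <= 1e-12  # shim == primitive
    assert abs(ptp(data) - 6.5) <= 1e-12                   # pinned value


test_ptp_parity()
```

Keeping the shim body to a single forwarding call is what makes the parity assertion cheap to maintain: any divergence means someone added math on the wrong side.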
Keyed vs raw correlation
`correlate` takes catalog KEYS and pulls their `series` (use it for logged metrics). For two DERIVED raw
arrays (e.g. `‖actual omega_b‖` vs arm joint-rate) there is no catalog key, so call the primitive
`pearson_r` from `data_analyzer`/`statistics` directly. (validation/INSIGHTS.md)
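For orientation, this is what a Pearson correlation on two raw, equally sampled arrays computes. It is a plain standard-library sketch with invented data, not the project's `pearson_r` (whose exact signature is not shown on this page and which, unlike this sketch, is NaN-aware); in project code, call the primitive rather than copying this.

```python
import math
from typing import Sequence


def pearson_r_sketch(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation of two equal-length sequences (no NaN handling)."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mx = sum(x) / len(x)
    my = sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0.0 or syy == 0.0:
        return float("nan")  # a constant signal has no defined correlation
    return sxy / math.sqrt(sxx * syy)


# Invented derived arrays: a rate-error norm and an arm joint-rate.
omega_norm = [0.1, 0.2, 0.3, 0.4]
joint_rate = [1.0, 2.1, 2.9, 4.2]
r = pearson_r_sketch(omega_norm, joint_rate)
print(f"r = {r:.3f}")
```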
Pseudocode (the measure path — why the number is the pipeline’s)
view = windowed(as_view(source), after, before) # NPZ/Logger/LogView → masked LogView
spec = DataSummary().spec_with_overrides(key) # the catalog MetricSpec (z_b → pointing/versine)
signal = SignalAnalysis(view).metric_signal(key) # the SAME per-step signal the sweep builds
return Reductions[reduction].apply(signal) # the SAME reduction the sweep applies
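The last step of the pseudocode, looking a reduction up by name and applying it to the per-step signal, can be illustrated with a toy dispatch table. `REDUCTIONS` and `reduce_signal` below are hypothetical stand-ins: the real `Reductions` enum lives in `data_analyzer`, has more entries, and handles NaNs, none of which this sketch attempts.

```python
import math
from typing import Callable, Dict, Sequence

# Toy subset of the reduction vocabulary (names taken from the list on this page).
REDUCTIONS: Dict[str, Callable[[Sequence[float]], float]] = {
    "rms": lambda s: math.sqrt(sum(v * v for v in s) / len(s)),
    "mean": lambda s: sum(s) / len(s),
    "max": max,
    "min": min,
    "final": lambda s: s[-1],
}


def reduce_signal(signal: Sequence[float], reduction: str) -> float:
    """Apply a named reduction; unknown names fail loudly instead of guessing."""
    if reduction not in REDUCTIONS:
        raise KeyError(f"unknown reduction {reduction!r}; known: {sorted(REDUCTIONS)}")
    if len(signal) == 0:
        raise ValueError("empty signal")
    return float(REDUCTIONS[reduction](signal))


signal = [3.0, 4.0]  # invented per-step magnitudes
print(reduce_signal(signal, "rms"), reduce_signal(signal, "final"))
```

Routing every caller through one table is the point of the design: two scripts asking for `"rms"` cannot end up with two different formulas.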
Using it — examples & vocabulary
(The how-to, moved here from MEASUREMENT.md so it loads only when you open this page; the rule keeps just the enforceable three.)
from validation.signal_measurement import measure, series, as_pointing_deg, view_from_npz
measure("logs/.../run.npz", "z_b", "rms", after=90) # op base-pointing versine rms (sweep's 0.0147)
measure(result.logger, "p_e", "p99", after=90) # op EE-tracking p99, no NPZ round-trip
measure(view, "s_min_J", "frac_below", after=90) # near-singular fraction (catalog threshold)
zb = series("logs/.../run.npz", "z_b", after=90)
print(f"mean base pointing ≈ {float(as_pointing_deg(zb).mean()):.2f} deg")
# band-restricted: p_e p99 only over steps where s_min_G is in the singularity band
measure(view, "p_e", "p99", after=90, mask={"signal": "s_min_G", "lo": 0.025, "hi": 0.049})

Reductions (the `reduction` arg): `rms`, `p99`, `p95`, `median`, `mean`, `max`, `min`, `final`, `frac_below`, `cumulative`, `count` (+ `frac_true`/`count` for booleans, `ptp` peak-to-peak). `available_reductions()` lists them; the `Reductions` enum lives in `data_analyzer`.
Deltas (catalog-chosen — you don’t pass this): POINTING_ERROR (versine, simultaneous), ERROR (Euclidean actual − desired), ANGLE_STEP (angle_between successive samples).
What each metric MEANS (the catalog’s call, not yours):
| key | measures | definition |
|---|---|---|
| `z_b`, `z_e` | pointing error (versine) | `1 − cos θ`, SIMULTANEOUS actual & desired axis |
| `p_e`, `p_c` | position error | `‖actual − desired‖` |
| `omega_b`, `v_c` | rate / velocity error | `‖actual − desired‖` |
| `s_min_J`, `s_min_G` | smallest singular value | logged scalar (catalog `frac_below` for bands) |
Reusable primitives (fragility / sensitivity / cross-run; all NaN-aware; live in data_analyzer, shimmed here — reach for THESE, never hand-roll):
from validation.signal_measurement import (
divergence_series, first_divergence_index, local_growth_rate, # cross-run divergence
ensemble_dispersion, episode_contrast, correlate, regress, # dispersion / episode / two-signal
back_half_pkpk, near_singular_fraction, # tail amplitude / band fraction
)
t, D = divergence_series(run_a, run_b, "p_c")   # cross-run separation over time
lam = local_growth_rate(t, D, smooth=31)        # local growth rate of that separation
near = series(run, "s_min_J") < 0.005           # boolean mask: near-singular steps
c = episode_contrast(arm_speed, near)           # c.delta = Cliff's delta

Full roster = the Key functions list above. Domain-specific idioms (mesh coverage, SVD manipulability, schedule oracles) stay with their own code, not the facade; see `validation/data_tricks.md`.
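For readers unfamiliar with Cliff's delta, the effect size reported as `c.delta` above, here is a standard-library sketch of the textbook definition: the fraction of (inside, outside) pairs where the inside value is larger, minus the fraction where it is smaller. The function names, the "inside minus outside" sign convention, and the data are assumptions for illustration; the project's `episode_contrast` may differ in convention and is NaN-aware, which this sketch is not.

```python
from typing import Sequence


def cliffs_delta(x: Sequence[float], y: Sequence[float]) -> float:
    """Cliff's delta in [-1, 1]: (#(x > y) - #(x < y)) / (len(x) * len(y))."""
    if len(x) == 0 or len(y) == 0:
        raise ValueError("both groups must be non-empty")
    greater = sum(1 for a in x for b in y if a > b)
    less = sum(1 for a in x for b in y if a < b)
    return (greater - less) / (len(x) * len(y))


def contrast_by_mask(values: Sequence[float], mask: Sequence[bool]) -> float:
    """Split values by a boolean episode mask and compare inside vs outside."""
    inside = [v for v, m in zip(values, mask) if m]
    outside = [v for v, m in zip(values, mask) if not m]
    return cliffs_delta(inside, outside)


# Invented per-step arm speed and near-singular mask.
arm_speed = [0.2, 0.9, 1.1, 0.4, 1.3, 1.0]
near = [False, True, True, False, True, False]
delta = contrast_by_mask(arm_speed, near)
print(f"Cliff's delta = {delta:.3f}")
```

A value near +1 would mean the arm is almost always faster inside near-singular episodes than outside, a value near 0 means no systematic ordering.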
Equations & references
- The enforceable three rules + figures/`tmp_` policy: `.claude/rules/MEASUREMENT.md` (thin; points back here for the how-to).
- Golden parity test (facade == sweep to 1e-12) + value pins: `validation/tests/test_signal_measurement.py`.
- Reusable-primitive narrative + the “Hand-roll census” ledger: `validation/INSIGHTS.md`.
Related
data_analyzer · statistics · metric_catalog · metrics · scoreboard · orchestrator · logger · terminology