data_analyzer — signal math for the analysis pipeline
Purpose
The math library downstream of the NPZ handoff: LogView (masked log access), SignalAnalysis
(per-key metric evaluation + plot-axis building), DataSummary (YAML-driven reductions + comparison
tables), the Reductions / Deltas enums, and a reusable data-primitives block.
It is the one definition of every metric number a figure, table, or scoreboard gate reports.
Role in the system
- Consumed by orchestrator (the only module that reaches it) → feeds plotter (axes) and
  star_reporter (summary/comparison frames). Modules never cross boundaries: this one reduces, it
  does not plot or render.
- Reads saved logs produced by logger (LogStore → NPZ); LogView wraps the loaded LogEntry tree.
- Specs come from the catalog layer metric_catalog (re-exported here for back-compat); the inference
  layer statistics (correlation/regression/dispersion) is also re-exported from here.
- The measurement facade validation/signal_measurement.py shims only to functions here; that is
  what makes its number the pipeline’s number by construction (see terminology, the MEASUREMENT contract).
- Pointing error is the versine 1 − cos θ (Deltas.POINTING_ERROR), and it lives here, not in GNC.
Inputs / Outputs
- In: a saved log (NPZ path / Logger / LogView), the metric catalog, and run-spec analysis options
  (delta specs, band masks, key_variables).
- Out: per-step signals (np.ndarray), scalar reductions (float), and presentation DataFrames
  (summary tables, comparison tables, peak-alignment frames) plus prepared PlotAxis objects.
Key methods / functions
- LogView.values / values_at: masked values by leaf name / dotted path · analysis/data_analyzer.py:848 / :844
- LogView.with_mask / include_between: compose a step mask / time window · :741 / :763
- SignalAnalysis.metric_values: a key’s per-step values, collapsing paired → error signal · :1032
- SignalAnalysis.signal_from_values: norm-or-component, abs unless the reduction is signed · :1057
- SignalAnalysis.window_reduction: the scalar a sweep/gate reports · :1086
- SignalAnalysis.plot_axis: build a panel’s PlotAxis (lines/quivers/hlines/mode-shading) · :1428
- Reductions.apply / Deltas.apply: the reduction & delta math · :603 / :503
- DataSummary.summary_frame / comparison_table_frames: per-run / cross-variant tables · :2061 / :2121
- band_mask: half-open [lo, hi) step mask over RAW full-length indices · :1774
- payload_diff: NaN-aware key-by-key max-abs diff (the byte-identical refactor gate) · :227
- local_growth_rate: λ(t) = d/dt ln D(t), chaos vs ill-conditioning diagnostic · :276
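Two of the primitives listed above are small enough to sketch. The block below is an illustration only, not the module's code: the names `local_growth_rate_sketch` and `payload_diff_sketch`, their signatures, and the choice to report a missing key or a NaN-pattern mismatch as `inf` are all assumptions. It also assumes D(t) is a positive scalar series sampled at times t.

```python
import numpy as np


def local_growth_rate_sketch(t, d):
    """Hypothetical stand-in: lambda(t) = d/dt ln D(t) by finite differences."""
    t = np.asarray(t, dtype=float)
    d = np.asarray(d, dtype=float)
    # ln D is only defined for D > 0; other samples become NaN.
    with np.errstate(divide="ignore", invalid="ignore"):
        log_d = np.where(d > 0.0, np.log(d), np.nan)
    return np.gradient(log_d, t)


def payload_diff_sketch(a, b):
    """Hypothetical stand-in: max-abs diff per key, matching NaNs count as equal."""
    report = {}
    for key in sorted(set(a) | set(b)):
        if key not in a or key not in b:
            report[key] = float("inf")  # assumption: a missing key is a failure
            continue
        x = np.asarray(a[key], dtype=float)
        y = np.asarray(b[key], dtype=float)
        if x.shape != y.shape or not np.array_equal(np.isnan(x), np.isnan(y)):
            report[key] = float("inf")  # assumption: shape / NaN-pattern mismatch fails
            continue
        finite = ~np.isnan(x)
        report[key] = float(np.max(np.abs(x[finite] - y[finite]))) if finite.any() else 0.0
    return report


# Pure exponential growth D = exp(0.5 t) has a constant rate of 0.5.
t = np.linspace(0.0, 4.0, 9)
rate = local_growth_rate_sketch(t, np.exp(0.5 * t))

before = {"p": np.array([1.0, np.nan, 3.0]), "q": np.array([0.0])}
after = {"p": np.array([1.0, np.nan, 3.5]), "q": np.array([0.0])}
diff = payload_diff_sketch(before, after)
```

A refactor gate built on this idea would require every value in `diff` to be exactly 0.0.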
Footguns
Most reductions reduce |x|, but the signed members do NOT
signal_from_values abs-maps first, so MAX = max|x|. The is_signed members (MIN, FRAC_BELOW,
PTP) read the raw signed sample: MIN is the signed minimum, FRAC_BELOW tests < threshold,
PTP = max(x) − min(x) (not max|x| − min|x|). Abs-mapping these would silently mis-count.
(analysis/INSIGHTS.md [math])
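A small numeric example makes the mis-count concrete. This is a plain NumPy sketch, not the Reductions enum itself; the sample values and the threshold of 0.0 are made up for illustration.

```python
import numpy as np

x = np.array([-3.0, -1.0, 2.0])  # made-up signed sample
threshold = 0.0                  # hypothetical FRAC_BELOW threshold

abs_x = np.abs(x)

# Unsigned reduction: operates on |x|.
max_abs = abs_x.max()                    # 3.0

# Signed reductions: must read the raw sample.
signed_min = x.min()                     # -3.0
ptp_signed = x.max() - x.min()           # 5.0
frac_below = float(np.mean(x < threshold))   # 2 of 3 samples

# What abs-mapping would report instead (all three are wrong answers).
min_of_abs = abs_x.min()                     # 1.0
ptp_of_abs = abs_x.max() - abs_x.min()       # 2.0
frac_below_abs = float(np.mean(abs_x < threshold))  # 0.0
```

None of the wrong variants raises an error, which is why the distinction has to be carried by the reduction itself.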
v_sat is the infinity-norm clip flag, and v_max must ride on the spec
v_sat = (max_i |v_i| >= v_max): component-wise-any, NOT the Euclidean norm. v_max is per-run and is
not threaded from cfg to analysis; it must arrive on the metric spec (via masked_reduction_items /
spec_with_overrides). A missing v_max raises hard at reduction time. (analysis/INSIGHTS.md [config])
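The difference between the infinity-norm flag and a Euclidean test shows up whenever several components are large at once. The sketch below assumes one row per step and one column per component; `v_sat_sketch` and its `ValueError` are illustrative stand-ins, not the module's function or exception type.

```python
import numpy as np


def v_sat_sketch(v, v_max):
    """Hypothetical stand-in: per-step flag, True if any component reaches v_max."""
    if v_max is None:
        # Mirrors the documented behavior of failing hard; exception type is an assumption.
        raise ValueError("v_max must arrive on the metric spec")
    v = np.atleast_2d(np.asarray(v, dtype=float))
    return np.max(np.abs(v), axis=1) >= v_max


v = np.array([
    [0.6, 0.6, 0.6],   # Euclidean norm ~1.04, but no single component reaches 1.0
    [1.0, 0.0, 0.0],   # one component exactly at the limit
    [0.1, 0.2, 0.3],   # well inside
])

flags = v_sat_sketch(v, v_max=1.0)            # infinity-norm test
euclid = np.linalg.norm(v, axis=1) >= 1.0     # the test NOT to use
```

The first row is where the two tests disagree: the Euclidean norm would flag a step on which no component is actually clipped.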
Band masks must come from a fresh full-length view, or step alignment is lost
band_mask is built over RAW indices from a fresh LogView(view.log_entry), never the post-windowed /
extracted array. The caller ANDs it onto the active view with with_mask(). The only hardcoded band is
SMIN_BAND = (0.025, 0.049) (derate onset / Tikhonov onset); every other mask field is YAML.
(analysis/INSIGHTS.md [io][config])
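The alignment problem can be reproduced with plain arrays. The sketch below assumes the band is tested against a per-step signal value and that masks are boolean arrays with one entry per raw step. `band_mask_sketch`, the sample values, and the window are illustrative; only SMIN_BAND comes from the text above.

```python
import numpy as np

SMIN_BAND = (0.025, 0.049)


def band_mask_sketch(raw_values, band):
    """Hypothetical stand-in: half-open [lo, hi) mask, one entry per raw step."""
    lo, hi = band
    raw_values = np.asarray(raw_values, dtype=float)
    return (raw_values >= lo) & (raw_values < hi)


# Made-up full-length signal and a made-up active time window.
s_min = np.array([0.010, 0.025, 0.030, 0.049, 0.040, 0.060])
window = np.array([False, False, True, True, True, True])

# Correct: build over the raw full-length series, then AND onto the window.
band = band_mask_sketch(s_min, SMIN_BAND)
combined = window & band

# Incorrect: a mask built from the already-windowed array has a different
# length, so its entries no longer line up with raw step indices.
misaligned = band_mask_sketch(s_min[window], SMIN_BAND)
```

Note that 0.025 is inside the band and 0.049 is outside it, which is what half-open means here.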
An empty (never-occurring) band returns NaN, not a crash
A masked array with 0 rows breaks numpy’s -1 reshape; series_matrix passes the explicit column count
so the reduction returns NaN (an absent point). (analysis/INSIGHTS.md [footgun])
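The reshape failure is easy to reproduce in isolation. In this sketch `series_matrix_sketch` and `max_abs_or_nan` are hypothetical helpers and the column count of 3 is made up; only the NumPy behavior is the point.

```python
import numpy as np


def series_matrix_sketch(flat, n_rows, n_cols):
    """Hypothetical stand-in: reshape with an explicit column count, never -1."""
    return np.asarray(flat, dtype=float).reshape(n_rows, n_cols)


def max_abs_or_nan(matrix):
    """Reduce to a scalar; an empty selection is reported as NaN (absent point)."""
    if matrix.shape[0] == 0:
        return float("nan")
    return float(np.max(np.abs(matrix)))


empty = np.array([], dtype=float)

# With zero rows numpy cannot infer the missing dimension from size 0.
try:
    empty.reshape(0, -1)
    inferred_ok = True
except ValueError:
    inferred_ok = False

# Passing the column count explicitly yields a well-formed (0, 3) matrix.
matrix = series_matrix_sketch(empty, 0, 3)
value = max_abs_or_nan(matrix)
```

The NaN then propagates to tables and plots as a missing point rather than aborting the run.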
Unit "" not NaN, and tuple-ify list variant values
units_for_item coalesces to "": a NaN unit makes pivot_table(dropna=True) silently drop the whole
row. And comparison_variant_headers tuple-ifies a list variant_value (a multi-leaf sweep cascade)
before drop_duplicates, which cannot hash a list. (analysis/INSIGHTS.md [footgun])
Re-exports here are a deliberate cycle-break, not stray imports
statistics.py (inference) and metric_catalog.py (catalog access) were split out Jun 18 and are
re-exported here with # noqa: E402,F401 so the facade/tests keep importing from data_analyzer.
statistics.py imports the handful of helpers it needs (finite_numeric_samples, percentile,
Reductions, longest_run_length) lazily to avoid a module-level cycle. (analysis/INSIGHTS.md [history])
Pseudocode (key → reported scalar)
spec = metric_spec(key) # from the catalog
values = values_for_spec(spec) # derived p_ce / v_sat, else raw log values
signal = paired? POINTING_ERROR|ERROR(values) # versine if spec.pointing, else actual−desired
: norm-or-first-component(values)
signal = signal if reduction.is_signed else |signal|
view = window_for_item(view, item) # time/index window AND band_mask (full-length)
value = reduction.apply(signal[view], threshold=spec.threshold) # the sweep/gate number
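The pseudocode above can be approximated in runnable form for the simplest case. This sketch is a simplification under stated assumptions: it handles only the plain error delta (actual − desired) and the norm branch, it omits the versine branch, the catalog lookup, and thresholds, and `reported_scalar_sketch` with its parameters is a hypothetical name, not the module's API.

```python
import numpy as np


def reported_scalar_sketch(actual, desired=None, *, reduce, is_signed=False, mask=None):
    """Hypothetical stand-in for the key -> scalar flow on plain arrays."""
    values = np.asarray(actual, dtype=float)
    if desired is not None:
        # Paired key: collapse to an error signal (actual minus desired).
        values = values - np.asarray(desired, dtype=float)
    # Vector-valued steps collapse to their norm; scalar series pass through.
    signal = np.linalg.norm(values, axis=1) if values.ndim == 2 else values
    if not is_signed:
        signal = np.abs(signal)
    if mask is not None:
        # Mask is full-length, one boolean per raw step.
        signal = signal[np.asarray(mask, dtype=bool)]
    if signal.size == 0:
        return float("nan")  # nothing selected: absent point
    return float(reduce(signal))


actual = np.array([1.0, -4.0, 2.0, 0.5])
desired = np.zeros(4)
mask = np.array([True, True, True, False])

max_abs = reported_scalar_sketch(actual, desired, reduce=np.max, mask=mask)
signed_min = reported_scalar_sketch(actual, desired, reduce=np.min, is_signed=True, mask=mask)
absent = reported_scalar_sketch(actual, desired, reduce=np.max, mask=np.zeros(4, dtype=bool))
```

The order of operations matters: the abs-map happens before windowing and reduction, and only when the reduction is not signed.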
Equations & references
- Pointing versine 1 − cos θ (sheet §7 metrics) is implemented here as Deltas.POINTING_ERROR,
  not in GNC; see [[current_sota#7]] and the MEASUREMENT contract.
- The SMIN_BAND singularity band onsets (s_min_G) come from the derate stack [[current_sota#6]].
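For reference, the versine can be computed from the dot product without calling an inverse trigonometric function. The sketch below assumes the paired values are actual and desired direction vectors, one row per step; `pointing_error_sketch` is a hypothetical name and the clipping guard is an added assumption, not a documented part of Deltas.POINTING_ERROR.

```python
import numpy as np


def pointing_error_sketch(actual, desired):
    """Hypothetical stand-in: versine 1 - cos(theta) between row vectors."""
    a = np.atleast_2d(np.asarray(actual, dtype=float))
    d = np.atleast_2d(np.asarray(desired, dtype=float))
    cos_theta = np.sum(a * d, axis=1) / (
        np.linalg.norm(a, axis=1) * np.linalg.norm(d, axis=1)
    )
    # Guard against round-off pushing the cosine slightly outside [-1, 1].
    return 1.0 - np.clip(cos_theta, -1.0, 1.0)


actual = np.array([[1.0, 0.0, 0.0], [1.0, 0.0, 0.0], [1.0, 0.0, 0.0]])
desired = np.array([[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [-1.0, 0.0, 0.0]])

# Aligned, orthogonal, and opposite directions.
errors = pointing_error_sketch(actual, desired)
```

The result ranges from 0 (aligned) to 2 (opposite), and for small angles it behaves like θ²/2, so it is always non-negative and smooth around zero error.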
Related
orchestrator · plotter · star_reporter · logger · metric_catalog · statistics ·
signal_measurement · runner · breve_controller · terminology