Analysis Pipeline — Plan for YAML-Driven Float & Boolean Comparisons

Generated 2026-06-05, 02:34 (investigator skill). Report only — no code modified.

Question: How will the high-level requirements (YAML → polished comparison PDF) be implemented, modularly and future-proofed?

Scope — inspected: analysis/orchestrator.py, analysis/pre_run_loader.py, analysis/runner.py, analysis/star_reporter.py, analysis/data_analyzer.py, the comparison/figure YAMLs (ff_startup_timescale.yaml, cc_startup.yaml, test_float.yaml, default_figure_specs.yaml, default_run_specs.yaml, com_single_smoke.yaml, templates), and YAMLs_by_domain/metrics.yaml. plotter.py and logger.py are treated as standalone external nodes (do-what-they-are-told); claims about them are labelled inference.

Answer First — how it will be implemented

Keep the existing 5-stage linear contract and extend only at its seams. The pipeline is already modular: build_run_contexts → run_or_load_logs → analyze → plot → report (orchestrator.py:558-578). Float comparisons already work end-to-end through this path; the requirements are met by three small seam additions plus YAML, not by a new framework. This is the modularity win — do not re-tangle the stages.
Per-variant images (3-panel p_c/v_c/f_c sharex, and zoomed p_c) are pure YAML. The per-variant plot() loop already applies the top-level figures: block to every variant view and already supports multi-panel sharex and zoom.include_window (orchestrator.py:403-486); com_single_smoke.yaml proves the 3-panel shape. The two comparison YAMLs simply lack a figures: block — add it. The only code gap is that “first 10 s” can only be expressed today as an index/ratio window, not seconds (see Finding F4).
Boolean comparison needs one small fix: default variants to [false, true]. cc_startup.yaml deliberately comments out variants (“redundant for a Bool comparison”) but raw_variants() raises when they are absent (orchestrator.py:203-207). Boolean mode should synthesize the two-value sweep itself (Finding F1).
The “all variants on one axis” overlay is genuinely new — add it as a sibling stage plot_overall, not by editing plot(). Today plot() only emits per-variant figures; nothing overlays views. A new generic stage takes each variant’s view+label, builds one PlotAxis with a line per variant for a configured key + zoom, and calls the existing Plotter once. Driven by a new comparison.overlay YAML block. Generic (any key, any window) = future-proof (Finding F2).
The Introduction + per-variant section report is new — give Reporter a sectioned document model. Reporter.document_tex() is flat today (one Conclusions / Tables / Figures block, star_reporter.py:448-456) and comparison mode even discards the per-variant tables (orchestrator.py:340-345). Introduce an ordered list of Section(title, tables, figures); the orchestrator assembles Introduction (comparison table + overlay images) and one section per variant; Reporter just renders sections in order. The current flat report is the degenerate single-section case (Finding F3).
Metric/include:true filtering is already centralized — reuse it. metrics.yaml include flags flow through included_logged_metric into both summary and comparison tables (data_analyzer.py:655-665, 1500). No new metric logic; the report tables consume this as-is.
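A minimal sketch of the include:true filtering in the last item, assuming metrics.yaml loads into a mapping from metric name to a small spec dict. The layout, the example metric names, and the helpers included_metric_names and filter_row are hypothetical; the real filter is included_logged_metric in data_analyzer.py and is not reproduced here.

```python
from typing import Any, Mapping

# Hypothetical stand-in for the parsed contents of metrics.yaml.
METRIC_SPECS: dict[str, dict[str, Any]] = {
    "p_c_rms_error": {"include": True},
    "v_c_rms_error": {"include": False},
    "f_c_peak": {"include": True},
    "settling_time": {},  # no flag: treated as excluded (assumption)
}


def included_metric_names(specs: Mapping[str, Mapping[str, Any]]) -> list[str]:
    """Return metric names whose spec carries include: true, in file order."""
    return [name for name, spec in specs.items() if spec.get("include") is True]


def filter_row(row: Mapping[str, float],
               specs: Mapping[str, Mapping[str, Any]]) -> dict[str, float]:
    """Keep only the included metrics of one table row."""
    keep = set(included_metric_names(specs))
    return {name: value for name, value in row.items() if name in keep}


# One filter feeds both the summary table and the comparison table.
summary_row = {"p_c_rms_error": 0.012, "v_c_rms_error": 0.3, "f_c_peak": 41.0}
comparison_row = filter_row(summary_row, METRIC_SPECS)
print(comparison_row)
```

Keeping the decision in one function is what lets the report tables consume the filter unchanged.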

Flow Summary

run_spec YAML (+ metrics.yaml) is loaded and fanned out into one context per variant (the comparison flag+value becomes a validated config override). Each context is run live (or loaded from npz) into a Logger, wrapped as a LogView, and reduced by DataSummary into tables and figure specs. data_analyzer is the non-linear exception: it is the shared analysis library feeding both the table branch and the figure branch, and it is reused again for the overlay. Plotter renders figure specs to images; Reporter assembles tables + images into report.pdf. Logger/LogStore and Plotter are standalone leaf nodes.

flowchart LR
    Y["run_spec YAML + metrics.yaml"] --> CTX["build_run_contexts (per variant)"]
    CTX --> RUN["Runner.run / load npz"]
    RUN --> DA["data_analyzer: LogView + DataSummary"]
    DA --> TBL["tables: per-variant + comparison"]
    DA --> FIG["figure_specs: per-variant + overlay"]
    FIG --> PLOT["Plotter"]
    TBL --> REP["Reporter: sectioned -> report.pdf"]
    PLOT --> REP
    LOG["Logger / LogStore"] -.-> RUN
    classDef standalone fill:#dde7ff,stroke:#5566aa,color:#111;
    classDef exception fill:#fff3cd,stroke:#d39e00,color:#3d2b00;
    class LOG,PLOT standalone;
    class DA exception;

Legend: yellow = data_analyzer, the branch that breaks the otherwise-linear flow; blue = standalone Plotter/Logger nodes (inner functions not explored), with the dashed edge marking Logger as a side input to the run stage. New work lands on FIG (overlay), TBL (threading per-variant tables through), and REP (sectioning).
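The "extend only at the seams" idea behind this flow can be sketched as an ordered stage list into which plot_overall is inserted without touching its neighbours. This is an illustration only: PipelineState, make_stage, and insert_after are hypothetical names, and the real Orchestrator.run may wire its stages differently.

```python
from dataclasses import dataclass, field
from typing import Any, Callable


@dataclass
class PipelineState:
    """Hypothetical carrier handed from stage to stage."""
    spec: dict[str, Any]
    trace: list[str] = field(default_factory=list)


Stage = Callable[[PipelineState], PipelineState]

# The five existing stages, in call order (names taken from the report).
BASE_STAGES = ["build_run_contexts", "run_or_load_logs", "analyze", "plot", "report"]


def make_stage(name: str) -> Stage:
    """Build a placeholder stage that only records that it ran."""
    def stage(state: PipelineState) -> PipelineState:
        state.trace.append(name)
        return state
    return stage


def insert_after(names: list[str], anchor: str, new: str) -> list[str]:
    """Return a copy of names with new placed right after anchor."""
    i = names.index(anchor)
    return names[: i + 1] + [new] + names[i + 1:]


def run_pipeline(names: list[str], state: PipelineState) -> PipelineState:
    """Run the stages strictly in order."""
    for name in names:
        state = make_stage(name)(state)
    return state


extended = insert_after(BASE_STAGES, "plot", "plot_overall")
result = run_pipeline(extended, PipelineState(spec={"stem": "cc_startup"}))
print(result.trace)
```

The base list is never mutated, so the single-run path keeps its original five stages.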

Key Functions (call order)

Orchestrator.run — the 5-stage spine; where a plot_overall stage and a sectioned report call slot in (orchestrator.py:558).
build_run_contexts → comparison_context — fan-out per variant; reads comparison.mode/flag/variants; raw_variants() rejects empty variants (the Boolean blocker) (orchestrator.py:177-207, 217-265).
RunSpecLoader.prepare / set_config_value — merges defaults and applies the per-variant dotted-path override (validated, fail-loud) (pre_run_loader.py:196-250).
Runner.run → make_controller — live run dispatch by controller.name (com here) (runner.py:69-118); writes via standalone Logger (external).
Orchestrator.analyze → analyze_one / comparison_table_frames — per-variant LogView+tables, then the across-variant pivot; per-variant tables are dropped in comparison mode today (orchestrator.py:338-387).
DataSummary.summary_metrics / comparison_table_frames — include:true filtering and the comparison pivot (data_analyzer.py:1436-1527).
Orchestrator.plot → figure_spec / figure_log_view — per-variant figures; figure_log_view wires only ratio/indices zoom, no seconds (orchestrator.py:403-486).
SignalAnalysis.plot_axis / key_plot_lines — builds PlotAxis/PlotLine from one view; the overlay reuses this per variant with a variant label/style (data_analyzer.py:1045-1100).
Plotter.plot_configured_figures + figure_spec_from_fields — external/standalone; renders whatever axes it is given (inference: can render multi-line overlay axes; no Plotter change needed) (orchestrator.py:409-419).
Orchestrator.report → Reporter.document_tex — flat report assembly; the sectioning target (orchestrator.py:539-556, star_reporter.py:448-486).

Findings

F1 — Boolean comparison cannot omit variants (contradicts the in-file note). Evidence: cc_startup.yaml:13-15 comments out variants with # CLAUDE: Eliminate if possible - redundant for a Bool comparison, but raw_variants() raises "Comparison specs must define variants." (orchestrator.py:204-207). Impact: the required Boolean comparison fails to run as written. Next step: in build_run_contexts, when mode == BOOLEAN and variants are empty, synthesize [false, true].
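A minimal sketch of the proposed default, assuming the mode is an enum with a BOOLEAN member and variants arrive as an optional list. ComparisonMode, its values, and resolve_variants are hypothetical; only the error message is taken from the evidence above.

```python
from enum import Enum
from typing import Any, Optional, Sequence


class ComparisonMode(Enum):
    """Hypothetical mode enum; the real one lives in the orchestrator."""
    FLOAT = "float"
    BOOLEAN = "boolean"


def resolve_variants(mode: ComparisonMode,
                     variants: Optional[Sequence[Any]]) -> list[Any]:
    """Return explicit variants, or the implied two-value Boolean sweep."""
    if variants:
        return list(variants)
    if mode is ComparisonMode.BOOLEAN:
        # A Boolean flag has exactly two values, so the sweep is implied.
        return [False, True]
    # Float sweeps have no sensible default: keep failing loudly.
    raise ValueError("Comparison specs must define variants.")


print(resolve_variants(ComparisonMode.BOOLEAN, None))
print(resolve_variants(ComparisonMode.FLOAT, [0.5, 0.1, 0.05, 0]))
```

Explicit variants still win in Boolean mode, so a spec that wants only [true] can say so.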

F2 — No “overlay across variants” figure stage. Evidence: plot() loops analysis.analyses and emits per-variant figures only; nothing combines views (orchestrator.py:403-419). Impact: the “zoomed p_c, all variants on one axis” image cannot be produced. Next step: add plot_overall(analysis) building one PlotAxis with a PlotLine per variant (label = variant), driven by a comparison.overlay YAML block (key + zoom); render via existing Plotter.
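The overlay next step can be sketched as follows, with plain dicts standing in for each variant's LogView. The PlotLine and PlotAxis shapes here are simplified assumptions, not the real classes in data_analyzer.py, and overlay_axis is a hypothetical helper.

```python
from dataclasses import dataclass, field


@dataclass
class PlotLine:
    """Simplified stand-in: one labelled trace."""
    label: str
    x: list[float]
    y: list[float]


@dataclass
class PlotAxis:
    """Simplified stand-in: one axis holding several lines."""
    title: str
    lines: list[PlotLine] = field(default_factory=list)


def overlay_axis(views: dict[str, dict[str, list[float]]],
                 key: str,
                 window: tuple[float, float]) -> PlotAxis:
    """Build one axis with one line per variant, clipped to a time window.

    views maps a variant label to a dict holding a time vector "t" and
    the signal stored under key (a stand-in for a LogView).
    """
    start, stop = window
    axis = PlotAxis(title=f"{key} overlay")
    for label, view in views.items():
        pairs = [(t, v) for t, v in zip(view["t"], view[key]) if start <= t <= stop]
        axis.lines.append(PlotLine(label=label,
                                   x=[t for t, _ in pairs],
                                   y=[v for _, v in pairs]))
    return axis


views = {
    "0.5": {"t": [0.0, 5.0, 10.0, 15.0], "p_c": [0.0, 1.0, 2.0, 3.0]},
    "0.1": {"t": [0.0, 5.0, 10.0, 15.0], "p_c": [0.0, 2.0, 2.5, 2.6]},
}
axis = overlay_axis(views, key="p_c", window=(0.0, 10.0))
print([line.label for line in axis.lines])
```

Because the key and the window are arguments, the same stage serves any signal and any zoom, which is the future-proofing claimed above.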

F3 — Report is flat; no Introduction / per-variant sections; per-variant tables discarded. Evidence: document_tex emits fixed Conclusions/Tables/Figures (star_reporter.py:448-486); comparison analyze() keeps only comparison_table_frames, dropping each AnalysisWork.tables (orchestrator.py:340-345). Impact: the required “Introduction + a section per variant” structure is unsupported, and variant tables never reach the report. Next step: add a Section(title, tables, figures) model rendered in order by Reporter; orchestrator assembles Introduction (comparison table + overlay figs) + one section per variant (that variant’s tables + figs). Flat path = one default section (backwards compatible).
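A minimal sketch of the sectioned document model, assuming tables arrive as pre-rendered .tex strings and figures as image paths. Section matches the shape named in the next step; section_tex, document_body, and comparison_sections are hypothetical helpers, and LaTeX escaping of titles is deliberately left out.

```python
from dataclasses import dataclass, field


@dataclass
class Section:
    title: str
    tables: list[str] = field(default_factory=list)   # pre-rendered table .tex
    figures: list[str] = field(default_factory=list)  # image paths


def section_tex(section: Section) -> str:
    """Render one section: heading, then tables, then figures."""
    lines = [rf"\section{{{section.title}}}"]
    lines.extend(section.tables)
    lines.extend(rf"\includegraphics[width=\linewidth]{{{path}}}"
                 for path in section.figures)
    return "\n".join(lines)


def document_body(sections: list[Section]) -> str:
    """Render sections strictly in the order given."""
    return "\n\n".join(section_tex(s) for s in sections)


def comparison_sections(comparison_tables: list[str],
                        overlay_figures: list[str],
                        variants: list[tuple[str, list[str], list[str]]]) -> list[Section]:
    """Introduction first, then one section per (label, tables, figures)."""
    intro = Section("Introduction", comparison_tables, overlay_figures)
    return [intro] + [Section(f"Variant {label}", tables, figures)
                      for label, tables, figures in variants]


sections = comparison_sections(
    ["% comparison table"], ["overlay_p_c.png"],
    [("false", ["% tables (false)"], ["false_panels.png"]),
     ("true", ["% tables (true)"], ["true_panels.png"])],
)
body = document_body(sections)
flat = document_body([Section("Report", ["% tables"], ["fig.png"])])
print(body)
```

The flat report is the last call: a list with a single section, which is why the existing path can stay backwards compatible.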

F4 — Figure zoom supports ratio/indices only, not seconds. Evidence: figure_log_view handles zoom.include_window.ratio/indices; LogView.include_between (seconds) exists but is unwired here (orchestrator.py:479-486, data_analyzer.py:452-454). Impact: “first 10 s” must be hand-computed as a ratio and silently breaks if debug_time_limit changes. Next step: wire zoom.include_window.seconds: [a, b] to include_between for duration-robust zoom.
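The fragility is easy to see with a little arithmetic. The sketch below assumes a ratio window is a fraction of the run duration and that a seconds window is closed on both ends; both helper names are hypothetical and the boundary handling of the real LogView.include_between was not inspected.

```python
def ratio_window_seconds(ratio: tuple[float, float],
                         duration: float) -> tuple[float, float]:
    """Convert a ratio window into the seconds it actually covers."""
    return (ratio[0] * duration, ratio[1] * duration)


def seconds_window_indices(times: list[float],
                           window: tuple[float, float]) -> list[int]:
    """Indices of samples inside a closed [start, stop] seconds window."""
    start, stop = window
    return [i for i, t in enumerate(times) if start <= t <= stop]


RATIO = (0.0, 0.3333)  # hand-computed as 10 s / 30 s

short_run = ratio_window_seconds(RATIO, 30.0)  # about 10 s, as intended
long_run = ratio_window_seconds(RATIO, 60.0)   # about 20 s: silently wrong

times = [0.0, 5.0, 10.0, 15.0, 20.0]
first_ten = seconds_window_indices(times, (0.0, 10.0))

print(short_run, long_run, first_ten)
```

A seconds window selects the same samples whatever debug_time_limit is, whereas the ratio has to be recomputed every time the duration changes.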

F5 — Standing cleanup / duplication / stale spec fields (fold into tasks).
- plot_kind: reduction appears in every comparison YAML but is never read in orchestrator.py (grep: only plot_both matches). Stale, or consumed only by external Plotter (inference): confirm, then drop or document.
- Duplicated summary_group_for_spec, defined identically on SignalAnalysis (data_analyzer.py:612-617) and DataSummary (data_analyzer.py:1445-1450): collapse to one owner.
- Dead comment # result = self.run_single(...) (orchestrator.py:300).
- conclusions_from_tables grabs the first table’s first value cell as the run “conclusion” (orchestrator.py:389-394): brittle; revisit under the section model.
- ReportFigure carries a TODO to render its own .tex section (star_reporter.py:413): natural to do alongside F3 so Section items self-render.

Proposed Next Tasks

Order: T1 → T2 → T3 → T4, then cleanup T5. Each is independently testable (py_compile + a focused smoke), per CLAUDE.md “smallest convincing validation.”

T1 — Author the two input YAMLs (no code) and default Boolean variants

Body: Set ff_startup_timescale.yaml comparison.variants to [0.5, 0.1, 0.05, 0] and add a figures: block: figure A = three panels [p_c],[v_c],[f_c] with sharex: true; figure B = one panel [p_c] with a first-10 s zoom. Add the same figures: block to cc_startup.yaml (keys p_c, v_c, f_c). Then make build_run_contexts synthesize [false, true] when mode == BOOLEAN and variants is empty (Finding F1), honoring the cc_startup note.
Modify: analysis/orchestrator.py (Boolean variant default); analysis/analysis_YAMLs/ff_startup_timescale.yaml; analysis/analysis_YAMLs/cc_startup.yaml.
Inspect, do not modify: analysis/pre_run_loader.py, YAMLs_by_domain/metrics.yaml.
Create: none.
Validation: Orchestrator("cc_startup").build_run_contexts() yields two contexts (false,true) with no variants: in the YAML; ff_startup_timescale yields four. py_compile passes.

T2 — Add the `plot_overall` overlay stage (Finding F2)

Body: Add Orchestrator.plot_overall(analysis) that, for a configured comparison.overlay block (key + zoom window), builds one PlotAxis with one PlotLine per variant view (label = variant label), one figure_spec, and calls the existing Plotter once. Call it from run() after plot(); return its paths alongside the per-variant paths. No Plotter edits.
Modify: analysis/orchestrator.py.
Inspect, do not modify: analysis/data_analyzer.py (reuse key_plot_lines/plot_axis), analysis/plotter.py (entry points only).
Create: none (overlay block lives in the T1 YAMLs).
Validation: focused smoke over two stored logs produces one overlay PNG containing N variant lines for p_c, zoomed.

T3 — Sectioned report model (Finding F3) + seconds zoom (Finding F4)

Body: Give Reporter an ordered Section(title, tables, figures) list and render sections in order; keep the flat path as a single default section. Thread each variant’s AnalysisWork.tables through analyze() so they survive comparison mode. In Orchestrator.report, assemble Introduction (comparison table + overlay figures) + one section per variant (that variant’s tables + per-variant figures). Wire zoom.include_window.seconds to LogView.include_between for the first-10 s zoom.
Modify: analysis/star_reporter.py, analysis/orchestrator.py.
Inspect, do not modify: analysis/data_analyzer.py.
Create: none.
Validation: reporter_smoke-style test renders a .tex with an Introduction section plus one section per variant, each carrying its own tables and figures; existing flat smoke still passes.

T4 — End-to-end comparison smoke for both YAMLs

Body: Run ff_startup_timescale and cc_startup end-to-end (prefer from_log/stored npz or a short debug_time_limit) and confirm: per-variant 3-panel + zoomed-p_c images, one overlay image, the comparison reduction table (metrics with include:true), and a sectioned report. Capture one artifact under generated_reports/analysis/.
Modify: none (a runnable smoke spec/section may live in the YAMLs).
Inspect, do not modify: all stage modules.
Create: generated_reports/analysis/comparison_pipeline_smoke.md (artifact log).
Validation: both reports compile (or .tex renders if LaTeX absent); image and table counts match the requirement list.

T5 — Standing cleanup (Finding F5)

Body: Remove/confirm the unused plot_kind spec field across comparison YAMLs; collapse the duplicated summary_group_for_spec to one owner; delete the dead run_single comment; let ReportFigure/Section items self-render their .tex (retires the star_reporter.py:413 TODO); reassess conclusions_from_tables under the section model. Behavior-preserving.
Modify: analysis/orchestrator.py, analysis/data_analyzer.py, analysis/star_reporter.py, comparison YAMLs.
Inspect, do not modify: analysis/plotter.py (to confirm whether plot_kind is read there before dropping).
Create: none.
Validation: all module smokes (orchestrator_smoke, reporter_smoke, data_analyzer smokes) still pass; grep shows no remaining plot_kind readers.

No code was modified during this investigation. plotter.py/logger.py internals were not explored; statements about them are labelled inference.

Result Summary

T1 done (2026-06-05).
Files edited:
- analysis/orchestrator.py: raw_variants(mode) now synthesizes [False, True] when mode == BOOLEAN and variants is empty (Finding F1); build_run_contexts passes mode.
- analysis/analysis_YAMLs/ff_startup_timescale.yaml: variants: [0.5, 0.1, 0.05, 0]; added figures: (3-panel p_c/v_c/f_c sharex + p_c first-10 s zoom).
- analysis/analysis_YAMLs/cc_startup.yaml: added the same figures: block (variants stay commented out).
Validated: py_compile OK; cc_startup → 2 contexts (false,true) with no variants:; ff_startup_timescale → 4.
Notes / follow-ups:
(1) Zoom uses ratio: [0.0, 0.3333] (= 10 s / 30 s) because seconds-based zoom is T6/F4; revisit there.
(2) cc_startup keeps save_figures/save_report: false, so its figures stay inert until T7 enables outputs.
(3) all_tasks.md has a duplicated TASK_4–8 block; only the first TASK_4 was marked. Worth de-duplicating.

Result Summary

T5 done (2026-06-05), covering the plot_overall overlay stage planned as T2 (Finding F2). File edited: analysis/orchestrator.py (only).
- New stage plot_overall(analysis) + helpers (comparison_overlay, overlay_figure_spec, overlay_variant_line, zoom_view, overall_figure_dir): builds one PlotAxis with one PlotLine per variant view (label = variant) for the configured comparison.overlay key, applies its zoom, and renders one figure via the existing Plotter (no Plotter edits). No-ops when overlay is unset or save_figures is off.
- Wired into run() after plot(); overlay paths are concatenated into figure_paths. Added overlay_smoke() to __main__.
- zoom_view reads {ratio|indices} tolerantly via object_fields (the overlay block isn’t in the merged figure schema); minor overlap with figure_log_view, noted for T8 cleanup.
Validated: py_compile OK; overlay_smoke over two stored logs → one overlay_p_c.png (43 KB) with 2 zoomed variant lines; existing orchestrator_smoke still passes.
Follow-up: the overlay block is not yet in the YAMLs (TASK_5 is orchestrator-only):

TITLE: Add comparison.overlay block to the two comparison YAMLs
BODY: Add comparison.overlay: {key: p_c, zoom: {include_window: {ratio: [0.0, 0.3333]}}} to ff_startup_timescale.yaml and cc_startup.yaml so plot_overall activates in real runs. Fold into T7’s end-to-end smoke if convenient.
Modify: analysis/analysis_YAMLs/ff_startup_timescale.yaml, analysis/analysis_YAMLs/cc_startup.yaml.
Validation: Orchestrator(stem).comparison_overlay() returns the block; an end-to-end run emits overlay_p_c.png.

Result Summary

T6 done (2026-06-05), covering the sectioned report and seconds zoom planned as T3 (Findings F3, F4). Files edited: analysis/star_reporter.py, analysis/orchestrator.py.
- star_reporter.py: new ReportSection(title, tables, figures) + report_sections() normalizer; Reporter gains a sections= arg and renders sections in order (section_block); the flat tables/figures path is kept as the single-section default. Added sectioned_reporter_smoke().
- orchestrator.py: plot() now returns one Package(label, context, figure_paths) per variant (+ flat_figure_paths helper) so figures stay with their owner; report()/build_reporter()/report_sections() assemble Introduction (comparison table + overlay figs) + one section per variant (its tables + per-variant figs) for comparisons, flat for single runs; figure_log_view and zoom_view gained a tolerant seconds branch → LogView.include_between (F4).
- Per-variant tables already survived via analysis.analyses[i].tables; report now consumes them (no analyze() change needed).
Validated: py_compile OK; flat reporter_smoke + new sectioned_reporter_smoke pass; orchestrator/overlay smokes pass; end-to-end probe renders Intro + 2 variant sections, and a seconds:[0,10] view clips to 9.99 s.
Follow-up: the YAMLs still use ratio: [0.0, 0.3333] (seconds is wired, but the YAMLs were not in T6 scope); switch figure B + overlay to seconds: [0, 10] in T7/T8 for duration-robustness.

Result Summary

T7 done (2026-06-05), covering the end-to-end smoke planned as T4. Both comparisons run live end-to-end → real PDFs. Artifact: generated_reports/analysis/comparison_pipeline_smoke.md.
- ff_startup_timescale: 4 variants → 8 per-variant figs + 1 overlay, 2 comparison tables, Introduction + 4 variant sections, 684 KB PDF.
- cc_startup: 2 variants → 4 + 1 figs, Intro + 2 sections, 384 KB PDF.
- Smoke-spec edits in the YAMLs (allowed by T7): added comparison.overlay (key p_c, seconds:[0,10]) and switched figure B to seconds:[0,10] in both; cc_startup outputs enabled (figures/report/pdf); fixed the ff flag → controller.com.feedforward.startup.time_scale (it was missing the controller.com. prefix and failed the override resolver).
- Blocker found + resolved per user: exp_scale(t, tau) divides by zero at tau=0 (hit by ff’s 0 variant and by cc’s enable=true via controller.com.startup.time_scale=0.0). Per user instruction, those values were set to 0.001 (ff variant 0→0.001; parameters.yaml controller.com.startup.time_scale 0.0→0.001) instead of guarding exp_scale.
Validated: both reports compile; image/table counts match the requirement list (9/13 and 5/7 includegraphics/tables).
Follow-ups for T8:
(1) Cleaner root fix: exp_scale(tau<=0)→1.0.
(2) The comparison pivot excludes paired tracking errors (include_logged=False); decide whether p_c/v_c reductions should appear cross-variant.
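The root fix in follow-up (1) can be sketched as below. The body of exp_scale was not inspected, so the saturating form 1 - exp(-t / tau) is an assumption chosen only because its limit as tau approaches zero from above is 1.0 for t > 0, which matches the proposed guard value.

```python
import math


def exp_scale(t: float, tau: float) -> float:
    """Exponential ramp from 0 toward 1 with time constant tau.

    ASSUMPTION: the real exp_scale may use a different formula; only the
    guard for tau <= 0 is the point of this sketch.
    """
    if tau <= 0.0:
        # A zero (or negative) time constant means "no ramp": fully on.
        return 1.0
    return 1.0 - math.exp(-t / tau)


print(exp_scale(1.0, 0.0))   # guarded: no ZeroDivisionError
print(exp_scale(1.0, 0.001)) # the 0.001 workaround is already saturated
```

With the guard in place, the ff variant 0 and the parameters.yaml time_scale of 0.0 could be restored instead of being nudged to 0.001.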

Result Summary

T8 done (2026-06-05), covering the standing cleanup planned as T5 (Finding F5). Files edited: analysis/star_reporter.py, analysis/data_analyzer.py, analysis/orchestrator.py, comparison YAMLs.
- Minipages (star_reporter.py): each sectioned-report \section now lays tables and figures in two side-by-side 0.45\textwidth minipages, each with its own \captionof{table/figure} + \label; figures use width=\linewidth,height=0.45\textheight,keepaspectratio (full image, bounded height); added \usepackage{caption}. ReportTable/ReportFigure/ReportSection now self-render (minipage_tex/render), retiring the ReportFigure TODO.
- Cleanup: removed dead plot_kind from all comparison YAMLs (grep-clean, no code readers); collapsed the duplicated summary_group_for_spec to one module-level owner in data_analyzer.py; deleted the dead run_single comment in orchestrator.py; reassessed conclusions_from_tables (kept, behavior-preserving, commented).
Validated: py_compile OK; orchestrator_smoke, flat + sectioned reporter_smoke, and data_analyzer smokes pass; both comparisons compile end-to-end (ff: 10 minipages/680 KB PDF, cc: 6 minipages/380 KB PDF); grep shows no remaining plot_kind.
Open follow-ups (not in T8 scope): the root exp_scale(tau<=0)→1.0 guard, and whether the cross-variant pivot should include paired tracking errors.
