Chance-Constrained Dynamic Programming with Application to Risk-Aware Robotic Space Exploration

Authors: Masahiro Ono, Marco Pavone, Yoshiaki Kuwata, J. (Bob) Balaram · Year: 2015 · Venue: Autonomous Robots, 39(4):555–571

Raw full-text conversion pending — the LAN/marker cluster is down (2026-07-04). Built from the abstract + §1 (intro/contributions) and §3 (problem + Boole reformulation) read in full, with §2 (preliminaries) and §4–5 (algorithm/experiments) read in part, from the author-hosted PDF via firecrawl (web.stanford.edu/~pavone/papers/Ono.Pavone.ea.AURO14.pdf); the scraped PDF's display equations are partly mangled, so equations below are transcribed conservatively. File the PDF to Docs/raw/pdf/ono2015chance.pdf and run marker when the cluster returns.

Summary

This is the canonical space-robotics precedent for joint chance-constrained planning with violation-probability semantics (co-authored by Pavone’s Stanford ASL and JPL). Standard constrained dynamic programming (and constrained MDP theory) requires constraints to share the additive structure of the cost — an expectation of a sum of one-stage terms — which a joint chance constraint does not have. The paper bridges this gap: it conservatively reformulates the joint chance constraint, via Boole’s inequality, into a bound on the expectation of a sum of indicator random variables (now additive), folds that into the cost through a Lagrangian dual, solves the primal for a fixed dual variable by ordinary Bellman recursion, and optimizes the single dual variable by Brent’s root-finding (exponentially convergent). Suboptimality bounds on the primal and dual objectives give rigorous stopping criteria. It is demonstrated on path planning, a Mars entry-descent-landing (EDL) problem, and a Lunar landing problem, using real Mars terrain with ~4 million discrete states per step (simulation only).

Key Claims

  • Joint chance constraints break additive constrained DP. Existing constrained-MDP tools assume additive constraint structure; a joint chance constraint (probability the whole trajectory stays feasible) is non-additive, so those tools do not apply directly.
  • Boole + indicator reformulation (conservative). Define if (state infeasible at step ), else . By Boole’s inequality, bounding is a sufficient condition for the joint chance constraint (Theorem 1: a feasible solution of the approximation is feasible for the original). The reformulated constraint is additive, hence a standard constrained MDP.
  • Small, quantified conservatism. Under an independent-failure assumption, the Boole gap is ; since mission risk bounds are small, the added conservatism is moderate, and (per prior work) much smaller than competing approximations.
  • Dual DP with root-finding. For a fixed dual variable , the Lagrangian is minimized by ordinary DP (Bellman recursion); the scalar dual variable is then optimized by Brent’s method, which converges exponentially and whose per-iteration complexity is independent of the primal problem size — typically 10–30 iterations.
  • Rigorous suboptimality bounds. Because the dual objective is generally non-differentiable, they derive explicit bounds on the suboptimality of both primal and dual objective values, yielding principled stopping criteria for the root-finder.
  • Validated on real space problems. Path planning, Mars EDL, and Lunar landing; Mars runs use real terrain with ~4M discrete states/step, supporting the claims of computational efficiency and low conservatism (near-optimality). All results are in simulation.
  • Violation-probability semantics only. The formulation bounds the probability of constraint violation and carries no notion of violation severity / tail magnitude — there is no CVaR or coherent-risk content. This is exactly the chance-constraint pole of the CVaR-vs-chance comparison.

Method

Discrete-time stochastic system , , , state fully observed, horizon . Objective: minimize over policy subject to the joint chance constraint at risk bound . The reformulation replaces the joint constraint by and applies Lagrangian duality; the dual function is concave even when the primal is nonconvex, and weak duality bounds the gap. The chain of inequalities (Theorem 1 proof) is: .

Regime. Planetary EDL / rover / lander guidance — a genuine spacecraft-GNC precedent, but not a manipulator and carrying no base-arm dynamic coupling; it is neither free-flying nor free-floating in our sense. Its transferable content is entirely at the risk/planning layer.

Relevance to thesis

This anchors the chance-constraint side of the CVaR-vs-chance-constraint comparison with a real space precedent: on-board risk management for space missions, formulated (as MER EDL was, at 91%/96% success bounds) as a bound on violation probability. Two things carry over to risk-aware view scoring. (i) The Boole/indicator + Lagrangian-dual DP machinery is a template for imposing a whole-sortie joint chance constraint on an inspection trajectory (many viewpoints, many keep-out surfaces) while keeping the per-step DP/Bellman structure — and it exposes the risk-allocation question (how Boole splits one budget across steps), the DP analogue of the allocation in ren2022chance. (ii) Its central limitation is the thesis motivation: chance constraints are blind to violation severity; later JPL CVaR work (Ahmadi/Ono) cites precisely this probability-only limitation as the reason to move to coherent/tail risk. For a free-flying manipulator inspecting under camera-pose uncertainty, where a severe collision is categorically worse than a marginal one, that is the gap CVaR (conditional_value_at_risk) fills.

Connections

Topics: chance_constraints · motion_planning · model_predictive_control · receding_horizon_control Sources: ren2022chance (per-mode chance constraints + risk allocation) · dixit2023risk (coherent-risk generalization) · majumdar2017how (why probability-only is tail-blind) · conditional_value_at_risk

Key Equations / Quotes

Joint chance constraint and its Boole/indicator sufficient condition (Problem 1 → Problem 2):

“we (conservatively) reformulate a joint chance constraint as a constraint on the expectation of a summation of indicator random variables, which can be incorporated into the cost function by considering a dual formulation… the primal variables can be optimized by standard dynamic programming, while the dual variable is optimized by a root-finding algorithm that converges exponentially.” (Abstract)

“risk requirements are formulated as bounds on the probability of success, as exemplified by the 91% and 96% bounds… for MER EDL.” (§1)

Open Questions

  • Boole’s inequality allocates one budget across steps; for a multi-view inspection sortie with many keep-out surfaces, how should the per-step/per-surface budget be split without compounding conservatism (cf. the allocation in ren2022chance)?
  • The method needs a discretized state space (4M states/step here); a 6-DOF free-flying base plus redundant arm has a far larger continuous state — does the dual-DP approach survive without prohibitive discretization?
  • What is the right CVaR analogue of this joint (whole-trajectory) chance constraint, and does the conservatism ordering survive a Boole-style decomposition of a CVaR budget?