Risk-averse Receding Horizon Motion Planning for Obstacle Avoidance using Coherent Risk Measures
Authors: Dixit, Ahmadi, Burdick · Year: 2023 · Venue: Artificial Intelligence (Elsevier) / arXiv
Raw: md
Summary
The paper develops a risk-averse model predictive control (MPC) framework for dynamic obstacle avoidance of a discrete-time linear system subject to both additive process noise and measurement noise (uncertain obstacle rotation/translation). Constraints and cost are posed as bounds on a general coherent risk measure (CVaR, EVaR, total-variation distance, and other f-divergence / g-entropic measures), exploiting the dual “worst-case expectation over a risk envelope” representation. The non-convex risk-constrained problem is reformulated, via convex conjugacy and a Big-M disjunctive encoding of the non-convex safe set, into a convex mixed-integer program with constraint-tightening that removes the exponential-in-horizon blow-up. The authors prove risk-sensitive recursive feasibility and finite-time task completion (each in probability/confidence) and validate on 2D numerical examples.
Key Claims
- Any coherent risk constraint satisfying the dual representation (Assumption 4) admits an exact reformulation (Lemma 2) as a finite convex program in dual variables using the convex conjugate of the risk-envelope describing function .
- Constraint tightening (Lemma 1) makes state/control/terminal risk constraints depend only on , which is disturbance-policy-independent and computable offline, avoiding online evaluation that would cost constraints.
- Introducing the auxiliary worst-case variable with gives a conservative inner approximation that reduces the obstacle-constraint sample cardinality from to , so constraint count no longer grows exponentially with horizon. The approximation is exact at .
- Recursive feasibility (Prop. 1): if feasible at it is feasible at with confidence ; the infeasibility probability is bounded via Cantelli’s inequality and the union bound, after Bonferroni risk-allocation .
- Finite-time task completion (Prop. 2): the cost decreases by at least 1 per step (via a binary task-completion state ), so waypoint is reached in at most steps per leg, with overall confidence .
- Empirically (2D system, 50 Monte Carlo runs): ordering and holds; higher (more conservative) lowers infeasibility (e.g. CVaR 5.3%→2.7%, EVaR→0%, TVD 0% at ); exact CVaR cost prioritizes task completion but is ~13× slower (83.68 s vs 6.32 s/iteration).
Method
Linear discrete-time plant (Eq. eq:sys):
Process noise is i.i.d. from a discrete pmf (Assumption 2); full-state measurement and full-rank (Assumption 1). Each obstacle is a convex polytope subject to random rotation and translation drawn from a discrete joint pmf (Assumption 3). The safety constraint bounds the coherent risk of the distance-to-safe-set, .
Control is a simplified affine disturbance feedback (SADF) policy, , with decision variables ; this is less conservative than open-loop MPC (a special case with ) and is linear in the optimization variables (unlike affine state feedback). Coherent risk measures (Definition 1: monotonicity, translation invariance, positive homogeneity, subadditivity) are used in dual form (Definition 2 / Assumption 4):
Specific envelopes are given for CVaR (Radon–Nikodym bound ), EVaR (KL-divergence epigraph, exponential cone), and TVD (total-variation ball). The non-convex safe set is handled with a Big-M disjunctive reformulation introducing binaries , yielding a convex MIP whose local optima still satisfy the original risk constraints.
Regime: This is a generic linear-system MPC obstacle-avoidance paper; it is neither free-flying nor free-floating — it carries no manipulator dynamics, no base-arm dynamic coupling, and no reaction/momentum bookkeeping. The illustrative platform is a planar drone. Relevance to a free-flying space manipulator is purely at the risk/planning layer, not the dynamics model.
Relevance to thesis
This is a template for the risk layer sitting atop nominal guidance/control of the free-flying manipulator. Three transferable ideas: (i) the general coherent-risk dual reformulation lets us swap CVaR/EVaR/TVD without re-deriving the optimizer — useful for trading conservatism against feasibility on collision-risk constraints during inspection; (ii) the offline constraint-tightening on and the trick are practical tools to keep a receding-horizon problem tractable under disturbance growth; (iii) the recursive-feasibility / finite-time-completion proofs (Cantelli bound + Bonferroni risk allocation) give a probabilistic safety certificate structure we could adapt. Caveat: the linear-plant assumption is restrictive — a 6-DOF free-flying base plus arm is strongly nonlinear and coupled, so the dynamics would need linearization or a different propagation of risk through the manipulator Jacobian before these results apply.
Connections
Topics: conditional_value_at_risk, coherent_risk_measures, chance_constraints, model_predictive_control
Key Equations / Quotes
“Coherent risk measures can be expressed as a distributionally-robust expectation, i.e, the risk is equivalently expressed as the worst-case expectation over a convex, closed set of distributions.” (Introduction)
CVaR (Eq. eq:cvar_def):
EVaR (Eq. eq:evar):
Conjugate-form safety reformulation (Lemma 2 / Eq. eq:conjugate):
Tightened state constraint (Lemma 1, Eq. eq:state_tighten):
Risk allocation (Remark 1): .
Open Questions
- -step reachability of the waypoints from the higher-level (A*/RRT) planner is assumed, not analyzed; the authors flag it as future work. How would this interact with a nonholonomic-free attitude path for a 6-DOF base?
- Extension beyond linear, time-invariant plants: the footnote notes that LTV systems break the SADF equivalence and require the non-simplified disturbance-feedback policy — what is the cost/feasibility penalty?
- The recursive-feasibility bound (Cantelli) is admittedly loose; the paper reports actual infeasibility far below the bound but the tightness is “unknown.” Can a sharper certificate be obtained for specific risk measures?
- All uncertainty is modeled as finite discrete pmfs; scaling to high-cardinality or continuous disturbance distributions (relevant to real sensor/actuator noise on a spacecraft) is not addressed.