Risk-averse Receding Horizon Motion Planning for Obstacle Avoidance using Coherent Risk Measures

Authors: Dixit, Ahmadi, Burdick · Year: 2023 · Venue: Artificial Intelligence (Elsevier) / arXiv

Summary

The paper develops a risk-averse model predictive control (MPC) framework for dynamic obstacle avoidance of a discrete-time linear system subject to both additive process noise and measurement noise (uncertain obstacle rotation/translation). Constraints and cost are posed as bounds on a general coherent risk measure (CVaR, EVaR, total-variation distance, and other f-divergence / g-entropic measures), exploiting the dual “worst-case expectation over a risk envelope” representation. The non-convex risk-constrained problem is reformulated, via convex conjugacy and a Big-M disjunctive encoding of the non-convex safe set, into a convex mixed-integer program with constraint-tightening that removes the exponential-in-horizon blow-up. The authors prove risk-sensitive recursive feasibility and finite-time task completion (each in probability/confidence) and validate on 2D numerical examples.

Key Claims

Any coherent risk constraint satisfying the dual representation (Assumption 4) admits an exact reformulation (Lemma 2) as a finite convex program in dual variables $(λ_{1}, λ_{2}, ν, h_{l, k})$ using the convex conjugate $g^{*}$ of the risk-envelope describing function $g$ .
Constraint tightening (Lemma 1) makes state/control/terminal risk constraints depend only on $ρ (∣ δ ∣)$ , which is disturbance-policy-independent and computable offline, avoiding online evaluation that would cost $O ((J_{δ})^{N})$ constraints.
Introducing the auxiliary worst-case variable $δ_{\max, k}$ with $P (δ_{\max, k} \leq x) = P (∣ δ_{1} ∣ \leq x)^{k}$ gives a conservative inner approximation that reduces the obstacle-constraint sample cardinality from $J = (J_{δ})^{k} J_{o}$ to $J = J_{δ} J_{o}$ , so the constraint count no longer grows exponentially with the horizon. The approximation is exact at $k = 1$ .
Recursive feasibility (Prop. 1): if the problem is feasible at $t$ , it is feasible at $t + 1$ with confidence $α$ ; the infeasibility probability is bounded via Cantelli’s inequality and the union bound, $P (\text{infeasible}) \leq 1 - α$ , after Bonferroni risk allocation $1 - α^{'} = (1 - α) / (L + 1)$ .
Finite-time task completion (Prop. 2): the cost decreases by at least 1 per step (via a binary task-completion state $ψ$ ), so waypoint $w_{K}$ is reached in at most $⌈ J_{t}^{*} ⌉$ steps per leg, with overall confidence $α^{J_{0}^{K - 1}}$ .
Empirically (2D system, 50 Monte Carlo runs): ordering $VaR \leq CVaR \leq EVaR$ and $CVaR \leq TVD$ holds; higher $α$ (more conservative) lowers infeasibility (e.g. CVaR 5.3%→2.7%, EVaR→0%, TVD 0% at $α = 0.9$ ); exact CVaR cost prioritizes task completion but is ~13× slower (83.68 s vs 6.32 s/iteration).
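The $δ_{\max, k}$ claim above rests on the identity $P (δ_{\max, k} \leq x) = P (∣ δ_{1} ∣ \leq x)^{k}$ for i.i.d. noise. The following is a minimal sketch that checks this identity numerically against brute-force enumeration of all $(J_{δ})^{k}$ joint samples. The pmf values and the function names are illustrative assumptions, not taken from the paper, and the sketch does not reproduce the paper's constraint construction.

```python
from itertools import product


def max_abs_pmf(pmf, k):
    """pmf of max(|d_1|, ..., |d_k|) for k i.i.d. draws, using F_max(x) = F(x)**k."""
    # Collapse the pmf of delta onto the pmf of |delta|.
    abs_pmf = {}
    for value, prob in pmf.items():
        abs_pmf[abs(value)] = abs_pmf.get(abs(value), 0.0) + prob
    out, cdf, prev_pow = {}, 0.0, 0.0
    for x in sorted(abs_pmf):
        cdf += abs_pmf[x]
        out[x] = cdf ** k - prev_pow  # P(max = x) = F(x)^k - F(x-)^k
        prev_pow = cdf ** k
    return out


def max_abs_pmf_bruteforce(pmf, k):
    """Same pmf, obtained by enumerating every joint sample (J_delta ** k of them)."""
    out = {}
    for combo in product(pmf.items(), repeat=k):
        largest = max(abs(value) for value, _ in combo)
        prob = 1.0
        for _, q in combo:
            prob *= q
        out[largest] = out.get(largest, 0.0) + prob
    return out


# Hypothetical scalar noise pmf (illustrative values only).
pmf_delta = {-0.2: 0.1, -0.1: 0.2, 0.0: 0.4, 0.1: 0.2, 0.2: 0.1}
k = 4
fast = max_abs_pmf(pmf_delta, k)
slow = max_abs_pmf_bruteforce(pmf_delta, k)
joint_samples = len(pmf_delta) ** k  # 625 joint samples vs. at most 5 support points
```

Here `fast` and `slow` agree up to floating-point error, while the support of `fast` never exceeds $J_{δ}$ points regardless of `k`.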

Method

Linear discrete-time plant (Eq. eq:sys):
$x (t + 1) = A x (t) + B u (t) + D δ (t), \quad y (t) = C x (t) .$
Process noise $δ$ is i.i.d. from a discrete pmf $p_{δ}$ (Assumption 2); full-state measurement and full-rank $D$ are assumed (Assumption 1). Each obstacle $\bar{O}_{l}$ is a convex polytope subject to random rotation $R_{l}$ and translation $w_{l}$ drawn from a discrete joint pmf $p_{l}$ (Assumption 3). The safety constraint bounds the coherent risk of the distance-to-safe-set, $ρ_{1 - α} [ζ (y (t), S_{l} (t))] \leq ϵ_{l}$ .

Control is a simplified affine disturbance feedback (SADF) policy, $u_{k} = \sum_{m = 0}^{k - 1} K_{k - m} δ_{m} + η_{k}$ , with decision variables $\{K_{N}, η_{N}\}$ ; this is less conservative than open-loop MPC (a special case with $K_{i} = 0$ ) and is linear in the optimization variables (unlike affine state feedback). Coherent risk measures (Definition 1: monotonicity, translation invariance, positive homogeneity, subadditivity) are used in dual form (Definition 2 / Assumption 4):
$ρ (X) = \sup_{Q \in \mathcal{Q}} E_{Q} (X), \quad \mathcal{Q} \subset \{Q ≪ P\} \text{ convex, closed (risk envelope)} .$
Specific envelopes are given for CVaR (Radon–Nikodym bound $d Q / d P \leq 1 / (1 - α)$ ), EVaR (KL-divergence epigraph, exponential cone), and TVD (total-variation ball). The non-convex safe set is handled with a Big-M disjunctive reformulation introducing binaries $γ_{i}^{j} \in \{0, 1\}$ , yielding a convex MIP whose local optima still satisfy the original risk constraints.
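To make the SADF policy concrete, here is a minimal sketch of a rollout for a scalar instance of the plant. The scalar coefficients, the gain and offset values, and the function name `sadf_rollout` are illustrative assumptions. Past disturbances are treated as known, which the note attributes to full-state measurement and full-rank $D$. The sketch checks two properties stated above: with a fixed disturbance sequence the trajectory is affine in the decision variables $(K, η)$, and setting all gains to zero recovers open-loop control.

```python
def sadf_rollout(a, b, d, x0, gains, eta, deltas):
    """Roll out x_{k+1} = a*x_k + b*u_k + d*delta_k under the SADF policy
    u_k = sum_{m=0}^{k-1} K_{k-m} * delta_m + eta_k.

    gains[j] holds K_j for j >= 1 (gains[0] is an unused placeholder).
    """
    xs, us = [x0], []
    for k in range(len(eta)):
        feedback = sum(gains[k - m] * deltas[m] for m in range(k))
        u = eta[k] + feedback
        us.append(u)
        xs.append(a * xs[-1] + b * u + d * deltas[k])
    return xs, us


# Hypothetical scalar plant and one fixed disturbance realization.
a, b, d, x0 = 1.0, 0.5, 1.0, 0.0
deltas = [0.1, -0.2, 0.05]

gains_a, eta_a = [0.0, -0.8, -0.3], [1.0, 0.5, 0.0]
gains_b, eta_b = [0.0, 0.2, 0.1], [0.0, -1.0, 2.0]
gains_mid = [0.5 * (p + q) for p, q in zip(gains_a, gains_b)]
eta_mid = [0.5 * (p + q) for p, q in zip(eta_a, eta_b)]

xs_a, _ = sadf_rollout(a, b, d, x0, gains_a, eta_a, deltas)
xs_b, _ = sadf_rollout(a, b, d, x0, gains_b, eta_b, deltas)
xs_mid, _ = sadf_rollout(a, b, d, x0, gains_mid, eta_mid, deltas)

# Open-loop MPC as the special case K_i = 0.
xs_ol, us_ol = sadf_rollout(a, b, d, x0, [0.0, 0.0, 0.0], eta_a, deltas)
```

Because `xs_mid` equals the average of `xs_a` and `xs_b`, constraints that are convex in the state stay convex in $(K, η)$, which is the practical benefit of disturbance feedback over state feedback noted above.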

Regime: This is a generic linear-system MPC obstacle-avoidance paper; it is neither free-flying nor free-floating — it carries no manipulator dynamics, no base-arm dynamic coupling, and no reaction/momentum bookkeeping. The illustrative platform is a planar drone. Relevance to a free-flying space manipulator is purely at the risk/planning layer, not the dynamics model.

Relevance to thesis

This is a template for the risk layer sitting atop nominal guidance/control of the free-flying manipulator. Three transferable ideas: (i) the general coherent-risk dual reformulation lets us swap CVaR/EVaR/TVD without re-deriving the optimizer, which is useful for trading conservatism against feasibility on collision-risk constraints during inspection; (ii) the offline constraint-tightening on $ρ (∣ δ ∣)$ and the $δ_{\max, k}$ trick are practical tools to keep a receding-horizon problem tractable under disturbance growth; (iii) the recursive-feasibility / finite-time-completion proofs (Cantelli bound + Bonferroni risk allocation) give a probabilistic safety certificate structure we could adapt. Caveat: the linear-plant assumption is restrictive. A 6-DOF free-flying base plus arm is strongly nonlinear and coupled, so the dynamics would need linearization or a different propagation of risk through the manipulator Jacobian before these results apply.

Connections

Topics: conditional_value_at_risk, coherent_risk_measures, chance_constraints, model_predictive_control

Key Equations / Quotes

“Coherent risk measures can be expressed as a distributionally-robust expectation, i.e, the risk is equivalently expressed as the worst-case expectation over a convex, closed set of distributions.” (Introduction)

CVaR (Eq. eq:cvar_def):
$CVaR_{1 - α} (X) := \inf_{z \in \mathbb{R}} E [z + \frac{(X - z)^{+}}{1 - α}] .$
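As a sanity check on the definition, the sketch below evaluates CVaR for a discrete pmf in two ways: through the infimum formula above, and through the dual worst-case expectation over the CVaR risk envelope $d Q / d P \leq 1 / (1 - α)$. The sample outcomes, probabilities, and function names are illustrative assumptions, not data from the paper.

```python
def cvar_primal(xs, ps, alpha):
    """inf_z E[z + (X - z)^+ / (1 - alpha)] for a discrete pmf.

    The objective is convex and piecewise linear in z, so the infimum is
    attained at one of the support points.
    """
    def objective(z):
        excess = sum(p * max(x - z, 0.0) for x, p in zip(xs, ps))
        return z + excess / (1.0 - alpha)
    return min(objective(z) for z in xs)


def cvar_dual(xs, ps, alpha):
    """sup_Q E_Q[X] subject to q_i <= p_i / (1 - alpha) and sum(q) = 1.

    The supremum is reached greedily: push as much mass as the cap allows
    onto the largest outcomes first.
    """
    remaining, total = 1.0, 0.0
    for x, p in sorted(zip(xs, ps), reverse=True):
        q = min(p / (1.0 - alpha), remaining)
        total += q * x
        remaining -= q
    return total


# Hypothetical loss distribution (illustrative values only).
xs = [0.0, 1.0, 2.0, 5.0]
ps = [0.5, 0.3, 0.15, 0.05]
alpha = 0.9

primal = cvar_primal(xs, ps, alpha)
dual = cvar_dual(xs, ps, alpha)
```

For these values both routes give approximately 3.5, the average of the worst 10% of outcomes (half of that tail mass sits on 5 and half on 2).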

EVaR (Eq. eq:evar):
$EVaR_{1 - α} (X) := \inf_{z > 0} [z^{- 1} \ln \frac{E [e^{X z}]}{1 - α}], \quad \lim_{α \to 1} EVaR_{1 - α} (X) = \operatorname{ess\,sup} (X) .$
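The EVaR infimum has no closed form for a general discrete pmf, so the sketch below evaluates it with a one-dimensional golden-section search over $\ln z$. This search method, the bracket `[lo, hi]`, and the sample distribution are assumptions chosen for illustration; the paper instead encodes EVaR through its exponential-cone risk envelope. Since any evaluated $z$ yields an upper bound on the infimum, the returned value never underestimates EVaR, and it is tight when the search converges inside the bracket.

```python
import math


def evar(xs, ps, alpha, lo=1e-4, hi=1e3, iters=200):
    """Approximate inf_{z>0} (1/z) * ln(E[exp(X z)] / (1 - alpha)) for a discrete pmf."""
    def objective(log_z):
        z = math.exp(log_z)
        # Log-sum-exp shift keeps exp() from overflowing for large z.
        shift = max(x * z for x in xs)
        log_mgf = shift + math.log(
            sum(p * math.exp(x * z - shift) for x, p in zip(xs, ps))
        )
        return (log_mgf - math.log(1.0 - alpha)) / z

    # Golden-section search; the objective is unimodal in z (and hence in ln z).
    a, b = math.log(lo), math.log(hi)
    phi = (math.sqrt(5.0) - 1.0) / 2.0
    for _ in range(iters):
        c = b - phi * (b - a)
        d = a + phi * (b - a)
        if objective(c) < objective(d):
            b = d
        else:
            a = c
    return objective(0.5 * (a + b))


# Hypothetical loss distribution (illustrative values only).
xs = [0.0, 1.0, 2.0, 5.0]
ps = [0.5, 0.3, 0.15, 0.05]
evar_value = evar(xs, ps, alpha=0.9)
```

For this distribution the result lies between the CVaR at the same level (3.5) and the largest outcome (5), consistent with the ordering $CVaR \leq EVaR \leq ess\,sup$.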

Conjugate-form safety reformulation (Lemma 2 / Eq. eq:conjugate):
$\min_{λ_{1}, λ_{2}, ν} λ_{2} g^{*} (λ_{2}^{- 1} (p (h_{l, k}^{*} + ν) + λ_{1})) - ν, \quad λ_{1} \succeq 0, \; λ_{2} \geq 0 .$

Tightened state constraint (Lemma 1, Eq. eq:state_tighten):
$f_{x, n}^{⊤} (A^{k} x_{0} + B_{k} η_{k}) + f_{x, n}^{⊤} (B_{k} K_{k} + D_{k})_{1} ρ (∣ δ ∣) \leq ϵ_{x} + g_{x, n} .$

Risk allocation (Remark 1): $1 - α^{'} = (1 - α) / (L + 1)$ .
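A small worked instance of this allocation, assuming $L$ counts the obstacles indexed by $l$ so that the total risk budget $1 - α$ is split evenly across $L + 1$ constraints. The function name and the numbers are illustrative.

```python
def allocate_confidence(alpha, num_obstacles):
    """Per-constraint confidence alpha' from 1 - alpha' = (1 - alpha) / (L + 1)."""
    return 1.0 - (1.0 - alpha) / (num_obstacles + 1)


alpha, num_obstacles = 0.9, 4  # hypothetical overall confidence and obstacle count
alpha_prime = allocate_confidence(alpha, num_obstacles)

# Union bound: L + 1 constraints, each violated with probability at most 1 - alpha'.
total_risk_bound = (num_obstacles + 1) * (1.0 - alpha_prime)
```

With $α = 0.9$ and four obstacles, each constraint is enforced at $α^{'} = 0.98$, and the union bound recovers the overall risk budget of $0.1$.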

Open Questions

$N$ -step reachability of the waypoints from the higher-level (A*/RRT) planner is assumed, not analyzed; the authors flag it as future work. How would this interact with a nonholonomic-free attitude path for a 6-DOF base?
Extension beyond linear, time-invariant plants: the footnote notes that LTV systems break the SADF equivalence and require the non-simplified disturbance-feedback policy — what is the cost/feasibility penalty?
The recursive-feasibility bound (Cantelli) is admittedly loose; the paper reports actual infeasibility far below the bound but the tightness is “unknown.” Can a sharper certificate be obtained for specific risk measures?
All uncertainty is modeled as finite discrete pmfs; scaling to high-cardinality or continuous disturbance distributions (relevant to real sensor/actuator noise on a spacecraft) is not addressed.
