How Should a Robot Assess Risk? Towards an Axiomatic Theory of Risk in Robotics

Authors: Majumdar, Pavone · Year: 2017 · Venue: International Symposium on Robotics Research (ISRR), “Blue Sky” session (preprint, arXiv:1710.11040)

Summary

The paper proposes an axiomatic framework for choosing a risk metric in robotics, paralleling the finance community’s development of coherent risk metrics. It argues that any sensible risk metric should satisfy six axioms (A1–A6), and proves that the class of metrics satisfying all six is precisely the class of distortion risk metrics (equivalently spectral risk measures), each of which is a weighted average of Conditional Value at Risk (CVaR) over confidence levels. It further argues that for sequential (multi-period) decision making, risk metrics must additionally be time-consistent, which forces a nested/compounded structure of one-step metrics rather than a single metric applied to the cumulative cost.

Key Claims

The widely used mean-variance metric $E [Z] + β Var [Z]$ violates monotonicity (A1): a worked 4-outcome example shows it strictly prefers a controller $π^{'}$ whose cost dominates $π$ in every outcome.
Value at Risk (VaR) violates subadditivity (A4) and is blind to tail magnitude: in a 3-outcome example with a $10^{10}$ tail cost, $VaR_{0.3} (Z^{'}) = 1.99 < VaR_{0.3} (Z) = 2$, so VaR prefers the catastrophic-tail option $Z^{'}$; CVaR correctly prefers $Z$.
Representation theorem (A1–A4, coherent): every coherent risk metric is $ρ (Z) = max_{p \in P} E_{p} [Z]$ for a compact convex risk envelope $P$ of probability mass functions — i.e. coherence is equivalent to distributional robustness.
Representation theorem (A1–A6, distortion): $ρ$ satisfies A1–A6 iff $ρ (Z) = \int_{0}^{1} CVaR_{α} (Z) ν (d α)$ for some probability measure $ν$ on $[0, 1]$. CVaR, expected cost, and worst-case are all special cases; A1–A6 thus span the full spectrum from risk-neutral to worst-case.
Chance constraints are equivalent to a VaR constraint ($VaR_{α} (Z) \leq 0 \Leftrightarrow P [Z > 0] \leq α$) and capture only boolean events; they are insensitive to tail magnitude above the threshold.
Time consistency: applying a static risk metric to the cumulative cost is generally not time-consistent (scenario-tree counterexample with $α = 2/3$). Time-consistent dynamic metrics must be built by compounding one-step metrics; compounding distortion metrics additionally yields the local property.
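The monotonicity failure of mean-variance is easy to reproduce on a small instance. The sketch below uses hypothetical numbers (uniform probabilities, `beta = 1`, and the two cost vectors are all assumptions chosen for illustration; they are not the paper's 4-outcome example).

```python
# Hypothetical 4-outcome instance (NOT the paper's numbers): uniform probabilities.
probs = [0.25, 0.25, 0.25, 0.25]
z_pi = [0.0, 0.0, 0.0, -4.0]       # cost of controller pi in each outcome
z_pi_prime = [0.0, 0.0, 0.0, 0.0]  # cost of pi': never lower than the cost of pi


def mean_variance(values, probs, beta):
    """Mean-variance metric E[Z] + beta * Var[Z] on a finite outcome space."""
    mean = sum(p * z for z, p in zip(values, probs))
    variance = sum(p * (z - mean) ** 2 for z, p in zip(values, probs))
    return mean + beta * variance


beta = 1.0  # assumed risk-aversion weight

# pi is at least as good as pi' in every single outcome ...
pi_never_worse = all(a <= b for a, b in zip(z_pi, z_pi_prime))

# ... yet mean-variance assigns pi the larger risk, because the occasional
# *lower* cost of pi inflates its variance.
mv_pi = mean_variance(z_pi, probs, beta)              # -1 + 3 = 2
mv_pi_prime = mean_variance(z_pi_prime, probs, beta)  #  0 + 0 = 0

print(pi_never_worse, mv_pi, mv_pi_prime)
```

Any metric satisfying A1 would have to rank `z_pi` no worse than `z_pi_prime` here.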

Method

The setup is decision-theoretic, not dynamical. Outcomes $ω \in Ω$ ($Ω$ finite) carry a probability mass function $P$; a cost is a random variable $Z : Ω \to R$, and a risk metric is a map $ρ : \mathcal{Z} \to R$, where $\mathcal{Z}$ denotes the set of all such cost random variables. Costs are assumed to be expressed in monetary terms (the “Robot Certification Agency” deposit interpretation: $ρ (Z)$ is the smallest amount that makes the task risk-free, since A2 with $c = - ρ (Z)$ gives $ρ (Z - ρ (Z)) = 0$).

The six axioms:

A1 Monotonicity: $Z (ω) \leq Z^{'} (ω) \ \forall ω \Rightarrow ρ (Z) \leq ρ (Z^{'})$.
A2 Translation invariance: $ρ (Z + c) = ρ (Z) + c$, $c \in R$.
A3 Positive homogeneity: $ρ (β Z) = β ρ (Z)$, $β \geq 0$.
A4 Subadditivity: $ρ (Z + Z^{'}) \leq ρ (Z) + ρ (Z^{'})$ (encourages diversification/redundancy).
A5 Comonotone additivity: if $Z, Z^{'}$ are comonotone, i.e. $(Z (ω_{1}) - Z (ω_{2})) (Z^{'} (ω_{1}) - Z^{'} (ω_{2})) \geq 0$ for all $ω_{1}, ω_{2} \in Ω$, then $ρ (Z + Z^{'}) = ρ (Z) + ρ (Z^{'})$.
A6 Law invariance: $Z, Z^{'}$ identically distributed $\Rightarrow ρ (Z) = ρ (Z^{'})$.
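As a sanity check, the axioms can be spot-checked numerically for CVaR on a tiny outcome space. This is only a sketch on one hypothetical instance (uniform probabilities, `alpha = 0.5`, and the vectors `Z`, `Z_up`, `Zp`, `W` are assumptions); passing it illustrates the axioms but proves nothing in general.

```python
import math


def cvar(values, probs, alpha):
    """CVaR_alpha on a finite outcome space: mean of the worst alpha-mass of costs."""
    remaining, total = alpha, 0.0
    for z, p in sorted(zip(values, probs), reverse=True):  # highest cost first
        take = min(p, remaining)
        total += take * z
        remaining -= take
        if remaining <= 0.0:
            break
    return total / alpha


probs = [0.25] * 4            # hypothetical uniform P on four outcomes
alpha = 0.5                   # assumed tail level
Z = [1.0, 2.0, 3.0, 4.0]
Z_up = [1.0, 3.0, 3.0, 6.0]   # pointwise >= Z
Zp = [4.0, 1.0, 2.0, 0.0]     # not comonotone with Z
W = [0.0, 1.0, 5.0, 9.0]      # comonotone with Z (same ordering of outcomes)


def rho(values):
    return cvar(values, probs, alpha)


def add(x, y):
    return [a + b for a, b in zip(x, y)]


checks = {
    "A1 monotonicity": rho(Z) <= rho(Z_up),
    "A2 translation invariance": math.isclose(rho([z + 10.0 for z in Z]), rho(Z) + 10.0),
    "A3 positive homogeneity": math.isclose(rho([2.0 * z for z in Z]), 2.0 * rho(Z)),
    "A4 subadditivity": rho(add(Z, Zp)) <= rho(Z) + rho(Zp),
    "A5 comonotone additivity": math.isclose(rho(add(Z, W)), rho(Z) + rho(W)),
    # Under uniform P, permuting outcomes leaves the distribution unchanged.
    "A6 law invariance": math.isclose(rho(list(reversed(Z))), rho(Z)),
}

for name, ok in checks.items():
    print(name, ok)
```

Note that A4 holds with strict inequality for the non-comonotone pair (`Z`, `Zp`), while A5 gives equality for the comonotone pair (`Z`, `W`).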

A3+A4 imply convexity. A1–A4 = coherent; A1–A5 = comonotonic (Choquet-integral representation w.r.t. a submodular, monotone, normalized set function $g$); A1–A6 = distortion.
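The convexity claim follows in one line: for any $λ \in [0, 1]$, apply A4 to the sum and then A3 to each term (both $λ$ and $1 - λ$ are nonnegative):

$ρ (λ Z + (1 - λ) Z^{'}) \leq ρ (λ Z) + ρ ((1 - λ) Z^{'}) = λ ρ (Z) + (1 - λ) ρ (Z^{'}) .$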

This paper is regime-agnostic: it concerns the abstract risk layer, not manipulator dynamics, so the free-flying vs free-floating distinction does not arise in the model. It is relevant to whatever cost random variable a planner emits for a free-flying space manipulator.

Relevance to thesis

This is a foundational justification document for the risk layer of the thesis. It supplies the rigorous argument for why CVaR (and more generally distortion metrics) should be preferred over mean-variance or chance constraints when making a free-flying manipulator risk-aware: mean-variance can pick a dominated trajectory, and chance constraints ignore the magnitude of catastrophic tails (e.g. a high-energy collision with the inspected target). For sequential guidance/planning it warns that naively applying CVaR to cumulative path cost is time-inconsistent; the time-consistent construction requires nested one-step risk metrics, which has direct implications for how a risk-aware MDP/optimal-control formulation for the manipulator should be posed.

Connections

Topics: conditional_value_at_risk, coherent_risk_metrics, chance_constraints, time_consistency

Key Equations / Quotes

“would a passenger riding in an autonomous car be happy to do so if she was told that the average behavior of the car is not to crash?” (Sec. 1)

VaR (Eq. for VaR):
$VaR_{α} (Z) := \min \{ z \mid P [Z > z] \leq α \} .$

CVaR:
$CVaR_{α} (Z) := \frac{1}{α} \int_{1 - α}^{1} VaR_{1 - τ} (Z) d τ .$
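On a finite outcome space both definitions reduce to a few lines of code. The sketch below uses a hypothetical 3-outcome instance whose probabilities and costs are assumptions, chosen only so that the VaR values match the $1.99$ and $2$ quoted in Key Claims; it is not the paper's own example. The threshold `2.0` in the chance-constraint check is likewise an assumption.

```python
def var(values, probs, alpha, tol=1e-12):
    """VaR_alpha(Z) = min{z : P[Z > z] <= alpha} on a finite outcome space."""
    for z in sorted(set(values)):
        tail = sum(p for v, p in zip(values, probs) if v > z)  # P[Z > z]
        if tail <= alpha + tol:
            return z


def cvar(values, probs, alpha):
    """CVaR_alpha(Z): average of the upper alpha-mass of the cost distribution."""
    remaining, total = alpha, 0.0
    for z, p in sorted(zip(values, probs), reverse=True):  # highest cost first
        take = min(p, remaining)
        total += take * z
        remaining -= take
        if remaining <= 0.0:
            break
    return total / alpha


# Hypothetical instance (assumed numbers, not taken from the paper).
probs = [0.5, 0.3, 0.2]
Z = [1.0, 2.0, 3.0]
Zp = [1.0, 1.99, 1e10]   # slightly cheaper middle outcome, catastrophic tail
alpha = 0.3

var_Z, var_Zp = var(Z, probs, alpha), var(Zp, probs, alpha)
cvar_Z, cvar_Zp = cvar(Z, probs, alpha), cvar(Zp, probs, alpha)

# Chance constraint P[cost > 2] <= alpha: both options pass, tail size is invisible.
chance_ok = [sum(p for v, p in zip(x, probs) if v > 2.0) <= alpha for x in (Z, Zp)]

print("VaR :", var_Z, var_Zp)     # VaR ranks Zp as the less risky option
print("CVaR:", cvar_Z, cvar_Zp)   # CVaR ranks Z as the less risky option
print("chance constraint satisfied:", chance_ok)
```

VaR only looks at where the tail starts, so replacing the worst outcome by $10^{10}$ does not change it; CVaR averages over the tail and therefore reacts to it.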

Coherent representation:
$ρ (Z) = max_{p \in P} E_{p} [Z] .$
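To make the envelope concrete, CVaR can be computed as a worst-case expectation. The sketch below relies on a standard fact that is not stated in this note and is used here as an assumption: the CVaR envelope is $\{ q : 0 \leq q (ω) \leq P (ω) / α, \sum_{ω} q (ω) = 1 \}$. For uniform $P$ on $n$ outcomes and $α = k / n$, its extreme points are the uniform distributions on $k$-element subsets, so the maximum can be found by enumeration. The cost vector and the choice `k = 2` are hypothetical.

```python
from itertools import combinations

Z = [1.0, 2.0, 3.0, 4.0]   # hypothetical costs under uniform P on n = 4 outcomes
n, k = len(Z), 2           # assumed tail level alpha = k / n = 0.5

# Extreme points of {q : 0 <= q_i <= 1/k, sum(q) = 1}: uniform pmfs on k-subsets.
envelope_vertices = [
    [1.0 / k if i in subset else 0.0 for i in range(n)]
    for subset in combinations(range(n), k)
]


def expectation(q, values):
    """E_q[Z] for a probability mass function q."""
    return sum(qi * zi for qi, zi in zip(q, values))


# A linear objective over a polytope attains its maximum at a vertex.
rho_max = max(expectation(q, Z) for q in envelope_vertices)

# Direct computation: mean of the k worst (largest) costs.
tail_mean = sum(sorted(Z)[-k:]) / k

print(rho_max, tail_mean)
```

Read this way, CVaR is the expected cost under an adversary who may reweight the nominal probabilities by a factor of at most $1 / α$.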

Distortion representation (Eq. DRMs):
$ρ (Z) = \int_{0}^{1} CVaR_{α} (Z) ν (d α), \quad \int_{0}^{1} ν (d α) = 1.$
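When $ν$ is supported on finitely many levels, the integral becomes a weighted sum of CVaR values. A minimal sketch, assuming a discrete $ν$ passed as a hypothetical `{alpha: weight}` dictionary and reading $CVaR_{0}$ as the worst-case cost (its limit as $α \to 0$); the cost vector is illustrative.

```python
def cvar(values, probs, alpha):
    """CVaR_alpha on a finite outcome space; alpha = 0 is read as the worst case."""
    if alpha == 0.0:
        return max(z for z, p in zip(values, probs) if p > 0.0)
    remaining, total = alpha, 0.0
    for z, p in sorted(zip(values, probs), reverse=True):  # highest cost first
        take = min(p, remaining)
        total += take * z
        remaining -= take
        if remaining <= 0.0:
            break
    return total / alpha


def distortion_risk(values, probs, nu):
    """Weighted average of CVaR over levels; nu maps alpha -> weight, weights sum to 1."""
    assert abs(sum(nu.values()) - 1.0) < 1e-9, "nu must be a probability measure"
    return sum(w * cvar(values, probs, a) for a, w in nu.items())


probs = [0.25] * 4           # hypothetical uniform P
Z = [1.0, 2.0, 3.0, 4.0]     # hypothetical costs

expected_cost = distortion_risk(Z, probs, {1.0: 1.0})           # nu = point mass at alpha = 1
worst_case = distortion_risk(Z, probs, {0.0: 1.0})              # nu = point mass at alpha = 0
blended = distortion_risk(Z, probs, {1.0: 0.5, 0.25: 0.5})      # half mean, half CVaR_0.25

print(expected_cost, blended, worst_case)
```

Shifting the mass of `nu` toward small `alpha` moves the metric from risk-neutral toward worst-case.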

Time-consistent compounding (Eq. composition):
$ρ_{k, N} = Z_{k} + ρ_{k} (Z_{k + 1} + ρ_{k + 1} (Z_{k + 2} + \dots + ρ_{N - 1} (Z_{N}) \dots)) .$
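For $N = 2$ the composition reads $ρ_{0, 2} = Z_{0} + ρ_{0} (Z_{1} + ρ_{1} (Z_{2}))$. The sketch below evaluates it on a hypothetical two-stage scenario tree with CVaR as every one-step metric, and compares it with CVaR applied once to the total cost. The tree, its costs, and `alpha = 0.5` are assumptions; this is not the paper's counterexample and only shows that the two constructions are different objects.

```python
def cvar(values, probs, alpha):
    """CVaR_alpha on a finite outcome space: mean of the worst alpha-mass of costs."""
    remaining, total = alpha, 0.0
    for z, p in sorted(zip(values, probs), reverse=True):  # highest cost first
        take = min(p, remaining)
        total += take * z
        remaining -= take
        if remaining <= 0.0:
            break
    return total / alpha


alpha = 0.5  # assumed tail level for every one-step metric

# Hypothetical tree: root -> branch A or B (stage 1), then one of two leaves (stage 2).
branch_prob = {"A": 0.5, "B": 0.5}
z0 = 0.0                                  # stage-0 cost (deterministic)
z1 = {"A": 1.0, "B": 2.0}                 # stage-1 cost per branch
z2 = {"A": ([0.0, 10.0], [0.5, 0.5]),     # stage-2 (costs, conditional probs)
      "B": ([4.0, 6.0], [0.5, 0.5])}

# Compounded metric: evaluate the inner one-step risk per branch, then the outer one.
inner = {b: z1[b] + cvar(*z2[b], alpha) for b in branch_prob}
nested = z0 + cvar(list(inner.values()), list(branch_prob.values()), alpha)

# Static metric: CVaR applied once to the total cost over the four leaves.
totals = [z0 + z1[b] + z for b in branch_prob for z in z2[b][0]]
joint = [branch_prob[b] * p for b in branch_prob for p in z2[b][1]]
static = cvar(totals, joint, alpha)

print("nested:", nested, "static:", static)
```

On this instance the nested value exceeds the static one, because the inner metric already takes the worst half of each branch before the outer metric takes the worst half of the branches.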

“simply applying a risk metric to the sum of all costs incurred at each time-step does not generally lead to time consistency.” (Sec. 5)

Open Questions

Should subadditivity (A4) and comonotone additivity (A5) be abandoned or replaced for low-level control tasks (e.g. trajectory tracking) where the cost components cannot be performed/diversified independently? (Discussion 1)
How to resolve the tension between time consistency and interpretability, since the nested composition is harder to interpret than $ρ (\sum_{k} Z_{k})$? (Discussion 2)
Which particular distortion metric to choose for a given application — possibly by learning humans’ risk preferences via risk-sensitive inverse RL? (Discussion 4)
What additional, possibly domain-specific, axioms should be imposed? (Discussion 3)
