Scratch-pad companion to derivation_7dof.md §4 and
dynamics_modifications_7dof.md §3.3, §5. The point of this
note: tighten the story so that the inertia-weighted covector \(\boldsymbol z_a\) stops looking like a
lucky choice and starts looking like the only answer — and so that M4
(the reconstruction section) and M5 (the covector) are revealed as two
faces of one variational principle. Everything here is elementary linear
algebra; no SVD, no pseudoinverse.
Section 4 of derivation_7dof.md introduces the
self-motion covector as a choice:
“The normalization fixes one degree of freedom of \(\boldsymbol z_a \in \mathbb R^{13}\), leaving a twelve-parameter family of admissible covectors. Among them, take the inertia-weighted choice \(\boldsymbol z_a^T = \hat{\boldsymbol k}^T\boldsymbol M/(\hat{\boldsymbol k}^T\boldsymbol M\hat{\boldsymbol k})\).”
A referee reads “among them, take” and asks the obvious question: why that one? The honest answer is better than “it has nice properties.” The honest answer is that the two properties we actually need already determine \(\boldsymbol z_a\) uniquely. There is no family to choose from once the requirements are stated. Let us see that.
We want a covector (a row vector, a linear functional on the generalized-velocity space) \(\boldsymbol z_a^T : \mathbb R^{13} \to \mathbb R\), reading \(v_n = \boldsymbol z_a^T\boldsymbol x\), that does exactly two jobs.
Requirement (i) — normalization. It reads unity on a unit of self-motion: \[ \boldsymbol z_a^T\hat{\boldsymbol k} = 1 . \] This just sets the scale of the coordinate \(v_n\), so that \(\boldsymbol x = \hat{\boldsymbol k}\) registers as \(v_n = 1\) rather than as some arbitrary number.
Requirement (ii) — dynamic consistency. The set the
controller reconstructs on, \(\{v_n = 0\} =
\ker \boldsymbol z_a^T\), is the \(\boldsymbol M\)-orthogonal complement of
the self-motion: \[
\ker \boldsymbol z_a^T \;=\; \{\,\boldsymbol x \in \mathbb R^{13} :
\hat{\boldsymbol k}^T\boldsymbol M\boldsymbol x = 0\,\} .
\] This is what “dynamically consistent” means in
Khatib’s sense: the task motion carries no kinetic-energy cross-term
with the self-motion (we read off the block-diagonal \(\hat{\boldsymbol M}\) from exactly this in
derivation_7dof.md §4c). Equivalently, \(\{v_n=0\}\) is the horizontal subspace of
the mechanical connection.
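A short expansion makes the cross-term statement concrete; it uses only the symmetry of \(\boldsymbol M\). Split a generalized velocity as \(\boldsymbol x = \boldsymbol h + v_n\hat{\boldsymbol k}\) with \(\boldsymbol h \in \ker\boldsymbol z_a^T\) (by requirement (i), \(\boldsymbol z_a^T\boldsymbol x = v_n\), so the notation is consistent). Then \[ \tfrac12\boldsymbol x^T\boldsymbol M\boldsymbol x \;=\; \tfrac12\boldsymbol h^T\boldsymbol M\boldsymbol h \;+\; v_n\,\hat{\boldsymbol k}^T\boldsymbol M\boldsymbol h \;+\; \tfrac12 v_n^2\,\hat{\boldsymbol k}^T\boldsymbol M\hat{\boldsymbol k} . \] The middle term vanishes for every \(\boldsymbol h \in \ker\boldsymbol z_a^T\) exactly when \(\ker\boldsymbol z_a^T \subseteq \{\boldsymbol x : \hat{\boldsymbol k}^T\boldsymbol M\boldsymbol x = 0\}\). Both sets are hyperplanes in \(\mathbb R^{13}\), so the inclusion is an equality, which is requirement (ii).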
That is the whole specification. Two requirements. Now watch them collapse to a formula.
Proposition. On \(\Omega\), the unique covector satisfying (i) and (ii) is \[ \boxed{\;\boldsymbol z_a \;=\; \frac{\boldsymbol M\hat{\boldsymbol k}}{\hat{\boldsymbol k}^T\boldsymbol M\hat{\boldsymbol k}}\;} \]
Scratch work. Requirement (ii) is a statement about kernels. Notice that the right-hand side \(\{\boldsymbol x : \hat{\boldsymbol k}^T\boldsymbol M\boldsymbol x = 0\}\) is itself the kernel of a covector — namely the row vector \(\hat{\boldsymbol k}^T\boldsymbol M\). So (ii) says: the covector \(\boldsymbol z_a^T\) and the covector \(\hat{\boldsymbol k}^T\boldsymbol M\) have the same kernel. And two linear functionals with the same kernel can only differ by a scalar. That is the entire idea; the rest is bookkeeping.
Proof. First, the covector \(\hat{\boldsymbol k}^T\boldsymbol M\) is nonzero: since \(\boldsymbol M\) is symmetric positive definite and \(\hat{\boldsymbol k} \neq \boldsymbol 0\), the vector \(\boldsymbol M\hat{\boldsymbol k}\) is nonzero, so its transpose is a nonzero row. Its kernel is therefore a hyperplane (dimension \(12\) in \(\mathbb R^{13}\)). By requirement (ii) the covector \(\boldsymbol z_a^T\) has that same hyperplane as its kernel.
Now invoke the elementary fact that two nonzero linear functionals on a vector space with the same kernel are proportional. (If \(\ker\phi = \ker\psi\) is a hyperplane \(H\), pick any \(\boldsymbol w \notin H\), so that \(\phi(\boldsymbol w)\) and \(\psi(\boldsymbol w)\) are both nonzero; every \(\boldsymbol x\) decomposes as \(\boldsymbol x = t\boldsymbol w + \boldsymbol h\) with \(\boldsymbol h \in H\), so \(\phi(\boldsymbol x) = t\,\phi(\boldsymbol w)\) and \(\psi(\boldsymbol x) = t\,\psi(\boldsymbol w)\); hence \(\phi = [\phi(\boldsymbol w)/\psi(\boldsymbol w)]\,\psi\).) Applying it here, there is a nonzero scalar \(c\) with \[ \boldsymbol z_a^T \;=\; c\,\hat{\boldsymbol k}^T\boldsymbol M , \qquad\text{equivalently}\qquad \boldsymbol z_a \;=\; c\,\boldsymbol M\hat{\boldsymbol k} \] (using \(\boldsymbol M = \boldsymbol M^T\)). This consumes requirement (ii) completely; the inertia weighting is not an input, it is forced.
It remains to fix \(c\), which is what requirement (i) is for. Substituting, \[ 1 \;=\; \boldsymbol z_a^T\hat{\boldsymbol k} \;=\; c\,\hat{\boldsymbol k}^T\boldsymbol M\hat{\boldsymbol k} , \qquad\text{so}\qquad c \;=\; \frac{1}{\hat{\boldsymbol k}^T\boldsymbol M\hat{\boldsymbol k}} , \] where the denominator is strictly positive on \(\Omega\) because \(\boldsymbol M\) is positive definite and \(\hat{\boldsymbol k} \neq \boldsymbol 0\) — so \(c\) is well defined everywhere we operate. Therefore \(\boldsymbol z_a = \boldsymbol M\hat{\boldsymbol k}/(\hat{\boldsymbol k}^T\boldsymbol M\hat{\boldsymbol k})\), and it is unique. \(\blacksquare\)
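A toy case shows the two requirements at work. The numbers below are made up for illustration and live in \(\mathbb R^2\), not in the 13-dimensional system. Take \[ \boldsymbol M = \begin{pmatrix} 2 & 1 \\ 1 & 1 \end{pmatrix}, \qquad \hat{\boldsymbol k} = \begin{pmatrix} 1 \\ 0 \end{pmatrix}, \qquad\text{so}\qquad \boldsymbol M\hat{\boldsymbol k} = \begin{pmatrix} 2 \\ 1 \end{pmatrix}, \quad \hat{\boldsymbol k}^T\boldsymbol M\hat{\boldsymbol k} = 2, \quad \boldsymbol z_a = \begin{pmatrix} 1 \\ \tfrac12 \end{pmatrix} . \] Requirement (i) holds: \(\boldsymbol z_a^T\hat{\boldsymbol k} = 1\). The \(\boldsymbol M\)-orthogonal complement of \(\hat{\boldsymbol k}\) is the line \(2x_1 + x_2 = 0\), spanned by \((1,-2)^T\), and \(\boldsymbol z_a^T(1,-2)^T = 1 - 1 = 0\), so requirement (ii) holds as well. By contrast, the Euclidean candidate \(\hat{\boldsymbol k}^T = (1, 0)\) also satisfies (i), but its kernel is spanned by \((0,1)^T\), and \(\hat{\boldsymbol k}^T\boldsymbol M(0,1)^T = 1 \neq 0\), so it fails (ii).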
The moral. There was never a twelve-parameter family to choose from. Requirement (ii) pins the direction of \(\boldsymbol z_a\) (it must be \(\boldsymbol M\hat{\boldsymbol k}\)), and requirement (i) pins its length. The phrase “we take the inertia-weighted choice” should become “the two requirements force the inertia weighting.” This is strictly stronger and removes the only soft spot in §4.
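The uniqueness claim can also be checked numerically. The sketch below is only an illustration under stated assumptions: `M` is a random symmetric positive-definite matrix and `k_hat` a random unit vector, both hypothetical stand-ins rather than the real 7-DOF model. It writes requirements (i) and (ii) as thirteen linear equations and compares the solution with the closed form.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 13  # dimension of the generalized-velocity space

# Hypothetical stand-ins (not the real robot model): a random symmetric
# positive-definite "inertia" matrix M and a random unit self-motion k_hat.
A = rng.standard_normal((n, n))
M = A @ A.T + n * np.eye(n)
k_hat = rng.standard_normal(n)
k_hat /= np.linalg.norm(k_hat)

# Closed form from the Proposition.
z_a = M @ k_hat / (k_hat @ M @ k_hat)

# Twelve vectors in the M-orthogonal complement of k_hat, obtained by
# stripping the M-projection onto k_hat from random vectors.
X = rng.standard_normal((n, n - 1))
H = X - np.outer(k_hat, (k_hat @ M @ X) / (k_hat @ M @ k_hat))

# Requirements (i) and (ii) as 13 linear equations B^T z = e_1, where
# B = [k_hat | H]: z reads 1 on k_hat and 0 on each column of H.
# The solution is unique whenever B is invertible.
B = np.column_stack([k_hat, H])
e1 = np.zeros(n)
e1[0] = 1.0
z_solved = np.linalg.solve(B.T, e1)

print("rank of B:", np.linalg.matrix_rank(B))
print("max |k_hat^T M H|:", np.max(np.abs(k_hat @ M @ H)))
print("max |z_solved - z_a|:", np.max(np.abs(z_solved - z_a)))
```

If `B` has full rank, the linear system has exactly one solution, so agreement between `z_solved` and `z_a` up to round-off is the numerical counterpart of "there is no family to choose from."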
Here is the part worth putting in the thesis, because it unifies two modifications that the write-ups currently treat as separate. The reconstruction problem (M4) is: given a task velocity \(\boldsymbol y \in \mathbb R^{12}\), recover a generalized velocity \(\boldsymbol x \in \mathbb R^{13}\) with \(\boldsymbol\Gamma\boldsymbol x = \boldsymbol y\). The map is wide, so “recover” needs a selection principle. Use the physical one: among all consistent \(\boldsymbol x\), take the one of least kinetic energy.
Claim. The minimum-kinetic-energy reconstruction is exactly the \(\{v_n = 0\}\) section.
Proof. We solve the constrained optimization \[ \min_{\boldsymbol x}\ \tfrac12\boldsymbol x^T\boldsymbol M\boldsymbol x \quad\text{subject to}\quad \boldsymbol\Gamma\boldsymbol x = \boldsymbol y . \] Form the Lagrangian \(\mathcal L = \tfrac12\boldsymbol x^T\boldsymbol M\boldsymbol x - \boldsymbol\lambda^T(\boldsymbol\Gamma\boldsymbol x - \boldsymbol y)\) with multiplier \(\boldsymbol\lambda \in \mathbb R^{12}\). The stationarity condition is \(\partial\mathcal L/\partial\boldsymbol x = \boldsymbol M\boldsymbol x - \boldsymbol\Gamma^T\boldsymbol\lambda = \boldsymbol 0\), so the optimizer has the form \[ \boldsymbol x^\star \;=\; \boldsymbol M^{-1}\boldsymbol\Gamma^T\boldsymbol\lambda . \] Imposing the constraint gives \(\boldsymbol\Gamma\boldsymbol M^{-1}\boldsymbol\Gamma^T\boldsymbol\lambda = \boldsymbol y\); the matrix \(\boldsymbol\Gamma\boldsymbol M^{-1}\boldsymbol\Gamma^T\) is invertible on \(\Omega\) (it is \(12\times 12\), and positive definite since \(\boldsymbol\Gamma\) has full row rank and \(\boldsymbol M^{-1}\) is positive definite), so \(\boldsymbol\lambda = (\boldsymbol\Gamma\boldsymbol M^{-1}\boldsymbol\Gamma^T)^{-1}\boldsymbol y\) and \[ \boldsymbol x^\star \;=\; \underbrace{\boldsymbol M^{-1}\boldsymbol\Gamma^T(\boldsymbol\Gamma\boldsymbol M^{-1}\boldsymbol\Gamma^T)^{-1}}_{\textstyle \bar{\boldsymbol\Gamma}}\,\boldsymbol y . \] This \(\bar{\boldsymbol\Gamma}\) is the \(\boldsymbol M\)-weighted (dynamically consistent) generalized inverse. Now read off the self-motion content of \(\boldsymbol x^\star\). 
Because \(\boldsymbol x^\star = \boldsymbol M^{-1}\boldsymbol\Gamma^T\boldsymbol\lambda\), \[ v_n \;=\; \boldsymbol z_a^T\boldsymbol x^\star \;=\; \frac{\hat{\boldsymbol k}^T\boldsymbol M}{\hat{\boldsymbol k}^T\boldsymbol M\hat{\boldsymbol k}}\,\boldsymbol M^{-1}\boldsymbol\Gamma^T\boldsymbol\lambda \;=\; \frac{\hat{\boldsymbol k}^T\boldsymbol\Gamma^T\boldsymbol\lambda}{\hat{\boldsymbol k}^T\boldsymbol M\hat{\boldsymbol k}} \;=\; \frac{(\boldsymbol\Gamma\hat{\boldsymbol k})^T\boldsymbol\lambda}{\hat{\boldsymbol k}^T\boldsymbol M\hat{\boldsymbol k}} \;=\; 0 , \] where \(\boldsymbol M\) and \(\boldsymbol M^{-1}\) cancel and the last step uses \(\boldsymbol\Gamma\hat{\boldsymbol k} = \boldsymbol 0\) (Proposition 1). So the least-kinetic-energy reconstruction automatically satisfies \(v_n = 0\): it is the section of §5.2. \(\blacksquare\)
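The Claim is easy to probe numerically. The following is a minimal sketch, assuming random stand-ins for every quantity: `M`, `k_hat`, `Gamma`, and `y` are hypothetical and not taken from the actual model. `Gamma` is built so that `Gamma @ k_hat = 0` and, generically, has full row rank. The weighted inverse is applied through linear solves rather than explicit matrix inverses.

```python
import numpy as np

rng = np.random.default_rng(1)
n, m = 13, 12  # generalized-velocity and task dimensions

# Hypothetical stand-ins: SPD inertia matrix, unit self-motion direction,
# and a random 12x13 task map with Gamma @ k_hat = 0 by construction.
A = rng.standard_normal((n, n))
M = A @ A.T + n * np.eye(n)
k_hat = rng.standard_normal(n)
k_hat /= np.linalg.norm(k_hat)
Gamma = rng.standard_normal((m, n)) @ (np.eye(n) - np.outer(k_hat, k_hat))
y = rng.standard_normal(m)

# x* = M^{-1} Gamma^T (Gamma M^{-1} Gamma^T)^{-1} y, via two linear solves.
Minv_Gt = np.linalg.solve(M, Gamma.T)      # M^{-1} Gamma^T, shape (13, 12)
lam = np.linalg.solve(Gamma @ Minv_Gt, y)  # Lagrange multiplier
x_star = Minv_Gt @ lam

# Self-motion content of the reconstruction, read by the covector z_a.
z_a = M @ k_hat / (k_hat @ M @ k_hat)
v_n = z_a @ x_star


def kinetic_energy(x):
    """Kinetic energy 0.5 * x^T M x in the metric M."""
    return 0.5 * x @ M @ x


# With Gamma of full row rank, every other solution of Gamma x = y is
# x_star + t * k_hat; each should cost strictly more kinetic energy.
extra = [kinetic_energy(x_star + t * k_hat) - kinetic_energy(x_star)
         for t in (-1.0, -0.1, 0.1, 1.0)]

print("constraint residual:", np.max(np.abs(Gamma @ x_star - y)))
print("v_n of x_star:", v_n)
print("extra kinetic energy along k_hat:", extra)
```

Because the cross-term with `k_hat` vanishes at `x_star`, the extra energy grows as \(\tfrac12 t^2\,\hat{\boldsymbol k}^T\boldsymbol M\hat{\boldsymbol k}\), which is why every entry of `extra` should come out positive.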
Why lstsq injected the ghost, in one line. The production lstsq/min-norm path solves the same constrained problem but with the Euclidean objective \(\tfrac12\boldsymbol x^T\boldsymbol x\) in place of \(\tfrac12\boldsymbol x^T\boldsymbol M\boldsymbol x\). Repeating the computation with \(\boldsymbol M \to \boldsymbol E\) (the identity matrix) gives \(\boldsymbol x_E = \boldsymbol\Gamma^T(\boldsymbol\Gamma\boldsymbol\Gamma^T)^{-1}\boldsymbol y\), and the orthogonality it enforces is \(\hat{\boldsymbol k}^T\boldsymbol x_E = 0\): Euclidean-orthogonal to \(\hat{\boldsymbol k}\), not \(\boldsymbol M\)-orthogonal. But \(v_n = \hat{\boldsymbol k}^T\boldsymbol M\boldsymbol x_E/(\hat{\boldsymbol k}^T\boldsymbol M\hat{\boldsymbol k})\), and Euclidean orthogonality does not imply \(\hat{\boldsymbol k}^T\boldsymbol M\boldsymbol x_E = 0\). The two metrics disagree by exactly the inertia weighting, and that disagreement is the non-decaying \(v_n\) we measured (mean \(|v_n| = 0.157\)). The fix was never a damping hack; it was using the right inner product.
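The metric mismatch can be reproduced in a few lines. This is a sketch under assumptions, not the production path: `M`, `k_hat`, `Gamma`, and `y` are random hypothetical stand-ins, so the printed \(v_n\) will not match the measured 0.157. It compares the Euclidean min-norm solution returned by `np.linalg.lstsq` with the \(\boldsymbol M\)-weighted reconstruction.

```python
import numpy as np

rng = np.random.default_rng(2)
n, m = 13, 12

# Hypothetical stand-ins: SPD inertia matrix, unit self-motion direction,
# and a random task map with Gamma @ k_hat = 0 by construction.
A = rng.standard_normal((n, n))
M = A @ A.T + n * np.eye(n)
k_hat = rng.standard_normal(n)
k_hat /= np.linalg.norm(k_hat)
Gamma = rng.standard_normal((m, n)) @ (np.eye(n) - np.outer(k_hat, k_hat))
y = rng.standard_normal(m)

z_a = M @ k_hat / (k_hat @ M @ k_hat)

# Euclidean min-norm reconstruction: for an underdetermined system,
# lstsq returns the solution of smallest Euclidean norm.
x_E, *_ = np.linalg.lstsq(Gamma, y, rcond=None)

# Minimum-kinetic-energy reconstruction in the metric M.
Minv_Gt = np.linalg.solve(M, Gamma.T)
x_M = Minv_Gt @ np.linalg.solve(Gamma @ Minv_Gt, y)


def kinetic_energy(x):
    """Kinetic energy 0.5 * x^T M x in the metric M."""
    return 0.5 * x @ M @ x


print("Euclidean: k_hat . x_E =", k_hat @ x_E)   # Euclidean-orthogonal
print("Euclidean: v_n(x_E)    =", z_a @ x_E)     # generally nonzero
print("Weighted:  v_n(x_M)    =", z_a @ x_M)     # zero up to round-off
print("KE(x_E) - KE(x_M)      =", kinetic_energy(x_E) - kinetic_energy(x_M))
```

Both vectors satisfy the task constraint; they differ only by a multiple of `k_hat`, and that multiple is exactly the \(v_n\) the Euclidean solution carries.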
The moral. M4 and M5 are one idea. The covector \(\boldsymbol z_a\) (M5) is the measurement of self-motion in the kinetic-energy metric; the reconstruction section (M4) is the selection of generalized velocity in the same metric. They share the matrix \(\boldsymbol M\) because they are the dual statements of one variational principle — minimum kinetic energy subject to the task constraint. Present them together and the whole 7-DOF story has a single spine.
The temptation is to attribute the inertia weighting to “Khatib eq. 18.” That is the wrong equation, and the looser attribution invites a referee to check. Here is the accurate account, from reading Khatib (1987) §VI–VII directly.
What is genuinely yours. Khatib’s \(\bar{\boldsymbol J}\) is a \(7\times 6\) matrix object he never computes — by his own summary (p. 52) the scheme “avoids the explicit evaluation of any generalized inverse or pseudo-inverse.” Two things are your own contribution:
If you agree with the above, the concrete changes are small and local:
derivation_7dof.md §4: replace “Among them, take the inertia-weighted choice …” with the two-requirement Proposition of §3 (it is shorter than the current three-property justification and strictly stronger: uniqueness, not preference). Keep §4(a)–(c) as the consequences of the now-forced \(\boldsymbol z_a\).
dynamics_modifications_7dof.md §3.3: the line “This inertia weighting is the same dynamically-consistent device as Giordano’s thesis equation (5.22)” is fine, but add the one-sentence variational characterization (§4 above) so M4 and M5 read as one principle; and correct the Khatib pointer to eqs. 51–52.
dynamics_modifications already says “the min-norm convention was itself the ghost injector”; strengthen it with the one-line metric argument from §4 (Euclidean vs \(\boldsymbol M\) orthogonality), which makes the claim a proof rather than an assertion.
None of this touches deriv_7dof.tex.