Doctoral Research · Space Robotics Inspection with a Free-Flying Space Manipulator
A Doctoral Research Journal Aerospace Engineering

Walkthrough: \(\boldsymbol z_a\) is forced, not chosen — and it is the same object as the reconstruction

Scratch-pad companion to derivation_7dof.md §4 and dynamics_modifications_7dof.md §3.3, §5. The point of this note: tighten the story so that the inertia-weighted covector \(\boldsymbol z_a\) stops looking like a lucky choice and starts looking like the only answer — and so that M4 (the reconstruction section) and M5 (the covector) are revealed as two faces of one variational principle. Everything here is elementary linear algebra; no SVD, no pseudoinverse.


1 · The question

Section 4 of derivation_7dof.md introduces the self-motion covector as a choice:

“The normalization fixes one degree of freedom of \(\boldsymbol z_a \in \mathbb R^{13}\), leaving a twelve-parameter family of admissible covectors. Among them, take the inertia-weighted choice \(\boldsymbol z_a^T = \hat{\boldsymbol k}^T\boldsymbol M/(\hat{\boldsymbol k}^T\boldsymbol M\hat{\boldsymbol k})\).”

A referee reads “among them, take” and asks the obvious question: why that one? The honest answer is better than “it has nice properties.” The honest answer is that the two properties we actually need already determine \(\boldsymbol z_a\) uniquely. There is no family to choose from once the requirements are stated. Let us see that.


2 · The two requirements

We want a covector (a row vector, a linear functional on the generalized-velocity space) \(\boldsymbol z_a^T : \mathbb R^{13} \to \mathbb R\), reading \(v_n = \boldsymbol z_a^T\boldsymbol x\), that does exactly two jobs.

Requirement (i) — normalization. It reads unity on a unit of self-motion: \[ \boldsymbol z_a^T\hat{\boldsymbol k} = 1 . \] This just sets the scale of the coordinate \(v_n\), so that \(\boldsymbol x = \hat{\boldsymbol k}\) registers as \(v_n = 1\) rather than as some arbitrary number.

Requirement (ii) — dynamic consistency. The set the controller reconstructs on, \(\{v_n = 0\} = \ker \boldsymbol z_a^T\), is the \(\boldsymbol M\)-orthogonal complement of the self-motion: \[ \ker \boldsymbol z_a^T \;=\; \{\,\boldsymbol x \in \mathbb R^{13} : \hat{\boldsymbol k}^T\boldsymbol M\boldsymbol x = 0\,\} . \] This is what “dynamically consistent” means in Khatib’s sense: the task motion carries no kinetic-energy cross-term with the self-motion (we read off the block-diagonal \(\hat{\boldsymbol M}\) from exactly this in derivation_7dof.md §4c). Equivalently, \(\{v_n=0\}\) is the horizontal subspace of the mechanical connection.
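A minimal numerical sketch of what "no kinetic-energy cross-term" buys, under loud assumptions: `M` below is a random symmetric positive definite stand-in for the inertia matrix, `k_hat` is a random unit vector standing in for the self-motion direction, and the helper names are hypothetical (none of this is the production code). It splits an arbitrary \(\boldsymbol x\) into an \(\boldsymbol M\)-orthogonal part plus a multiple of \(\hat{\boldsymbol k}\) and checks that the kinetic energy separates into two independent pieces.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 13

# Assumption: random SPD stand-in for the inertia matrix M.
A = rng.standard_normal((n, n))
M = A @ A.T + n * np.eye(n)

# Assumption: random unit vector standing in for the self-motion k_hat.
k_hat = rng.standard_normal(n)
k_hat /= np.linalg.norm(k_hat)


def kinetic_energy(M, x):
    """T(x) = 1/2 x^T M x."""
    return 0.5 * x @ M @ x


def split_m_orthogonal(M, k_hat, x):
    """Split x = h + v_n * k_hat with k_hat^T M h = 0."""
    Mk = M @ k_hat
    v_n = (Mk @ x) / (Mk @ k_hat)
    h = x - v_n * k_hat
    return h, v_n


x = rng.standard_normal(n)
h, v_n = split_m_orthogonal(M, k_hat, x)

cross_term = k_hat @ M @ h          # vanishes up to round-off
T_total = kinetic_energy(M, x)
T_split = kinetic_energy(M, h) + 0.5 * v_n**2 * (k_hat @ M @ k_hat)

print(f"cross term      : {cross_term:.2e}")
print(f"T(x) - T_split  : {T_total - T_split:.2e}")
```

The energy separates only because `h` is taken \(\boldsymbol M\)-orthogonal to `k_hat`; a Euclidean-orthogonal split would in general leave a nonzero cross-term.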

That is the whole specification. Two requirements. Now watch them collapse to a formula.


3 · The derivation (two lines)

Proposition. On \(\Omega\), the unique covector satisfying (i) and (ii) is \[ \boxed{\;\boldsymbol z_a \;=\; \frac{\boldsymbol M\hat{\boldsymbol k}}{\hat{\boldsymbol k}^T\boldsymbol M\hat{\boldsymbol k}}\;} \]

Scratch work. Requirement (ii) is a statement about kernels. Notice that the right-hand side \(\{\boldsymbol x : \hat{\boldsymbol k}^T\boldsymbol M\boldsymbol x = 0\}\) is itself the kernel of a covector — namely the row vector \(\hat{\boldsymbol k}^T\boldsymbol M\). So (ii) says: the covector \(\boldsymbol z_a^T\) and the covector \(\hat{\boldsymbol k}^T\boldsymbol M\) have the same kernel. And two linear functionals with the same kernel can only differ by a scalar. That is the entire idea; the rest is bookkeeping.

Proof. First, the covector \(\hat{\boldsymbol k}^T\boldsymbol M\) is nonzero: since \(\boldsymbol M\) is symmetric positive definite and \(\hat{\boldsymbol k} \neq \boldsymbol 0\), the vector \(\boldsymbol M\hat{\boldsymbol k}\) is nonzero, so its transpose is a nonzero row. Its kernel is therefore a hyperplane (dimension \(12\) in \(\mathbb R^{13}\)). By requirement (ii) the covector \(\boldsymbol z_a^T\) has that same hyperplane as its kernel.

Now invoke the elementary fact that two nonzero linear functionals on a vector space with the same kernel are proportional. (If \(\ker\phi = \ker\psi\) is a hyperplane \(H\), pick any \(\boldsymbol w \notin H\), so that \(\phi(\boldsymbol w)\) and \(\psi(\boldsymbol w)\) are both nonzero; every \(\boldsymbol x\) decomposes as \(\boldsymbol x = t\boldsymbol w + \boldsymbol h\) with \(\boldsymbol h \in H\), so \(\phi(\boldsymbol x) = t\,\phi(\boldsymbol w)\) and \(\psi(\boldsymbol x) = t\,\psi(\boldsymbol w)\); hence \(\phi = [\phi(\boldsymbol w)/\psi(\boldsymbol w)]\,\psi\).) Applying it here, there is a nonzero scalar \(c\) with \[ \boldsymbol z_a^T \;=\; c\,\hat{\boldsymbol k}^T\boldsymbol M , \qquad\text{equivalently}\qquad \boldsymbol z_a \;=\; c\,\boldsymbol M\hat{\boldsymbol k} \] (using \(\boldsymbol M = \boldsymbol M^T\)). This consumes requirement (ii) completely; the inertia weighting is not an input; it is forced.

It remains to fix \(c\), which is what requirement (i) is for. Substituting, \[ 1 \;=\; \boldsymbol z_a^T\hat{\boldsymbol k} \;=\; c\,\hat{\boldsymbol k}^T\boldsymbol M\hat{\boldsymbol k} , \qquad\text{so}\qquad c \;=\; \frac{1}{\hat{\boldsymbol k}^T\boldsymbol M\hat{\boldsymbol k}} , \] where the denominator is strictly positive on \(\Omega\) because \(\boldsymbol M\) is positive definite and \(\hat{\boldsymbol k} \neq \boldsymbol 0\) — so \(c\) is well defined everywhere we operate. Therefore \(\boldsymbol z_a = \boldsymbol M\hat{\boldsymbol k}/(\hat{\boldsymbol k}^T\boldsymbol M\hat{\boldsymbol k})\), and it is unique. \(\blacksquare\)

The moral. There was never a twelve-parameter family to choose from. Requirement (ii) pins the direction of \(\boldsymbol z_a\) (it must be \(\boldsymbol M\hat{\boldsymbol k}\)), and requirement (i) pins its length. The phrase “we take the inertia-weighted choice” should become “the two requirements force the inertia weighting.” This is strictly stronger and removes the only soft spot in §4.
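The uniqueness claim can be checked numerically by treating (i) and (ii) as a square linear system and solving it without ever writing the closed form. The sketch below is illustrative only: `M` and `k_hat` are random stand-ins (assumptions, not the 7-DOF model), and the kernel basis is obtained from a complete QR factorization of the column \(\boldsymbol M\hat{\boldsymbol k}\), which is just one convenient way to get twelve vectors spanning \(\{\boldsymbol x : \hat{\boldsymbol k}^T\boldsymbol M\boldsymbol x = 0\}\).

```python
import numpy as np

rng = np.random.default_rng(0)
n = 13

# Assumption: random SPD stand-in for the inertia matrix M.
A = rng.standard_normal((n, n))
M = A @ A.T + n * np.eye(n)

# Assumption: random unit vector standing in for the self-motion k_hat.
k_hat = rng.standard_normal(n)
k_hat /= np.linalg.norm(k_hat)


def self_motion_covector(M, k_hat):
    """Closed form from the Proposition: z_a = M k / (k^T M k)."""
    Mk = M @ k_hat
    return Mk / (k_hat @ Mk)


z_closed = self_motion_covector(M, k_hat)

# Basis of the hyperplane {x : k^T M x = 0}: in a complete QR of the
# column M k, the first column of Q is parallel to M k and the other
# twelve columns are orthogonal to it.
Q, _ = np.linalg.qr((M @ k_hat).reshape(-1, 1), mode="complete")
H = Q[:, 1:]                                  # 13 x 12 kernel basis

# Requirements as 13 equations in 13 unknowns:
#   z . k_hat = 1        (i)
#   z . h_j   = 0        (ii), one equation per kernel basis vector
B = np.column_stack([k_hat, H])               # 13 x 13, invertible
rhs = np.zeros(n)
rhs[0] = 1.0
z_solved = np.linalg.solve(B.T, rhs)

print("max |z_solved - z_closed| =", np.max(np.abs(z_solved - z_closed)))
```

The system matrix is invertible precisely because \(\hat{\boldsymbol k}\) is not in the hyperplane (its \(\boldsymbol M\)-norm is positive), which is the same fact the proof uses to fix \(c\).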


4 · The bonus: this is also your reconstruction (M4 = M5)

Here is the part worth putting in the thesis, because it unifies two modifications that the write-ups currently treat as separate. The reconstruction problem (M4) is: given a task velocity \(\boldsymbol y \in \mathbb R^{12}\), recover a generalized velocity \(\boldsymbol x \in \mathbb R^{13}\) with \(\boldsymbol\Gamma\boldsymbol x = \boldsymbol y\). The map \(\boldsymbol\Gamma\) is wide (\(12\times 13\), with the self-motion \(\hat{\boldsymbol k}\) spanning its kernel), so the solution is determined only up to a multiple of \(\hat{\boldsymbol k}\) and “recover” needs a selection principle. Use the physical one: among all consistent \(\boldsymbol x\), take the one of least kinetic energy.

Claim. The minimum-kinetic-energy reconstruction is exactly the \(\{v_n = 0\}\) section.

Proof. We solve the constrained optimization \[ \min_{\boldsymbol x}\ \tfrac12\boldsymbol x^T\boldsymbol M\boldsymbol x \quad\text{subject to}\quad \boldsymbol\Gamma\boldsymbol x = \boldsymbol y . \] Form the Lagrangian \(\mathcal L = \tfrac12\boldsymbol x^T\boldsymbol M\boldsymbol x - \boldsymbol\lambda^T(\boldsymbol\Gamma\boldsymbol x - \boldsymbol y)\) with multiplier \(\boldsymbol\lambda \in \mathbb R^{12}\). The stationarity condition is \(\partial\mathcal L/\partial\boldsymbol x = \boldsymbol M\boldsymbol x - \boldsymbol\Gamma^T\boldsymbol\lambda = \boldsymbol 0\), so the optimizer has the form \[ \boldsymbol x^\star \;=\; \boldsymbol M^{-1}\boldsymbol\Gamma^T\boldsymbol\lambda . \] Imposing the constraint gives \(\boldsymbol\Gamma\boldsymbol M^{-1}\boldsymbol\Gamma^T\boldsymbol\lambda = \boldsymbol y\); the matrix \(\boldsymbol\Gamma\boldsymbol M^{-1}\boldsymbol\Gamma^T\) is invertible on \(\Omega\) (it is \(12\times 12\), and positive definite since \(\boldsymbol\Gamma\) has full row rank and \(\boldsymbol M^{-1}\) is positive definite), so \(\boldsymbol\lambda = (\boldsymbol\Gamma\boldsymbol M^{-1}\boldsymbol\Gamma^T)^{-1}\boldsymbol y\) and \[ \boldsymbol x^\star \;=\; \underbrace{\boldsymbol M^{-1}\boldsymbol\Gamma^T(\boldsymbol\Gamma\boldsymbol M^{-1}\boldsymbol\Gamma^T)^{-1}}_{\textstyle \bar{\boldsymbol\Gamma}}\,\boldsymbol y . \] This \(\bar{\boldsymbol\Gamma}\) is the \(\boldsymbol M\)-weighted (dynamically consistent) generalized inverse. Now read off the self-motion content of \(\boldsymbol x^\star\). 
Because \(\boldsymbol x^\star = \boldsymbol M^{-1}\boldsymbol\Gamma^T\boldsymbol\lambda\), \[ v_n \;=\; \boldsymbol z_a^T\boldsymbol x^\star \;=\; \frac{\hat{\boldsymbol k}^T\boldsymbol M}{\hat{\boldsymbol k}^T\boldsymbol M\hat{\boldsymbol k}}\,\boldsymbol M^{-1}\boldsymbol\Gamma^T\boldsymbol\lambda \;=\; \frac{\hat{\boldsymbol k}^T\boldsymbol\Gamma^T\boldsymbol\lambda}{\hat{\boldsymbol k}^T\boldsymbol M\hat{\boldsymbol k}} \;=\; \frac{(\boldsymbol\Gamma\hat{\boldsymbol k})^T\boldsymbol\lambda}{\hat{\boldsymbol k}^T\boldsymbol M\hat{\boldsymbol k}} \;=\; 0 , \] where \(\boldsymbol M\) and \(\boldsymbol M^{-1}\) cancel in the second step and the last step uses \(\boldsymbol\Gamma\hat{\boldsymbol k} = \boldsymbol 0\) (Proposition 1). So the least-kinetic-energy reconstruction automatically satisfies \(v_n = 0\): it is the section of §5.2. \(\blacksquare\)
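The Claim can be exercised numerically with a short sketch. Everything model-specific here is an assumption: `M` is a random SPD stand-in for the inertia matrix, `Gamma` is a random \(12\times 13\) stand-in for the task map, `k_hat` is computed as a unit vector spanning its kernel, and `min_energy_reconstruction` is a hypothetical helper name. The function follows the Lagrangian solution above, using linear solves rather than explicit inverses.

```python
import numpy as np

rng = np.random.default_rng(1)
n, m = 13, 12

# Assumptions: random SPD "inertia" and random full-row-rank task map.
A = rng.standard_normal((n, n))
M = A @ A.T + n * np.eye(n)
Gamma = rng.standard_normal((m, n))

# Unit vector spanning ker(Gamma): the last column of a complete QR of
# Gamma^T is orthogonal to every row of Gamma.
Q, _ = np.linalg.qr(Gamma.T, mode="complete")
k_hat = Q[:, -1]


def min_energy_reconstruction(M, Gamma, y):
    """x* = M^{-1} Gamma^T (Gamma M^{-1} Gamma^T)^{-1} y."""
    Minv_Gt = np.linalg.solve(M, Gamma.T)       # M^{-1} Gamma^T
    lam = np.linalg.solve(Gamma @ Minv_Gt, y)   # Lagrange multiplier
    return Minv_Gt @ lam


def kinetic_energy(M, x):
    return 0.5 * x @ M @ x


y = rng.standard_normal(m)
x_star = min_energy_reconstruction(M, Gamma, y)

z_a = M @ k_hat / (k_hat @ M @ k_hat)
v_n = z_a @ x_star

# Any other solution of Gamma x = y differs by a multiple of k_hat and
# carries strictly more kinetic energy.
x_other = x_star + 0.3 * k_hat

print(f"constraint residual : {np.linalg.norm(Gamma @ x_star - y):.2e}")
print(f"v_n on x_star       : {v_n:.2e}")
print(f"energy gap          : {kinetic_energy(M, x_other) - kinetic_energy(M, x_star):.4f}")
```

The energy gap equals \(\tfrac12 (0.3)^2\,\hat{\boldsymbol k}^T\boldsymbol M\hat{\boldsymbol k}\) because the cross-term vanishes on the \(\{v_n = 0\}\) section.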

Why lstsq injected the ghost, in one line. The production lstsq/min-norm path solves the same constrained problem but with the Euclidean objective \(\tfrac12\boldsymbol x^T\boldsymbol x\) in place of \(\tfrac12\boldsymbol x^T\boldsymbol M\boldsymbol x\). Repeating the computation with \(\boldsymbol M \to \boldsymbol E\) (the identity) gives \(\boldsymbol x_E = \boldsymbol\Gamma^T(\boldsymbol\Gamma\boldsymbol\Gamma^T)^{-1}\boldsymbol y\), and the orthogonality it enforces is \(\hat{\boldsymbol k}^T\boldsymbol x_E = 0\): Euclidean-orthogonal to \(\hat{\boldsymbol k}\), not \(\boldsymbol M\)-orthogonal. But \(v_n = \hat{\boldsymbol k}^T\boldsymbol M\boldsymbol x_E/(\hat{\boldsymbol k}^T\boldsymbol M\hat{\boldsymbol k})\), and Euclidean orthogonality does not imply \(\hat{\boldsymbol k}^T\boldsymbol M\boldsymbol x_E = 0\). The two metrics disagree by exactly the inertia weighting, and that disagreement is the non-decaying \(v_n\) we measured (mean \(|v_n| = 0.157\)). The fix was never a damping hack; it was using the right inner product.
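The metric mismatch is easy to reproduce on synthetic data. The sketch below does not reproduce the production path or the measured \(0.157\); it assumes a random SPD `M`, a random \(12\times 13\) `Gamma`, and a hypothetical helper `v_n_of`, and uses `numpy.linalg.lstsq`, which returns the minimum Euclidean-norm solution for an underdetermined full-row-rank system.

```python
import numpy as np

rng = np.random.default_rng(3)
n, m = 13, 12

# Assumptions: random SPD "inertia" and random full-row-rank task map.
A = rng.standard_normal((n, n))
M = A @ A.T + n * np.eye(n)
Gamma = rng.standard_normal((m, n))

# Unit vector spanning ker(Gamma), via complete QR of Gamma^T.
Q, _ = np.linalg.qr(Gamma.T, mode="complete")
k_hat = Q[:, -1]

y = rng.standard_normal(m)


def v_n_of(x):
    """Self-motion coordinate v_n = k^T M x / (k^T M k)."""
    return (k_hat @ M @ x) / (k_hat @ M @ k_hat)


# Euclidean min-norm reconstruction (what lstsq returns here).
x_E, *_ = np.linalg.lstsq(Gamma, y, rcond=None)

# M-weighted (minimum-kinetic-energy) reconstruction.
Minv_Gt = np.linalg.solve(M, Gamma.T)
x_M = Minv_Gt @ np.linalg.solve(Gamma @ Minv_Gt, y)

print(f"k^T x_E (Euclidean orthogonality): {k_hat @ x_E:.2e}")
print(f"v_n on x_E (generally nonzero)   : {v_n_of(x_E):.4f}")
print(f"v_n on x_M                       : {v_n_of(x_M):.2e}")

# Both solve Gamma x = y, so they differ only along k_hat; removing
# the v_n component of x_E lands on the M-weighted section.
x_E_corrected = x_E - v_n_of(x_E) * k_hat
```

Since both reconstructions satisfy the same constraint, `x_E_corrected` coincides with `x_M`: the ghost is exactly the \(\hat{\boldsymbol k}\)-component that the Euclidean metric fails to see.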

The moral. M4 and M5 are one idea. The covector \(\boldsymbol z_a\) (M5) is the measurement of self-motion in the kinetic-energy metric; the reconstruction section (M4) is the selection of generalized velocity in the same metric. They share the matrix \(\boldsymbol M\) because they are the dual statements of one variational principle — minimum kinetic energy subject to the task constraint. Present them together and the whole 7-DOF story has a single spine.


5 · The citation, told honestly

The temptation is to attribute the inertia weighting to “Khatib eq. 18.” That is the wrong equation, and the looser attribution invites a referee to check. Here is the accurate account, from reading Khatib (1987) §VI–VII directly.

What is genuinely yours. Khatib’s \(\bar{\boldsymbol J}\) is a \(7\times 6\) matrix object he never computes — by his own summary (p. 52) the scheme “avoids the explicit evaluation of any generalized inverse or pseudo-inverse.” Two things are your own contribution:

  1. The scalar covector \(\boldsymbol z_a\), the specialization to a one-dimensional null space, derived in two lines from its defining properties (§3 above) rather than lifted as a quoted matrix formula. This is the right thing to show rather than cite.
  2. You compute the section explicitly and reconstruct on it every step, where Khatib uses \(\bar{\boldsymbol J}\) only as a derivation device. Same object, opposite implementation stance — worth a sentence if a referee asks why your treatment looks different from his.

6 · Edits this note implies

If you agree with the above, the concrete changes are small and local:

  1. derivation_7dof.md §4 — replace “Among them, take the inertia-weighted choice …” with the two-requirement Proposition of §3 (it is shorter than the current three-property justification and strictly stronger: uniqueness, not preference). Keep §4(a)–(c) as the consequences of the now-forced \(\boldsymbol z_a\).
  2. dynamics_modifications_7dof.md §3.3 — the line “This inertia weighting is the same dynamically-consistent device as Giordano’s thesis equation (5.22)” is fine, but add the one-sentence variational characterization (§4 above) so M4 and M5 read as one principle; and correct the Khatib pointer to eqs. 51–52.
  3. §5.1 of dynamics_modifications already says “the min-norm convention was itself the ghost injector” — strengthen it with the one-line metric argument from §4 (Euclidean vs \(\boldsymbol M\) orthogonality), which makes the claim a proof rather than an assertion.
  4. References note — add the eq-52 pointer, Featherstone–Khatib 1997 for the term, and the 1980/1983 origin.

None of this touches deriv_7dof.tex.