How does one derive the minimal enclosing ellipsoid for another half-ellipsoid?

74 Views Asked by At

Say we cut the $n$-dimensional hypersphere into two hemispheres along the hyperplane perpendicular to a unit vector $\mathbf{a}$. It can be shown that the smallest ellipsoid that encloses the lower of these hemispheres $$ \{\mathbf{x} \in \text{hypersphere} : \mathbf{x}^T\mathbf{a} \leq 0\} $$ is given by $$ \{\mathbf{x} \in \mathbb{R}^n : (\mathbf{x-r})^T\mathbf{Q}(\mathbf{x-r})\leq 1\} $$ where $\mathbf{Q}$ is a positive definite matrix $$ \mathbf{Q} = (\lambda_1 - \lambda_2)\mathbf{aa}^T + \lambda_2\mathbf{I} $$ $$ \mathbf{r} = - \frac{\mathbf{a}}{n+1} $$ and $\lambda_1$, $\lambda_2$ are constants that only depend on $n$.

The next task is to generalize this to the case where we're not cutting the hypersphere but another ellipsoid. My argument so far is this:

Any positive definite matrix can be decomposed into $\mathbf{Q}=\mathbf{M}^T\mathbf{M}$. We can then map the unit sphere to the ellipsoid using $\mathbf{M}^{-1}$ and the ellipsoid to the sphere using $\mathbf{M}$. We then map the direction vector $\mathbf{a}$ to the unit sphere and use the previous result, giving us the matrix $$ (\lambda_1 - \lambda_2)\mathbf{Maa}^T\mathbf{M}^T + \lambda_2\mathbf{I}\,. $$ However, $\mathbf{Ma}$ may no longer be a unit vector. Correcting for this leads to $$ \begin{split} & (\lambda_1 - \lambda_2)\frac{\mathbf{Maa}^T\mathbf{M}^T}{\sqrt{\mathbf{a}^T\mathbf{M}^T\mathbf{Ma}}\sqrt{\mathbf{a}^T\mathbf{M}^T\mathbf{Ma}}} + \lambda_2\mathbf{I}\\ = & (\lambda_1 - \lambda_2)\frac{\mathbf{Maa}^T\mathbf{M}^T}{\mathbf{a}^T\mathbf{Qa}} + \lambda_2\mathbf{I}\,. \end{split} $$ Finally, since this is only an enclosing ellipsoid for the hypersphere, we have to map this back to the original ellipsoid. The unit sphere can be viewed as the ellipsoid defined by the identity matrix. This is transformed to the ellipsoid defined by $\mathbf{Q}$ by multiplying the identity by $\mathbf{M}^T$ on the left and $\mathbf{M}$ on the right. We can try the same thing here, yielding $$ \begin{split} & \mathbf{M}^T\left((\lambda_1 - \lambda_2)\frac{\mathbf{Maa}^T\mathbf{M}^T}{\mathbf{a}^T\mathbf{Qa}} + \lambda_2\mathbf{I}\right)\mathbf{M}\\ = & (\lambda_1 - \lambda_2)\frac{\mathbf{Q}\mathbf{aa}^T\mathbf{Q}}{\mathbf{a}^T\mathbf{Qa}} + \lambda_2\mathbf{Q}\,. \end{split} $$

Unfortunately, this appears to be wrong. The answer expected in the text is $$ (\lambda_1-\lambda_2)\frac{\mathbf{aa}^T}{\mathbf{aQ}^{-1}\mathbf{a}} + \lambda_2\mathbf{Q} $$

These answers feel quite close to another. If one imagines $\mathbf{Q}$ to be a scalar then one could get from my expression to the expected one by multiplying both the top and bottom of the fraction by $\mathbf{Q}^{-2}$. However, numerical evidence shows that these expressions are not equal in general and I've found examples where the given answer is correct, while mine isn't. A similar derailment takes place with the expression for $\mathbf{r}$. Yet finding the flaw in the argument has proven rather tough.

1

There are 1 best solutions below

0
On BEST ANSWER

The normal of the transformed hyperplane is not $\mathbf M\mathbf a$ but $\mathbf M^{-T}\mathbf a$. This is because if $\mathbf x$ is a point on the original hyperplane and $\mathbf x' = \mathbf M\mathbf x$ is the corresponding point on the transformed hyperplane, then $\mathbf x^T\mathbf a = 0 \iff (\mathbf M^{-1}\mathbf x')^T\mathbf a = 0 \iff (\mathbf x')^T(\mathbf M^{-T}\mathbf a) = 0$.

If you use the correct transformation for the normal, $\mathbf a' = \mathbf M^{-T}\mathbf a$, then you get \begin{align} \mathbf Q' &= \mathbf M^T\left((\lambda_1 - \lambda_2)\frac{\mathbf a'(\mathbf a')^T}{(\mathbf a')^T\mathbf a'} + \lambda_2\mathbf I\right)\mathbf M \\ &= \mathbf M^T\left((\lambda_1 - \lambda_2)\frac{\mathbf M^{-T}\mathbf a\mathbf a^T\mathbf M^{-1}}{\mathbf a^T\mathbf M^{-1}\mathbf M^{-T}\mathbf a} + \lambda_2\mathbf I\right)\mathbf M \\ &= (\lambda_1 - \lambda_2)\frac{\mathbf a\mathbf a^T}{\mathbf a^T(\mathbf M^T\mathbf M)^{-1}\mathbf a} + \lambda_2\mathbf M^T\mathbf M \\ &= (\lambda_1 - \lambda_2)\frac{\mathbf a\mathbf a^T}{\mathbf a^T\mathbf Q^{-1}\mathbf a} + \lambda_2\mathbf Q \end{align} as desired.