Given the ellipsoid
$ (r - r_0)^T Q (r - r_0) = 1$
And a point $A$ anywhere in space, I want to find the closest point $r^*$ on the ellipsoid to $A$.
My attempt:
Define the scalar function:
$F(r) = (r - r_0)^T Q (r - r_0) - 1 $
The gradient of this function at any point $r$ is
$ \nabla F = 2 Q (r - r_0) $
The above gradient vector is perpendicular to the surface of the ellipsoid at point $r$.
Therefore, the closest point $r^*$ on the ellipsoid to point $A$ satisfies
$ (r^* - A) = K Q (r^* - r_0) $
For some constant $K$.
Solving these equations is discussed in my solution below.
Comments, and hints, and alternative solutions, are appreciated.
You would like to solve the optimization problem: $$ \begin{aligned} \|r-A\|^2 \to \min! \qquad \text{s.t.}\qquad (r-r_0)^T Q (r-r_0) = 1. \end{aligned} $$ The KKT conditions for this problem amount to $$ \begin{aligned} r-A = KQ(r-r_0), \qquad (r-r_0)^T Q (r-r_0) = 1. \end{aligned} $$ precisely as you write, where I denoted the Lagrange multiplier associated with the equality constraint by $K$ to be consistent with your notation. Since $Q$ is symmetric and positive definite (as you have an ellipsoid) it admits a spectral decomposition $V\Lambda V^T$, where $V$ is an orthogonal matrix of eigenvectors of $Q$, and $\Lambda$ is a diagonal matrix of $Q$'s eigenvalues (which are positive). Let us substitute this into the first KKT condition: $$ r-A = K V\Lambda V^T (r-r_0). $$ Let us denote the coordinates of $r-r_0$ in the basis of $V$ by $x$, and of $A-r_0$ by $y$, i.e. $$ r-r_0 = VV^T(r-r_0) =: Vx, \qquad A-r_0 = VV^T(A-r_0) =: Vy. $$ We can reshuffle our KKT condition a little: $$ (r-r_0)-(A-r_0) = K V\Lambda V^T (r-r_0), $$ or $$ Vx - KV\Lambda x = Vy, $$ or $$ (I-K\Lambda)x = y, $$ or, component-wise, assuming $\Lambda = \text{diag}(\lambda_i)$ $$ (1-K\lambda_i)x_i = y_i. $$ In particular, if $y_i = 0$ we can e.g. set $K=\lambda_i^{-1}$ and get the equality regardless of $x_i$. If, on the other hand, $y_i \neq 0$ then we cannot put $K=\lambda_i^{-1}$. Thus we obtain the following candidate solutions for $x$, from which $r=r_0+Vx$ is determined.
Case 1. Assuming $K\neq \lambda_i^{-1}$, we get $x_i = [1-K\lambda_i]^{-1}y_i$, and ultimately to a the following scalar equation for $K$ from the second KKT condition: $$ \sum_{i} \frac{\lambda_i}{[1-K\lambda_i]^2}y_i^2 = 1. $$ Note that this equation is unsolvable, when $y=0$.
Case 2. Assume that $\lambda_i = \hat{\lambda}$ for $i\in \hat{I}$, and $y_i = 0$ for all $i\in \hat{I}$. Then then we can set $K=\hat{\lambda}^{-1}$. In this case, we compute other $x$-coordinates as before, i.e., $x_i = [1-K\lambda_i]^{-1}y_i = [1-\hat{\lambda}^{-1}\lambda_i]^{-1}y_i$ for $i\not\in \hat{I}$, while $x_i$ for $i\in \hat{I}$ are undertermined from the first KKT condition. We end up with the equation: $$ \sum_{i\in \hat{I}} \hat{\lambda} x_i^2 + \sum_{i\not\in \hat{I}} \frac{\lambda_i}{[1-\hat{\lambda}^{-1}\lambda_i]^2}y_i^2 = 1, $$ which, when solvable, gives us at least two, or when multiplicity of $\hat{\lambda}$ is more then one, infinitely many candidate solutions $x_i$, $i\in \hat{I}$.
Example 1: Assume we work in dimension $2$, $\Lambda = \text{diag}[1,2]$, $V=I$, $r_0 = [0,0]$, $A=[1/2,1/4]$. In this example $y=A$, where none of its components are $0$. Thus we arrive at the following equation for $K$: $$\frac{1}{4[1-K]^2}+\frac{2}{16[1-2K]^2} = 1.$$ This equation happens to have two solutions, $K_1\approx 0.260161052873378$ and $K_2\approx 1.50787304726301$, see desmos plot here. This gives us two KKT points: $x_1 \approx [0.73166349, 0.681666]$ and $x_2 \approx [-0.984498, -0.12402356]$. The situation is described in this desmos plot.
Example 2: Assume we work in dimension $2$, $\Lambda = \text{diag}[1,2]$, $V=I$, $r_0 = [0,0]$, $A=[1/2,0]$. In this example $y=A$, where $y_2 = 0$. In case 1 we have the the following equation for $K$: $$\frac{1}{4[1-K]^2} = 1.$$ This equation happens to have two solutions, $K_1=1.5$ and $K_2=0.5$, see desmos plot here. Note that $K_2 = \lambda_2^{-1}$, so we will not use it for case 1. Thus case 1 results in 1 KKT point: $x_1 =[-1,0]$.
For case 2 we put $K_2 = \lambda_2^{-1} = 0.5$. We compute $x_{2,1} = y_1/[1-K_2\lambda_1] = 1.0$, and $x_{2,2}$ from the equation $\lambda_1 x_{2,1}^2 + \lambda_2 x_{2,2}^2 = 1$. This gives us a double root $x_{2,2}=0.0$, shown here.
Example 3: Assume we work in dimension $2$, $\Lambda = \text{diag}[1,2]$, $V=I$, $r_0 = [0,0]$, $A=[1/3,0]$. In this example $y=A$, where $y_2 = 0$. In case 1 we have the the following equation for $K$: $$\frac{1}{9[1-K]^2} = 1.$$ This equation happens to have two solutions, $K_1=2/3$ and $K_2=4/3$, see desmos plot here. Thus case 1 results in 2 KKT point: $x_1 =[1,0]$ and $x_2 = [-1,0]$.
For case 2 we put $K_2 = \lambda_2^{-1} = 0.5$. We compute $x_{3,1} =x_{4,1} = y_1/[1-K_2\lambda_1] = 2/3$. We then compute $x_{3,2}$ and $x_{4,2}$ from the equation $\lambda_1 x_{3,1}^2 + \lambda_2 x_{3,2}^2 = 1$. This gives us two roots $x_{3,2}=\sqrt{[1-4/9]/2}\approx 0.5270462766947299$, and $x_{4,2}\approx -0.5270462766947299$ shown here. So in this case we have 4 KKT points.
I hope this illustrates the computations.
Thanks @Chris Lewis.