I have following optimization problem
\begin{equation}
\underset{B\in \mathbb{R}^{nd}}{\text{min }} \|I_n-BA^T\|_2
\end{equation}
where $d<n$, $A\in\mathbb{R}^{nd}$ and full column rank. It is equivalent to
\begin{equation}
\underset{B\in \mathbb{R}^{nd}}{\text{min }}\underset{\|x\|_2^2=1}{\text{max }}\|x-BA^Tx\|^2_2
\end{equation}
Is it on the right way to change the order of min and max?
i.e.
\begin{equation}
\underset{\|x\|_2^2=1}{\text{max }}\underset{B\in \mathbb{R}^{nd}}{\text{min }}\|x-BA^Tx\|^2_2
\end{equation}
The KKT conditions of inner optimization implies
\begin{equation}
B=xx^TA(A^Txx^TA)^{-1}
\end{equation}
If it's right then I get stuck at the outer optimization. Any help would be appreciated!
Minimization of matrix $2$-norm
391 Views Asked by Bumbble Comm https://math.techqa.club/user/bumbble-comm/detail AtThere are 3 best solutions below
On
Note that $xx^T$ is a rank-one matrix and thus $A^Txx^TA$ will never have an inverse. From the KKT conditions, you will get a linear equation with $B$ involved (which I guess is your source for $B$). But going beyond that requires some careful manipulations/substitutions.
If you are interested in an alternative approach for solving \begin{equation} \underset{B\in \mathbb{R}^{nd}}{\text{min }} \|I_n-BA^T\|_2 \end{equation} it is not hard to rewrite this as \begin{align} \min_{x}||A_rx-b_r||_2 \end{align} where \begin{align} x &= \mathop{vec}(B) \\ A_r &= A \otimes I \\ b_r &= \mathop{vec}(I_n) \end{align} where $vec()$ is the vec-operator, $\otimes$ is the kronecker product. This, then becomes a classical problem related to pseudo-inverse/least squares which you can look up.
On
Let $U \Sigma V^T$ be the SVD of $A$, with $\Sigma$ having the same dimensions as $A$ and hence of the form $\Sigma = \begin{bmatrix} D \\ 0 \end{bmatrix}$ where $D$ is diagonal and invertible.
$I-BA^T= I-B V \begin{bmatrix} D & 0 \end{bmatrix}U^T = U(I-U^TBV \begin{bmatrix} D & 0 \end{bmatrix})U^T$. Since the induced $2$ norm is invariant under rotations we see the problem is equivalent to $\min_C \|I -C \begin{bmatrix} D & 0 \end{bmatrix}\| $ (where $C=BV$).
If we write $C= \begin{bmatrix} C_1 \\ C_2 \end{bmatrix}$, we see that we can write $M=I -C \begin{bmatrix} D & 0 \end{bmatrix} = \begin{bmatrix} I-C_1 D & 0 \\ C_2D & I \end{bmatrix} $.
Note that $M \begin{bmatrix} 0 \\y \end{bmatrix} = \begin{bmatrix} 0 \\y \end{bmatrix} $, and so $\|M\| \ge 1$. If we choose $C= \begin{bmatrix} D^{-1} \\ 0 \end{bmatrix}$, we see that $\|M\| = 1$, hence the $\min$ norm is $1$.
Note, the above is just to justify that the $\min$ norm is one, this is easily attained with $B=0$.
Let $P=I-BA^T$. Since $\operatorname{rank}(BA^T)=d<n$, the product $BA^T$ is singular. Therefore $Px=x$ for some nonzero vector $x$. It follows that $\|P\|_2$ is always bounded below by $1$, and $P$ is nonzero.
This lower bound $1$ is attained by the trivial solution $B=0$, but there is a solution with full column rank. Let $B=A(A^TA)^{-1}$. As $A$ has full column rank, $B$ is well defined and has full column rank. Now $P=I-A(A^TA)^{-1}A^T$ is a nonzero orthogonal projection (i.e. $P^T=P=P^2$). Hence $\|P\|_2=1$.
This $B$ ($=A(A^TA)^{-1}$) is known as the Moore-Penrose pseudo-inverse of $A$.