Optimize $M$ such that $MM^T$ is the "smallest" (in a positive semi-definite sense)


I want to solve the following problem:

\begin{align} \underset{M}{\text{minimize}} \quad & MM^T \\ \text{subject to} \quad & MF=I, \end{align}

where by minimizing $MM^T$ I mean finding $M$ such that, for any other matrix $\hat M$ satisfying the constraint, $$ MM^T \le \hat M\hat M^T, $$ that is, $MM^T - \hat M\hat M^T$ is negative semi-definite. Here $F$ is a full-rank $n \times p$ matrix with $n>p$, and $I$ is the identity matrix.

How can I translate this "smallest" concept into the objective function of the minimization problem? Should I perhaps minimize some kind of norm of $MM^T$? And after formalizing the problem, what is its solution?

Edit 1: For context, I am working on the minimum-variance unbiased linear estimator problem.

I am given the vector $y$ and a matrix $F$ such that

$$ y = Fx + \epsilon, $$ where $F \in \mathbb{R}^{m \times n}$ ($m>n$) has full rank, $y \in \mathbb{R}^m$, $x \in \mathbb{R}^n$, and $\epsilon$ is a random vector with zero mean and identity covariance matrix $I \in \mathbb{R}^{m \times m}$. I want to estimate $x$.

I want to derive a linear estimator ($\hat x = My$) such that it is unbiased ($E[\hat x]=x$) and its covariance matrix satisfies $$ E[(\hat x - x)(\hat x - x)^T] \le E[(\tilde{x} - x)(\tilde{x} - x)^T] $$ for any unbiased linear estimate $\tilde{x}=\tilde{M}y$.

For this problem, I know that such a linear estimator is unbiased if $MF=I$ (the identity matrix), and that its covariance matrix is $$ E[(\hat x - x)(\hat x - x)^T] = MM^T, $$ and this is the point where I am stuck.
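(For concreteness, here is a quick Monte Carlo sanity check of the identity $E[(\hat x - x)(\hat x - x)^T] = MM^T$; the dimensions, the seed, and the choice $M = F^+$ are arbitrary, not part of the question. Since $MF=I$, we have $\hat x - x = M\epsilon$, so the covariance of $\hat x$ is $M \, E[\epsilon\epsilon^T] \, M^T = MM^T$.)

```python
import numpy as np

rng = np.random.default_rng(0)
m, n = 5, 3
F = rng.standard_normal((m, n))
M = np.linalg.pinv(F)              # one feasible (unbiased) choice: M F = I
assert np.allclose(M @ F, np.eye(n))

# Draw many noise realizations; since M F = I, the error is x_hat - x = M eps.
N = 200_000
eps = rng.standard_normal((N, m))  # zero mean, identity covariance
err = eps @ M.T                    # rows are (x_hat - x) for each realization

emp_cov = err.T @ err / N          # empirical error covariance
assert np.allclose(emp_cov, M @ M.T, atol=0.05)  # matches M M^T up to sampling noise
```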

I know that the solution to this problem is $M=(F^TF)^{-1}F^T$, but all the proofs I found start from this $M$ and then show that it is in fact the minimum-variance unbiased estimator. I want to arrive at this result from first principles, as if I did not know the solution a priori.

Edit 2: For reference, this pdf also discusses the topic using this informal notion of "smallest" and "largest" covariance matrix.

Edit 3: After some numerical tests, I arrived at the conclusion that minimizing $\| M \|_2$ or $\| M \|_F$ will both lead to the desired answer, that is, $M = (F^TF)^{-1}F^T$. Could someone give a formal argument for why this is the case?
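(A numerical experiment along these lines can be made exhaustive over the feasible set: every $M'$ with $M'F=I$ differs from $M = F^+$ by a matrix $W$ with $WF=0$, and such $W$ can be written $W = Y(I - FM)$ because $FM$ is the orthogonal projector onto the column space of $F$. The sketch below checks that no such perturbation decreases either norm; the dimensions and seed are arbitrary.)

```python
import numpy as np

rng = np.random.default_rng(1)
m, n = 6, 3
F = rng.standard_normal((m, n))       # full column rank almost surely
M = np.linalg.pinv(F)                 # candidate: (F^T F)^{-1} F^T
assert np.allclose(M @ F, np.eye(n))  # feasibility

P = np.eye(m) - F @ M                 # projector onto the left null space of F
for _ in range(100):
    Y = rng.standard_normal((n, m))
    M2 = M + Y @ P                    # another feasible matrix: M2 F = I
    assert np.allclose(M2 @ F, np.eye(n))
    # M P = 0, so the perturbation is orthogonal to M in the Frobenius
    # inner product, and (Y P)(Y P)^T only adds a PSD term to M2 M2^T.
    assert np.linalg.norm(M2, "fro") >= np.linalg.norm(M, "fro") - 1e-12
    assert np.linalg.norm(M2, 2) >= np.linalg.norm(M, 2) - 1e-12
```

The comment in the loop is the formal argument in miniature: $\|M'\|_F^2 = \|M\|_F^2 + \|W\|_F^2$ because $MW^T = 0$, and $M'M'^T = MM^T + WW^T \succeq MM^T$, which also forces $\|M'\|_2 \ge \|M\|_2$.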

Edit 4: Some more thoughts on the minimization of $\| M \|_2$ and $\| M \|_F$.

For $\| M \|_2$, we have $$ \| M \|_2 = \sqrt{\lambda_{\max} (MM^T)}. $$ Since $\sqrt{x}$ is monotonically increasing for $x>0$, minimizing $\| M \|_2$ is equivalent to minimizing the maximum eigenvalue of $MM^T$, which, intuitively, makes sense if we want to make $MM^T$ "less" positive semi-definite.

For $\| M \|_F$, we have $$ \| M \|_F = \sqrt{\text{trace}(MM^T)} = \sqrt{ \sum_i \lambda_i(MM^T)}. $$ Using the same argument for the minimization of $\sqrt{ x }$ for $x>0$, minimizing $\| M \|_F$ is equivalent to minimizing the sum of the eigenvalues of $MM^T$ which again, intuitively, makes sense if we want to make $MM^T$ "less" positive semi-definite.
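(Both identities are easy to confirm numerically; the matrix below is random and its dimensions are arbitrary.)

```python
import numpy as np

rng = np.random.default_rng(2)
M = rng.standard_normal((3, 5))
eig = np.linalg.eigvalsh(M @ M.T)   # eigenvalues of M M^T, ascending order

# ||M||_2 = sqrt(largest eigenvalue of M M^T)
assert np.isclose(np.linalg.norm(M, 2), np.sqrt(eig[-1]))
# ||M||_F = sqrt(trace(M M^T)) = sqrt(sum of eigenvalues of M M^T)
assert np.isclose(np.linalg.norm(M, "fro"), np.sqrt(eig.sum()))
```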

Edit 5: Turns out, this problem can be interpreted as a "vector optimization problem". See Example 4.9 of the book "Convex Optimization" by Boyd and Vandenberghe.



Best Answer

In this setting the answer is trivial. Let $F=U\pmatrix{S\\ 0}V^T$ be a singular value decomposition. The constraint $MF=I$ then forces $M$ to be of the form $V\pmatrix{S^{-1}&X}U^T$ for some matrix $X$. Hence $MM^T=V(S^{-2}+XX^T)V^T$, and its unique global minimum (with respect to the positive semidefinite partial ordering) is attained at $X=0$, because $XX^T$ is positive semidefinite and vanishes only when $X=0$. Therefore $M=F^+$, the Moore-Penrose inverse of $F$.
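(This parametrization can be verified numerically; the sketch below uses the question's $m>n$ dimensions, with arbitrary sizes and seed. It constructs $M = V\pmatrix{S^{-1}&X}U^T$, checks feasibility for any $X$, and checks that $X \ne 0$ only adds a PSD term to $MM^T$.)

```python
import numpy as np

rng = np.random.default_rng(3)
m, n = 6, 3
F = rng.standard_normal((m, n))
U, s, Vt = np.linalg.svd(F)           # full SVD: F = U [diag(s); 0] V^T

def make_M(X):
    """Feasible M = V [S^{-1}  X] U^T for an arbitrary n x (m-n) block X."""
    return Vt.T @ np.hstack([np.diag(1 / s), X]) @ U.T

M0 = make_M(np.zeros((n, m - n)))     # X = 0 recovers the pseudoinverse
assert np.allclose(M0, np.linalg.pinv(F))

X = rng.standard_normal((n, m - n))
M1 = make_M(X)
assert np.allclose(M1 @ F, np.eye(n)) # still feasible for X != 0

# M1 M1^T - M0 M0^T = V X X^T V^T should be positive semidefinite.
d = np.linalg.eigvalsh(M1 @ M1.T - M0 @ M0.T)
assert d.min() >= -1e-10
```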