Guarantee symmetric, positive definite matrix within cost function


Let's say I want to calculate a positive definite, symmetric matrix $\Sigma$ by minimizing a cost function $f$:

$$ \min_{\Sigma} f(\Sigma) $$

In a regular optimization solver I would define the elements of the upper triangle of $\Sigma$ as the variables to be optimized, in order to get a symmetric $\Sigma$. However, in addition to symmetry, I also want $\Sigma$ to be positive definite. This is difficult to formulate, as many solvers allow bounds on the optimization variables, but not an arbitrary feasibility constraint.
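A minimal sketch of that upper-triangle parameterization (assuming NumPy; the helper name and the parameter values are made up for illustration). Symmetry comes for free, but nothing prevents the result from having negative eigenvalues:

```python
import numpy as np

def sym_from_upper(params, n):
    """Fill an n x n symmetric matrix from its (n^2+n)/2 upper-triangle entries."""
    sigma = np.zeros((n, n))
    iu = np.triu_indices(n)
    sigma[iu] = params
    # Mirror the upper triangle into the lower one (don't double the diagonal).
    sigma = sigma + sigma.T - np.diag(np.diag(sigma))
    return sigma

n = 3
params = np.arange(1.0, n * (n + 1) / 2 + 1)   # 6 free variables for n = 3
sigma = sym_from_upper(params, n)

assert np.allclose(sigma, sigma.T)             # symmetric by construction...
assert np.linalg.eigvalsh(sigma).min() < 0     # ...but not positive definite here
```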

So my idea is not to optimize the elements of $\Sigma$ directly, but to consider

$$ \Sigma = Q \cdot \Lambda \cdot Q^T $$

where $Q$ is the matrix of eigenvectors and $\Lambda$ is the diagonal matrix of eigenvalues of $\Sigma$ (https://en.wikipedia.org/wiki/Eigendecomposition_of_a_matrix#Real_symmetric_matrices) and to optimize the non-zero elements in $Q$ and $\Lambda$.

Now I can formulate the constraint for the solver that the diagonal elements of $\Lambda$ must be larger than 0. From what I gathered about the Cholesky decomposition, the fact that I am using $Q$ together with $Q^T$ already guarantees symmetry, so I do not need to consider that further. Using the above formula, I can replace each occurrence of $\Sigma$ in $f$ with $Q \cdot \Lambda \cdot Q^T$.
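A sketch of this reparameterization, assuming NumPy and treating $Q$ as a general $n \times n$ matrix whose entries are free variables, with the eigenvalues kept positive by solver bounds (the concrete values below are illustrative only):

```python
import numpy as np

def sigma_from_params(q_flat, lam, n):
    """Rebuild Sigma = Q diag(lam) Q^T; symmetric by construction."""
    Q = q_flat.reshape(n, n)
    return Q @ np.diag(lam) @ Q.T

n = 3
rng = np.random.default_rng(0)
q_flat = rng.standard_normal(n * n)    # n^2 free variables for Q
lam = np.array([0.5, 1.0, 2.0])        # n eigenvalues, bounded > 0 by the solver

sigma = sigma_from_params(q_flat, lam, n)
assert np.allclose(sigma, sigma.T)               # symmetry for free
# Positive definite as long as Q has full rank (almost surely for random Q):
assert np.all(np.linalg.eigvalsh(sigma) > 0)
```

Note that with an unconstrained $Q$ the product $Q \Lambda Q^T$ is positive definite only if $Q$ is invertible; an orthogonality constraint on $Q$ (as in a true eigendecomposition) rules out the rank-deficient case.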

The only caveat is that I now have more optimization variables than before: if $\Sigma$ has size $n \times n$, I previously had $(n^2+n)/2$ optimization variables, but now I have $n^2 + n$.

I tried to find out as much as possible before asking and got the impression that I had already answered all my questions myself. However, I am not sure whether my reasoning is sound, so my remaining question is: am I right, or did I make a mistake somewhere?

1 Answer

When you do the eigendecomposition of a PSD matrix, $Q$ is an orthogonal matrix (remember that the columns of $Q$ are real, orthonormal eigenvectors of $\Sigma$). This orthonormality condition ties the columns of $Q$ together, so they cannot be chosen independently. Therefore, $Q$ does not contribute $n^2$ independent variables to the optimization. In more detail, if you fix the first $i-1$ columns of $Q$, the $i$'th column must be orthogonal to all of them and have unit norm, which leaves only $n-i$ of its elements free. This gives you $\sum_{i=1}^{n}(n-i) = n(n-1)/2$ free elements in $Q$ plus $n$ positive eigenvalues to choose, which matches the $n(n+1)/2$ independent elements you had before.
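One way to realize this counting concretely (a sketch, assuming SciPy is available; the parameter values are arbitrary): an orthogonal $Q$ can be generated from exactly $n(n-1)/2$ free parameters as $Q = \exp(A)$ with $A$ skew-symmetric, which is one of several possible parameterizations, not the only one.

```python
import numpy as np
from scipy.linalg import expm

def orthogonal_from_params(theta, n):
    """Map n(n-1)/2 parameters to an orthogonal matrix via expm of a skew-symmetric A."""
    A = np.zeros((n, n))
    A[np.triu_indices(n, k=1)] = theta   # strict upper triangle: n(n-1)/2 entries
    A = A - A.T                          # skew-symmetric: A^T = -A
    return expm(A)                       # expm of a skew-symmetric matrix is orthogonal

n = 4
theta = np.linspace(0.1, 0.6, n * (n - 1) // 2)   # 6 free parameters for n = 4
Q = orthogonal_from_params(theta, n)

assert np.allclose(Q @ Q.T, np.eye(n))            # columns are orthonormal
# Together with n positive eigenvalues: n(n-1)/2 + n = n(n+1)/2 parameters in total.
```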