Diagonalization of matrix and how to use it


I have a diagonalizable matrix $H = Q\Lambda Q^T$, which is the Hessian of the function used to find the optimal parameter $\theta$. Here $\mu$ is a scalar regularization parameter, and the equation below comes from the first-order optimality condition $H(\theta - \theta^*) + \mu\theta = 0$: $$\theta = (H + \mu I)^{-1}H\theta^*$$ What I don't understand is how, using the diagonalizability of $H$, they arrive at $$\theta = (H + \mu I)^{-1}H\theta^* = Q(\Lambda + \mu I)^{-1}\Lambda Q^T \theta^*$$ I understand that they start with $$\theta = (H + \mu I)^{-1}H\theta^* = (Q\Lambda Q^T + \mu I)^{-1}Q\Lambda Q^T \theta^*$$ but from here I don't see how this reduces to the right-hand side of the equation above.

Best answer

Since $H$ is a Hessian, it is symmetric, so that $Q$ is orthogonal and $QQ^T=I$.

The key step is that the inverse of $Q(\Lambda + \mu I)Q^T$ can be taken factor by factor: since $Q^TQ = I$, multiplying $Q(\Lambda + \mu I)^{-1}Q^T$ by $Q(\Lambda + \mu I)Q^T$ gives the identity, so these matrices are equal:

$$\left[Q(\Lambda + \mu I )Q^T\right]^{-1} = Q(\Lambda + \mu I )^{-1}Q^T $$
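A quick numerical sanity check of this identity (a sketch using NumPy; the random symmetric matrix and $\mu = 0.5$ are arbitrary choices, not from the original problem):

```python
import numpy as np

rng = np.random.default_rng(0)

# Build a random symmetric "Hessian" and diagonalize it: H = Q @ Lam @ Q.T
A = rng.standard_normal((4, 4))
H = (A + A.T) / 2
eigvals, Q = np.linalg.eigh(H)  # Q is orthogonal because H is symmetric
Lam = np.diag(eigvals)
mu = 0.5

# Left side: invert the full matrix Q (Lam + mu*I) Q^T
lhs = np.linalg.inv(Q @ (Lam + mu * np.eye(4)) @ Q.T)
# Right side: invert only the middle factor and keep Q, Q^T outside
rhs = Q @ np.linalg.inv(Lam + mu * np.eye(4)) @ Q.T

print(np.allclose(lhs, rhs))  # True
```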

So, starting with: $$\theta = (H + \mu I)^{-1}H\theta^*$$ we have $$\begin{aligned} \theta &= (Q\Lambda Q^T + \mu QQ^T)^{-1}H\theta^*\\ &=\left[Q(\Lambda + \mu I )Q^T\right]^{-1}H\theta^*\\ &= Q(\Lambda + \mu I)^{-1} Q^TH\theta^*\\ &= Q(\Lambda + \mu I)^{-1} Q^T Q\Lambda Q^T\theta^*\\ &= Q(\Lambda + \mu I)^{-1} \Lambda Q^T\theta^*.\\ \end{aligned} $$
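The end-to-end identity can be checked the same way (again a sketch: the symmetric $H$, the vector $\theta^*$, and $\mu = 0.3$ are made-up test data):

```python
import numpy as np

rng = np.random.default_rng(1)

# Random symmetric Hessian H = Q Lam Q^T and an arbitrary reference parameter theta*
A = rng.standard_normal((5, 5))
H = (A + A.T) / 2
eigvals, Q = np.linalg.eigh(H)
Lam = np.diag(eigvals)
theta_star = rng.standard_normal(5)
mu = 0.3
I = np.eye(5)

# Direct form: theta = (H + mu*I)^{-1} H theta*
theta_direct = np.linalg.solve(H + mu * I, H @ theta_star)
# Diagonalized form: theta = Q (Lam + mu*I)^{-1} Lam Q^T theta*
theta_diag = Q @ np.linalg.solve(Lam + mu * I, Lam @ (Q.T @ theta_star))

print(np.allclose(theta_direct, theta_diag))  # True
```

Note that solving with the diagonal factor $\Lambda + \mu I$ is trivial (elementwise division by $\lambda_i + \mu$), which is the practical payoff of the diagonalized form.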