Lagrangian of matrices equation derivation


This is equation (41) from this paper:

A Lagrangian is constructed with a symmetric matrix multiplier $\Lambda$:

$$L(R, \Lambda)=tr(X^*R^T \tilde{X}^{*T}) - \frac{1}{2} tr(\Lambda^T(R^TR - I_K))$$

where $R$ is orthonormal ($R^TR = I_K$), and $X^*, \tilde{X}^* \in \mathbb{R}^{N \times K}$ each have exactly one 1 per row and zeros elsewhere (e.g. [1 0 0; 0 0 1; 0 1 0]).

It then says the optimum $(R^*, \Lambda^*)$ must satisfy

$$L_R = \tilde{X}^{*T} X^* - R\Lambda = 0$$

However, I can't see how this second equation is derived from the Lagrangian above.

Best answer

For matrix derivatives I like to work with the differential and variations:

The derivative of $L$ with respect to the variation $\delta R$ in $R$ is

$$dL\delta R = \operatorname{tr}\left(X^* \delta R^T \tilde{X}^{*T}\right) - \frac{1}{2}\operatorname{tr}\left(\Lambda^T (\delta R^T R + R^T \delta R)\right),$$

where we've used the linearity of trace and the product rule.

Now let's use the fact that the trace is invariant under transposing to convert all of the $\delta R^T$ to $\delta R$:

$$dL\delta R = \operatorname{tr}\left(\tilde{X}^{*} \delta R X^{*T}\right) - \frac{1}{2}\operatorname{tr}\left(R^T \delta R \Lambda\right) - \frac{1}{2}\operatorname{tr}\left(\Lambda^T R^T \delta R\right).$$

I'd like to rearrange all of the trace terms to look like $\operatorname{tr}(M^T \delta R)$ so that I can rewrite them as the Frobenius product $M : \delta R$. I can do this by invoking invariance of trace under cyclic permutations:

\begin{align*}
dL\delta R &= \operatorname{tr}\left(X^{*T}\tilde{X}^{*} \delta R\right) - \frac{1}{2}\operatorname{tr}\left(\Lambda R^T \delta R\right) - \frac{1}{2}\operatorname{tr}\left(\Lambda^T R^T \delta R\right)\\
&= \left(\tilde{X}^{*T}X^* - \frac{1}{2} R\Lambda^T - \frac{1}{2}R\Lambda \right) : \delta R\\
&= \left(\tilde{X}^{*T}X^* - R\Lambda \right) : \delta R,
\end{align*}

where the last step uses the symmetry of $\Lambda$. At an optimum this must vanish for every variation $\delta R$, so the gradient itself vanishes: $L_R = \tilde{X}^{*T}X^* - R\Lambda = 0$.
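If you want to convince yourself of the result numerically, here is a minimal NumPy sketch that compares $\tilde{X}^{*T}X^* - R\Lambda$ against a central finite-difference gradient of $L$. The sizes `N` and `K`, the random seed, and the helper names (`one_hot_rows`, `lagrangian`, `analytic_grad`, `numeric_grad`) are my own choices for illustration and are not from the paper. `R` is deliberately left generic, since the gradient formula does not depend on $R$ being orthonormal.

```python
import numpy as np

rng = np.random.default_rng(0)
N, K = 6, 3  # illustrative sizes, chosen arbitrarily


def one_hot_rows(n, k, rng):
    """Random n-by-k matrix with exactly one 1 in each row."""
    M = np.zeros((n, k))
    M[np.arange(n), rng.integers(0, k, size=n)] = 1.0
    return M


X = one_hot_rows(N, K, rng)       # plays the role of X^*
Xt = one_hot_rows(N, K, rng)      # plays the role of \tilde{X}^*
R = rng.standard_normal((K, K))   # generic K-by-K matrix (not orthonormal)
A = rng.standard_normal((K, K))
Lam = A + A.T                     # symmetric multiplier Lambda


def lagrangian(R):
    """L(R, Lambda) = tr(X R^T Xt^T) - 1/2 tr(Lambda^T (R^T R - I))."""
    return np.trace(X @ R.T @ Xt.T) - 0.5 * np.trace(Lam.T @ (R.T @ R - np.eye(K)))


def analytic_grad(R):
    """Gradient from the derivation: Xt^T X - R Lambda."""
    return Xt.T @ X - R @ Lam


def numeric_grad(f, R, h=1e-6):
    """Entry-by-entry central finite differences of f at R."""
    G = np.zeros_like(R)
    for i in range(R.shape[0]):
        for j in range(R.shape[1]):
            E = np.zeros_like(R)
            E[i, j] = h
            G[i, j] = (f(R + E) - f(R - E)) / (2 * h)
    return G


# Largest entrywise disagreement between the two gradients.
err = np.max(np.abs(numeric_grad(lagrangian, R) - analytic_grad(R)))
print(err)
```

Because $L$ is quadratic in $R$, central differences carry no truncation error here, so `err` should be at the level of floating-point rounding.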