How to prove the correctness of one way of computing $\sum_{i = 1}^N \sum_{j = 1}^M \langle U_i, V_j \rangle^2$


Here $U$ and $V$ are matrices of shapes $N\times P$ and $M \times P$, respectively, and $U_i$ and $V_j$ denote the $i$-th row of $U$ and the $j$-th row of $V$. Writing $\odot$ for element-wise multiplication and $\langle \cdot, \cdot \rangle$ for the inner product, the claim is

$$\sum_{i = 1}^N \sum_{j = 1}^M \langle U_i, V_j \rangle^2 = \sum_{k=1}^P\sum_{\ell=1}^P \left[(U^T U) \odot (V^T V)\right]_{k\ell},$$

where the right-hand sum runs over all $P \times P$ entries. (Since the matrices are real, $U^\dagger = U^T$.)

How can the identity above be proved?

Question background: the left-hand side is called the gravity loss in matrix factorization, a machine learning algorithm for recommendation systems, and the right-hand side is one implementation I have seen, but I cannot fathom how they are equivalent. Although it arises in an ML problem, I believe this is a pure math question and suits this site.

Accepted answer

Denote the $k$-th column of $U$ by $u_k$. Then
$$ (U^TU)_{k\ell} = \langle u_k,u_\ell\rangle = \sum_{i=1}^N u_{ik}u_{i\ell}, $$
and similarly for $V^TV$. Now expand the square of the inner product and exchange the order of summation:
\begin{align*}
\sum_{i=1}^N\sum_{j=1}^M\langle U_i,V_j\rangle^2
&= \sum_{i=1}^N\sum_{j=1}^M\left(\sum_{k=1}^P u_{ik}v_{jk}\right)^2\\
&= \sum_{i=1}^N\sum_{j=1}^M\sum_{k=1}^P\sum_{\ell=1}^P u_{ik}u_{i\ell}v_{jk}v_{j\ell}\\
&= \sum_{k=1}^P\sum_{\ell=1}^P\left(\sum_{i=1}^N u_{ik}u_{i\ell}\right)\left(\sum_{j=1}^M v_{jk}v_{j\ell}\right)\\
&= \sum_{k=1}^P\sum_{\ell=1}^P(U^TU)_{k\ell}(V^TV)_{k\ell}.
\end{align*}
The last line is exactly the sum of all elements of $(U^TU)\odot(V^TV)$, which is the claimed right-hand side.
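The identity is also easy to sanity-check numerically. Below is a minimal sketch using NumPy with small random matrices (the shapes and seed are arbitrary, chosen only for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)
N, M, P = 4, 5, 3  # arbitrary small shapes for the check
U = rng.standard_normal((N, P))
V = rng.standard_normal((M, P))

# Left-hand side: explicit double loop over rows of U and V.
lhs = sum(np.dot(U[i], V[j]) ** 2 for i in range(N) for j in range(M))

# Right-hand side: sum of all elements of (U^T U) ⊙ (V^T V).
rhs = ((U.T @ U) * (V.T @ V)).sum()

print(np.isclose(lhs, rhs))  # the two computations agree up to floating-point error
```

Note that the right-hand side costs $O((N+M)P^2)$ arithmetic instead of the $O(NMP)$ of the double loop, which is why implementations prefer it when $P \ll N, M$.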