Showing equivalence of the sum of squared differences

Consider a mean-centered $n \times p$ data matrix $\textbf{X}$ and let $\mathbf{\tilde{X}} = z_1w_1^T$ be the best rank-1 approximation to $\textbf{X}$, where $z_1$ is an $n$-dimensional column vector and $w_1$ is a $p$-dimensional column vector with $w_1^Tw_1 = 1$. Now consider the sum of the squared differences between $\textbf{X}$ and $\mathbf{\tilde{X}}$, defined as $$d = \frac{1}{n-1}\sum_{i=1}^n\sum_{j=1}^p(\textbf{X}_{ij} - \mathbf{\tilde{X}}_{ij})^2$$ I need to show that $d$ is equal to each of the following expressions: $$\frac{1}{n-1}\operatorname{tr}\big((\textbf{X} - \mathbf{\tilde{X}})(\textbf{X} - \mathbf{\tilde{X}})^T\big)$$ $$\frac{1}{n-1}\operatorname{tr}(\textbf{X}^T\textbf{X}) - \lambda_1, \quad \text{where } \lambda_1 \text{ is the largest eigenvalue of } \textbf{S}$$ $$\sum_{i=2}^p\lambda_i, \quad \text{where } \lambda_i,\ i=1,\dots,p, \text{ are the eigenvalues of } \textbf{S}$$ Here $\textbf{S} = \frac{1}{n-1}\textbf{X}^T\textbf{X}$ is the sample covariance matrix. I was able to show that these three expressions are equal to one another, but I do not know how to show that any one of them equals the sum of squared differences $d$. How can I show that?

Best answer

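By the definition of matrix multiplication, $$\big[(A-B)(A-B)^T\big]_{ij} = \sum_k (A-B)_{ik}\big[(A-B)^T\big]_{kj} = \sum_k(A_{ik}-B_{ik})(A_{jk}-B_{jk})$$ Setting $j = i$ and summing over the diagonal entries then gives $$\operatorname{tr}\big((A-B)(A-B)^T\big) = \sum_i \big[(A-B)(A-B)^T\big]_{ii} = \sum_i\sum_k (A_{ik}-B_{ik})^2$$ In your case $A = \textbf{X}$ and $B = \mathbf{\tilde{X}}$, so dividing both sides by $n-1$ shows that $d$ equals the first of your three expressions.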
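
As a numerical sanity check of the trace identity, and of how it lines up with the eigenvalue expressions in the question, here is a minimal sketch in plain Python. The toy data, the helper names (`transpose`, `matmul`, `trace`) and the choice $p = 2$ (so that the eigenvalues of $\textbf{S}$ have a closed form) are assumptions made for illustration only. The sketch also assumes $\textbf{S} = \frac{1}{n-1}\textbf{X}^T\textbf{X}$ and that the best rank-1 approximation is obtained by taking $w_1$ to be the unit eigenvector of $\textbf{S}$ for $\lambda_1$ and $z_1 = \textbf{X}w_1$.

```python
import math

def transpose(m):
    """Transpose a matrix stored as a list of rows."""
    return [list(col) for col in zip(*m)]

def matmul(a, b):
    """Plain matrix product of two list-of-rows matrices."""
    bt = transpose(b)
    return [[sum(x * y for x, y in zip(row, col)) for col in bt] for row in a]

def trace(m):
    """Sum of the diagonal entries of a square matrix."""
    return sum(m[i][i] for i in range(len(m)))

# Hypothetical toy data: n = 4 observations, p = 2 variables.
raw = [[2.0, 1.0], [0.0, 3.0], [-1.0, -2.0], [3.0, 2.0]]
n, p = len(raw), len(raw[0])

# Mean-center each column to obtain X.
means = [sum(col) / n for col in zip(*raw)]
X = [[v - m for v, m in zip(row, means)] for row in raw]

# Assumed definition: S = X^T X / (n - 1), a symmetric 2x2 matrix here.
XtX = matmul(transpose(X), X)
S = [[v / (n - 1) for v in row] for row in XtX]
a, b, c = S[0][0], S[0][1], S[1][1]

# Closed-form eigenvalues of a symmetric 2x2 matrix [[a, b], [b, c]].
half_gap = math.sqrt(((a - c) / 2) ** 2 + b ** 2)
lam1 = (a + c) / 2 + half_gap
lam2 = (a + c) / 2 - half_gap

# Unit eigenvector for lam1; (b, lam1 - a) is valid because b != 0 for this data.
w_raw = [b, lam1 - a]
w_norm = math.hypot(w_raw[0], w_raw[1])
w1 = [v / w_norm for v in w_raw]

# Rank-1 approximation X_tilde = z1 w1^T with z1 = X w1.
z1 = [sum(x * w for x, w in zip(row, w1)) for row in X]
X_tilde = [[z * w for w in w1] for z in z1]

# d: entrywise sum of squared differences, scaled by 1 / (n - 1).
D = [[x - xt for x, xt in zip(rx, rt)] for rx, rt in zip(X, X_tilde)]
d = sum(v * v for row in D for v in row) / (n - 1)

# The three expressions from the question.
expr_trace = trace(matmul(D, transpose(D))) / (n - 1)
expr_minus_lam1 = trace(XtX) / (n - 1) - lam1
expr_tail_sum = lam2  # sum of lambda_i for i = 2..p, with p = 2

for value in (expr_trace, expr_minus_lam1, expr_tail_sum):
    assert math.isclose(d, value, rel_tol=1e-9, abs_tol=1e-12)
```

The first comparison is exactly the identity derived above and holds for any pair of matrices of the same shape. The other two depend on $\mathbf{\tilde{X}}$ being the best rank-1 approximation.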