I think I found a confusing point in the theoretical analysis of a paper named "Cooperative SGD: A Unified Framework for ...".


Here is the link to the paper; I hope some of you have already read it, since it is cited quite a lot and focuses on convergence analysis in distributed learning. My problem is with Lemma 8 of the paper.

Let me explain the problem briefly.
Let us assume that $W$ is a symmetric doubly stochastic matrix. For example, $W=\begin{bmatrix} 0.3 & 0.7 & 0.0\\ 0.7 & 0.0 & 0.3\\ 0.0 & 0.3 & 0.7 \end{bmatrix}$.
The matrix $W$ satisfies $W\mathbf{1}=\mathbf{1}$ and $W^T=W$. This matrix is called the mixing matrix in the paper. Here, $\mathbf{1}$ denotes the column vector $[1\;1\cdots1\;1]^T$.
Let us also define the matrix $J=\mathbf{1}\mathbf{1}^T/(\mathbf{1}^T\mathbf{1})$. For example, $J=\begin{bmatrix} 1/3 & 1/3 & 1/3\\ 1/3 & 1/3 & 1/3\\ 1/3 & 1/3 & 1/3 \end{bmatrix}$.
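To make the setup concrete, here is a quick NumPy check (my own illustration, not from the paper) that the example $W$ is symmetric and doubly stochastic and that $J$ is the averaging matrix:

```python
import numpy as np

# Example mixing matrix from the question.
W = np.array([[0.3, 0.7, 0.0],
              [0.7, 0.0, 0.3],
              [0.0, 0.3, 0.7]])
one = np.ones(3)

# Symmetric and doubly stochastic: W^T = W and W @ 1 = 1.
assert np.allclose(W, W.T)
assert np.allclose(W @ one, one)

# J = 1 1^T / (1^T 1) is the averaging matrix with all entries 1/m.
J = np.outer(one, one) / (one @ one)
assert np.allclose(J, np.full((3, 3), 1/3))
```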

I hope you understand the notation and definitions for the problem at Lemma 8. Now, let's see the problem.
Since $W$ is a real symmetric matrix, we can $\textit{eigen-decompose}$ it as $W=Q_a{\Lambda}{Q_a}^T$.
Likewise, we can $\textit{eigen-decompose}$ the matrix $J$ as $J=Q_b{\Lambda_0}{Q_b}^T$.

Now, here is the problem.
According to the paper, $Q_a$ and $Q_b$ can be taken to be the same matrix $Q$. Therefore, they write $W-J=Q({\Lambda - \Lambda_0})Q^T$ as equation (76) in the paper.
However, when I tried some examples (like the matrices above), I could not verify this explanation.
I cannot find any proof that $Q_a=Q_b=Q$.

I tried to find papers or lecture notes explaining this, but failed. So, before asking the authors directly, I decided to ask here first.

Could you help me to understand how the $W$ and $J$ can have the same eigenvector matrix $Q$?

There are 2 answers below.

Accepted answer:

Since $WJ=JW$, the two matrices can be diagonalized simultaneously: two real symmetric matrices that commute admit a common orthonormal basis of eigenvectors. This indeed guarantees that there exists an orthogonal $Q$ with $Q^TWQ$ and $Q^TJQ$ both diagonal.
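As a sanity check, here is a short NumPy sketch (my own, using the example matrices from the question) verifying that $W$ and $J$ commute and that the orthonormal eigenbasis of $W$ also diagonalizes $J$:

```python
import numpy as np

W = np.array([[0.3, 0.7, 0.0],
              [0.7, 0.0, 0.3],
              [0.0, 0.3, 0.7]])
m = W.shape[0]
J = np.full((m, m), 1/m)

# The precondition: W and J commute.
assert np.allclose(W @ J, J @ W)

# eigh returns an orthonormal eigenbasis Q of the symmetric matrix W.
eigvals, Q = np.linalg.eigh(W)
assert np.allclose(Q @ np.diag(eigvals) @ Q.T, W)

# The same Q also diagonalizes J. (For this W the eigenvalues are
# distinct, so the eigenbasis is unique up to signs and must work.)
Lambda0 = Q.T @ J @ Q
assert np.allclose(Lambda0, np.diag(np.diag(Lambda0)))
```

Note the asymmetry: calling `eigh` on $J$ directly may return an arbitrary orthonormal basis of its two-dimensional null space, which is why naively comparing the eigendecompositions of $W$ and $J$, as in the question, need not produce the same $Q$, even though a common $Q$ exists.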

Second answer:

The idea is that the matrix $J$ is of rank $1$ and shares its largest eigenvalue and the corresponding eigenvector (namely, the vector $\frac{1}{\sqrt{m}}\mathbf{1}$) with the matrix $W$.

Since the remaining eigenvalues of $J$ are zero, any basis of $\mathbf{1}^\perp$ is a basis for the eigenspace corresponding to $0$, meaning that any orthogonal matrix $M$ having $\frac{1}{\sqrt{m}}\mathbf{1}$ as one of its columns (the remaining columns then span $\mathbf{1}^\perp$) is such that $M^TJM$ is diagonal. The matrix $Q$ defined by $W = Q\Lambda Q^T$ satisfies these hypotheses, meaning that $J = Q\Lambda_0 Q^T$ is indeed an eigen-decomposition of $J$.
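A small NumPy sketch of this argument (my own illustration): build any orthonormal matrix $M$ whose first column is $\frac{1}{\sqrt{m}}\mathbf{1}$, and check that $M^TJM$ is diagonal with a single nonzero entry, namely the eigenvalue $1$.

```python
import numpy as np

m = 3
one = np.ones((m, 1))
J = one @ one.T / m  # rank-1 averaging matrix

# QR of a full-rank matrix whose first column is 1 yields an
# orthonormal M whose first column is +/- 1/sqrt(m); the remaining
# columns are orthogonal to it, hence span 1^perp.
rng = np.random.default_rng(0)
A = np.hstack([one, rng.standard_normal((m, m - 1))])
M, _ = np.linalg.qr(A)

# M^T J M = diag(1, 0, ..., 0): an eigendecomposition of J.
D = M.T @ J @ M
assert np.allclose(D, np.diag([1.0] + [0.0] * (m - 1)))
```

Since the random columns fed into the QR factorization are arbitrary, this also illustrates that the eigenvector matrix of $J$ is far from unique, which is exactly the freedom the paper exploits to pick $Q_b$ equal to $Q_a$.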