@Victorliu specified in a comment for this question: "Block diagonalizing two matrices simultaneously is equivalent to finding invariant subspaces common to both matrices". I have a few questions about this:
- I am curious and want to know more about this equivalence in detail. Is there a reference for this theorem/equivalence?
- There is a solution to this question by @JoonasIlmavirta; however, it is not clear to me how to find such invariant subspaces. Could you please elaborate or point to a reference?
- I am interested in reading papers (or any other reference) that treat block diagonalization of a matrix in detail. It would be great if you could suggest one.
Thanks.
I don't know of a reference, but here is a way to see the equivalence. When you write a matrix in block form, you decompose the underlying space(s) as a direct sum. Writing a matrix in a given basis can also be seen this way: the summands are one-dimensional, corresponding to the chosen basis vectors.
If the domain is $\mathbb R^n=D_1\oplus D_2\oplus\dots\oplus D_k$ and the target is $\mathbb R^l=T_1\oplus\dots\oplus T_m$ (as direct sums of subspaces), then an $l\times n$ matrix $A$ can be written as an $m\times k$ block matrix. Since zero-dimensional summands $D_i$ or $T_i$ should be excluded (they're just silly), we have $l\geq m$ and $n\geq k$, but there are no other constraints.
It makes sense to say that the matrix $A$ is block-diagonal if $k=m$ and all off-diagonal blocks are zero. Observe that $k=m$ does not imply $n=l$. Considering $A$ as a mapping $\mathbb R^n\to\mathbb R^l$, this is the same as requiring that $A(D_i)\subset T_i$ for all $i$. If $A(D_i)\not\subset T_i$, then the component of $A(D_i)$ in $T_j$ would be non-zero for some $j\neq i$, meaning that the block at position $(j,i)$ is non-zero.
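To make this concrete, here is a small non-square example of my own (not from the original question) with $k=m=2$ but $n=3\neq l=2$:

```latex
% D_1 = span(e_1, e_2), D_2 = span(e_3) in R^3;
% T_1 = span(e_1), T_2 = span(e_2) in R^2.
% The blocks have sizes 1x2 and 1x1, and both off-diagonal blocks vanish.
\[
A =
\left(\begin{array}{cc|c}
1 & 2 & 0 \\ \hline
0 & 0 & 3
\end{array}\right),
\qquad
A(D_1) \subset T_1, \quad A(D_2) \subset T_2 .
\]
```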
If you have a square matrix, it is often most convenient to choose $D_i=T_i$ for every $i$, and this is what is typically meant by a block matrix. (Observe that if you change the basis of a matrix, you apply the same change on both sides of the matrix.) In this block structure, block-diagonality means that $A(D_i)\subset D_i$, which means that the space $D_i$ is an invariant subspace for $A$. That is, block-diagonalization amounts to finding subspaces $D_i$ so that the original space is the direct sum of them and $A(D_i)\subset D_i$ for all $i$.
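A square example of my own, for comparison, with $\mathbb R^3 = D_1\oplus D_2$:

```latex
% D_1 = span(e_1, e_2), D_2 = span(e_3).
% Both subspaces are invariant, and A is block-diagonal with
% a 2x2 block and a 1x1 block.
\[
A =
\left(\begin{array}{cc|c}
1 & 1 & 0 \\
0 & 2 & 0 \\ \hline
0 & 0 & 3
\end{array}\right),
\qquad
A(D_1) \subset D_1, \quad A(D_2) \subset D_2 .
\]
```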
If you have two matrices that are simultaneously block-diagonal, they both have to satisfy the block-diagonality condition with respect to the same decomposition. (Simultaneity means precisely that the same basis works for both.) That is, two $n\times n$ matrices $A$ and $B$ are simultaneously block-diagonalized by the decomposition $\mathbb R^n=D_1\oplus\dots\oplus D_k$ if and only if all the subspaces are invariant for both: $A(D_i)\subset D_i$ and $B(D_i)\subset D_i$ for all $i$.
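As a numerical sanity check, here is a short NumPy sketch (my own construction, not from the question): it builds two $4\times 4$ matrices that share the invariant subspaces $D_1$ and $D_2$ spanned by the first two and last two columns of a basis matrix $S$, and verifies that conjugating by $S$ makes both matrices block-diagonal at once.

```python
import numpy as np

# Hypothetical 4x4 example: build A and B sharing the invariant
# subspaces D1 = span(s1, s2) and D2 = span(s3, s4), where s_i are
# the columns of a (generically invertible) random basis matrix S.
rng = np.random.default_rng(0)
S = rng.standard_normal((4, 4))

def block_diag(top, bottom):
    """Assemble a 4x4 block-diagonal matrix from two 2x2 blocks."""
    Z = np.zeros((2, 2))
    return np.block([[top, Z], [Z, bottom]])

A = S @ block_diag(rng.standard_normal((2, 2)),
                   rng.standard_normal((2, 2))) @ np.linalg.inv(S)
B = S @ block_diag(rng.standard_normal((2, 2)),
                   rng.standard_normal((2, 2))) @ np.linalg.inv(S)

# In the adapted basis, both matrices are block-diagonal: the
# off-diagonal blocks of S^{-1} M S vanish, i.e. M(D_i) is inside D_i.
for M in (A, B):
    blocks = np.linalg.inv(S) @ M @ S
    assert np.allclose(blocks[0:2, 2:4], 0)   # block (1,2) is zero
    assert np.allclose(blocks[2:4, 0:2], 0)   # block (2,1) is zero

# Direct check of invariance: A maps a vector of D1 back into D1.
d1 = S[:, 0]                          # a vector in D1
coeffs = np.linalg.solve(S, A @ d1)   # coordinates of A d1 in basis S
assert np.allclose(coeffs[2:], 0)     # no component along D2
```

The same basis $S$ works for both $A$ and $B$, which is exactly the simultaneity discussed above; a generic pair of matrices shares no such basis.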