I study the complex Schur decomposition of a complex matrix $A \in \mathbb{C}^{n \times n}$, that is: $$ A = U T U^H $$ where $T$ is upper-triangular (the eigenvalues of $A$ appear on its diagonal, and may be complex) and $U$ is unitary $U^H = U^{-1}$ (recall that $U^H = U^* = \bar{U}^T $). Such a decomposition always exists.
The standard way to compute the Schur decomposition (at least, the one used in all the sources I found, including Golub & Van Loan 2013, "Matrix Calculations" and Wikipedia) is using the QR algorithm (see Golub & Van Loan, chapter 7, or Wikipedia). Basically, one repeats the following steps:
- Set $T_0 = A$ and $U = I$ (in practice, one starts with reduction of $A$ to an Hessenberg form).
- Compute the QR factorization $T_k = Q_k R_k$.
- Form $T_{k+1} = R_k Q_K = Q_k^H Q_k R_k Q_k = Q_k^H T_k Q_k$.
- Accumulate $U = U Q_k$.
Then, in most cases (and this is an issue I need clarifications about) $T_{k+1} \to T$ converges to the upper-triangular matrix $T$ of the Schur decomposition and $U = \lim_{k \to \infty} Q_0 Q_1 \cdots Q_k$.
In this question the poster quotes a theorem that ensure convergence: let $|\lambda_1| > ... > |\lambda_n | \ge 0$ be the eigenvalues of $A$. Then $$ (T_{k})_{i,j} = O\left(\left(\dfrac{\vert \lambda_{i} \vert}{\vert \lambda_{j} \vert}\right)^k\right)$$ for $i > j$.
This means that if there are $\lambda_{i_1}$ and $\lambda_{i_2}$ such that $|\lambda_{i_1}| = |\lambda_{i_2}|$, or they are very close, $\left| \frac{|\lambda_{i_1}|}{|\lambda_{i_2}|} - 1 \right| < \mathrm{tol}$ (here tol is the computer's numerical tolerance) the matrices $T_k$ will fail to converge to $0$ in the element $(T_k)_{i_1,i_2}$ and thus won't converge to an upper-triangular matrix.
I have written a code to implement the QR algorithm for complex matrices, and encountered this phenomena: a matrix that failed to convergence to upper-triangular with eigenvalues almost the same in their absolute value.
Python's SciPy package has a special function $\texttt{scipy.linalg.schur}$ that computes the Schur decomposition even for such bad matrices. I have three questions:
- How to overcome this problem in the framework of the QR algorithm? Or is it impossible and I am forced to accept that for some matrices the QA algorithm just won't work?
- What are the necessary and/or sufficient condition for the convergence of the QR algorithm to the Schur decomposition of $A \in \mathbb{C}^{n \times n}$?
- How Python's SciPy schur function (which uses LAPACK function at the back stage) overcome this problem and compute Schur decomposition even for such bad matrices?
Related Questions: