Application of Conjugate Gradient Method to non-symmetric matrices


I am currently working on a problem in which I use the Conjugate Gradient method to compute the steady-state solution of a continuous-time Markov chain, applying the algorithm found in Stewart's text on the numerical solution of Markov chains. All the literature I have found asserts that the Conjugate Gradient method only works for symmetric positive definite (SPD) matrices. My generator matrices are not SPD at all, yet the Conjugate Gradient method still converges to the correct solution. How?
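For concreteness, here is a toy version of the setup (a small made-up generator, not my actual model), showing that $Q$ is non-symmetric while the steady state is still well defined:

```python
import numpy as np

# Toy CTMC generator matrix Q: nonnegative off-diagonal rates, rows sum to zero.
Q = np.array([[-3.0,  2.0,  1.0],
              [ 1.0, -4.0,  3.0],
              [ 2.0,  2.0, -4.0]])

assert not np.allclose(Q, Q.T)        # clearly not symmetric

# The steady state pi solves pi Q = 0 subject to sum(pi) = 1.
# Replace one of the (redundant) equations by the normalization constraint:
A = np.vstack([Q.T[:-1], np.ones(3)])
b = np.array([0.0, 0.0, 1.0])
pi = np.linalg.solve(A, b)            # for this Q, pi = [1/3, 1/3, 1/3]
```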


Best Answer

The classical conjugate gradient method (CG) for linear algebraic systems is equivalent to applying the Lanczos algorithm to the given matrix with the starting vector given by the (normalized) residual of the initial approximation. The coordinates of the approximate solution in the orthonormal basis generated by Lanczos are obtained by solving a certain linear system with a symmetric tridiagonal matrix. The CG method in effect computes an incrementally updated Cholesky factorization of this tridiagonal matrix. So, whether or not CG works for a given system depends on two factors:
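This equivalence is easy to check numerically. The sketch below (my own notation, plain NumPy, not taken from any particular text) runs $k$ steps of Lanczos on $(A, r_0)$ and recovers the $k$-th CG iterate by solving the small tridiagonal system; the Galerkin condition $V_k^T r_k = 0$ then holds:

```python
import numpy as np

def lanczos(A, r0, k):
    """k steps of Lanczos: orthonormal basis V of the Krylov subspace
    and the symmetric tridiagonal matrix T = V^T A V (short recurrences)."""
    n = len(r0)
    V = np.zeros((n, k))
    alpha = np.zeros(k)
    beta = np.zeros(k)                    # beta[0] is unused
    V[:, 0] = r0 / np.linalg.norm(r0)
    for j in range(k):
        w = A @ V[:, j]
        alpha[j] = V[:, j] @ w
        w -= alpha[j] * V[:, j]
        if j > 0:
            w -= beta[j] * V[:, j - 1]    # only the two previous vectors appear
        if j + 1 < k:
            beta[j + 1] = np.linalg.norm(w)
            V[:, j + 1] = w / beta[j + 1]
    T = np.diag(alpha) + np.diag(beta[1:], 1) + np.diag(beta[1:], -1)
    return V, T

rng = np.random.default_rng(0)
M = rng.standard_normal((6, 6))
A = M @ M.T + 6 * np.eye(6)               # symmetric positive definite
b = rng.standard_normal(6)
x0 = np.zeros(6)
r0 = b - A @ x0

k = 4
V, T = lanczos(A, r0, k)
# CG's k-th iterate: x0 + V y, where T y = ||r0|| e_1
y = np.linalg.solve(T, np.linalg.norm(r0) * np.eye(k)[:, 0])
x_k = x0 + V @ y
```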

  1. One should check whether the Lanczos algorithm applied to $A$ (the given matrix) and $r_0$ (the initial residual) generates an orthonormal basis of $\mathrm{span}\{r_0,Ar_0,A^2r_0,\ldots\}$ using short recurrences. This is guaranteed to happen if $A$ is symmetric (Hermitian in the complex case).
  2. Next, for the CG method to work, the Cholesky factorization (or, more precisely, the non-pivoted $LDL^T$ factorization) of the tridiagonal matrix generated by the Lanczos method must exist. Its non-existence is related to the fact that if a symmetric/Hermitian matrix is indefinite, the coefficients $\alpha_k$ in the CG implementation can be undefined (as $x^TAx$ may be zero for some nonzero $x$). The non-pivoted $LDL^T$ factorization is, however, guaranteed to exist if the matrix is positive definite.
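Point 2 can be seen on a two-dimensional toy example: for a symmetric indefinite matrix, the very first CG coefficient $\alpha_0 = r_0^Tr_0/(p_0^TAp_0)$ can already be undefined (the helper name below is mine):

```python
import numpy as np

def cg_step_curvature(A, b, x0):
    """Curvature p^T A p of the first CG search direction,
    i.e. the denominator of the first CG coefficient alpha_0."""
    r = b - A @ x0
    p = r.copy()                  # first search direction is the residual
    return p @ (A @ p)

# Symmetric but indefinite matrix: the curvature of the first direction
# is exactly 0, so alpha_0 is undefined and CG breaks down immediately.
A_indef = np.array([[1.0,  0.0],
                    [0.0, -1.0]])
b = np.array([1.0, 1.0])
print(cg_step_curvature(A_indef, b, np.zeros(2)))   # 0.0
```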

Consequently, the CG method is guaranteed to work when $A$ is symmetric/Hermitian and positive definite. There are, however, also some other classes of matrices for which short recurrences generating orthonormal bases of Krylov subspaces exist; see this paper.

There are other approaches which involve the words "conjugate gradient" and work for indefinite and even non-symmetric (non-Hermitian) problems. The simplest one is to consider, instead of $Ax=b$, the equivalent system of normal equations $A^TAx=A^Tb$ and apply the classical CG method to this system with the positive definite matrix $A^TA$ (assuming $A$ is nonsingular). This approach is usually discouraged, though, since the condition number of $A^TA$ equals the square of that of $A$, so one may expect slower convergence of CG on the normal equations.
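A minimal sketch of this normal-equations approach (often called CGNR), with a hand-rolled plain CG; the implementation details here are mine, not reference code:

```python
import numpy as np

def cg(A, b, tol=1e-10, maxiter=500):
    """Plain conjugate gradient for a symmetric positive definite A."""
    x = np.zeros_like(b)
    r = b - A @ x
    p = r.copy()
    rs = r @ r
    for _ in range(maxiter):
        Ap = A @ p
        alpha = rs / (p @ Ap)         # well defined since A is SPD
        x += alpha * p
        r -= alpha * Ap
        rs_new = r @ r
        if np.sqrt(rs_new) < tol:
            break
        p = r + (rs_new / rs) * p
        rs = rs_new
    return x

rng = np.random.default_rng(1)
A = rng.standard_normal((8, 8)) + 8 * np.eye(8)   # non-symmetric, nonsingular
b = rng.standard_normal(8)

# CGNR: apply CG to the SPD normal equations A^T A x = A^T b.
x = cg(A.T @ A, A.T @ b)
```

Forming $A^TA$ explicitly is done here only for brevity; in practice one applies $A$ and $A^T$ as operators and never forms the product.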

One can also consider the so-called generalized conjugate gradient method, which is actually based on replacing the Lanczos algorithm with the Arnoldi algorithm and its long recurrences. This algorithm works (when properly implemented) for any nonsingular $A$ and is very closely related to the standard GMRES method.
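As an illustration (my own minimal implementation of full GMRES without restarts), the Arnoldi loop below orthogonalizes against all previous basis vectors, which is exactly the long recurrence; symmetry of $A$ is nowhere required:

```python
import numpy as np

def gmres_full(A, b, k):
    """Minimal full GMRES: Arnoldi (long recurrence) plus a small
    Hessenberg least-squares problem. Zero initial guess assumed."""
    n = len(b)
    V = np.zeros((n, k + 1))
    H = np.zeros((k + 1, k))
    beta = np.linalg.norm(b)
    V[:, 0] = b / beta
    for j in range(k):
        w = A @ V[:, j]
        for i in range(j + 1):            # long recurrence: orthogonalize
            H[i, j] = V[:, i] @ w         # against *all* previous vectors
            w -= H[i, j] * V[:, i]
        H[j + 1, j] = np.linalg.norm(w)
        if H[j + 1, j] > 1e-14:           # guard against "happy breakdown"
            V[:, j + 1] = w / H[j + 1, j]
    # Minimize || beta e_1 - H y || over the Krylov subspace.
    e1 = np.zeros(k + 1)
    e1[0] = beta
    y, *_ = np.linalg.lstsq(H, e1, rcond=None)
    return V[:, :k] @ y

rng = np.random.default_rng(2)
A = rng.standard_normal((8, 8)) + 8 * np.eye(8)   # non-symmetric, nonsingular
b = rng.standard_normal(8)
x = gmres_full(A, b, 8)                           # k = n: exact in theory
```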

Without further information I cannot provide more details, since I do not know which algorithm from Stewart's text you mean.