When reading about conjugate gradient methods, one inevitably stumbles across preconditioning. Usually it reads: the preconditioned problem
$$L^{-1}AL^{-T} \hat{x} = -L^{-1}b, \qquad \hat{x} = L^{T}x,$$
where, in practice, $LL^T \approx A$, converges faster than the non-preconditioned problem
$$ Ax = -b$$
when applying CG. Then you try to read some more and find out that it has something to do with the condition number:
$$\kappa = \frac{|\lambda_{max}(A)|}{|\lambda_{min}(A)|} $$
and if $\kappa$ is close to $1$, CG converges faster. But why?
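To see the effect numerically, here is a toy experiment of my own construction (not from any particular source): plain CG in pure Python, run on two SPD diagonal matrices, one with $\kappa = 10^4$ and one with $\kappa = 1.1$, comparing iteration counts.

```python
# Toy sketch: plain conjugate gradient on two SPD diagonal matrices
# (diagonal so the eigenvalues, and hence kappa, are easy to control).

def cg(A, b, tol=1e-10, max_iter=1000):
    """Solve A x = b for symmetric positive definite A (list of lists).
    Returns (x, number_of_iterations)."""
    n = len(b)
    x = [0.0] * n
    r = b[:]                          # residual b - A x, for x = 0
    p = r[:]                          # first search direction
    rs_old = sum(ri * ri for ri in r)
    for k in range(1, max_iter + 1):
        Ap = [sum(A[i][j] * p[j] for j in range(n)) for i in range(n)]
        alpha = rs_old / sum(p[i] * Ap[i] for i in range(n))
        x = [x[i] + alpha * p[i] for i in range(n)]
        r = [r[i] - alpha * Ap[i] for i in range(n)]
        rs_new = sum(ri * ri for ri in r)
        if rs_new ** 0.5 < tol:
            return x, k
        p = [r[i] + (rs_new / rs_old) * p[i] for i in range(n)]
        rs_old = rs_new
    return x, max_iter

def diag(vals):
    """Diagonal matrix; its eigenvalues are exactly the diagonal entries."""
    n = len(vals)
    return [[vals[i] if i == j else 0.0 for j in range(n)] for i in range(n)]

n = 50
b = [1.0] * n
ill  = diag([1.0 + 9999.0 * i / (n - 1) for i in range(n)])  # kappa = 1e4
well = diag([1.0 + 0.1 * i / (n - 1) for i in range(n)])     # kappa = 1.1

x_ill,  it_ill  = cg(ill, b)
x_well, it_well = cg(well, b)
print("kappa=1e4:", it_ill, "iterations;  kappa=1.1:", it_well, "iterations")
```

The well-conditioned system converges in a handful of iterations, while the ill-conditioned one takes many more, which is exactly the behaviour the literature attributes to $\kappa$.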
The closest I've gotten is Nocedal & Wright, Numerical Optimization (2006), pp. 112-118 ("Rate of Convergence"), which does not provide a proof and, for a non-mathematician, is not very accessible. They basically state that CG converges faster when the eigenvalues of $A$ are clustered, which fits the condition-number idea: if $\lambda_{max} \approx \lambda_{min}$, then $\kappa$ is close to $1$.
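For reference, the bound usually quoted in this context (I believe it also appears in that same section of Nocedal & Wright) is
$$\|x_k - x^*\|_A \;\le\; 2\left(\frac{\sqrt{\kappa}-1}{\sqrt{\kappa}+1}\right)^{k}\|x_0 - x^*\|_A,$$
so for $\kappa \approx 1$ the contraction factor $\frac{\sqrt{\kappa}-1}{\sqrt{\kappa}+1}$ is close to $0$ and the error dies off almost immediately, while for large $\kappa$ it is close to $1$ and progress per iteration is slow.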
Any tips and/or pointers to literature would be appreciated. Thanks for the help.
Things that are helpful, but somewhat unsatisfying:
- Spectral analysis (Krylov sequence): Boyd's lecture on CG and (thank you @LinAlg) Vandenberghe's lecture on CG (the two wrote a book on convex optimization together)
- Maybe there exists no proof of a worst-case convergence bound because it is a difficult problem to deal with, and preconditioning is more of a "trial-and-error" matter that depends on the specifics of the case it is used in.
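(One piece of the spectral-analysis viewpoint that I found clarifying, in case it helps others: the $k$-th CG error satisfies a min-max bound over polynomials,
$$\frac{\|x_k - x^*\|_A}{\|x_0 - x^*\|_A} \;\le\; \min_{\substack{p \in \Pi_k \\ p(0)=1}} \; \max_{\lambda \in \sigma(A)} |p(\lambda)|,$$
where $\Pi_k$ is the set of polynomials of degree at most $k$. So if the eigenvalues fall into $m$ tight clusters, a degree-$m$ polynomial with one root near each cluster makes the right-hand side tiny, and CG essentially converges in $m$ steps. That is exactly the "clustered eigenvalues" statement, and the $\kappa$-based bound above is what you get from this min-max problem when all you know is the interval $[\lambda_{min}, \lambda_{max}]$.)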