Krylov subspace for finding eigenvalues


Given a real square matrix $A$ and a vector $v$, the Krylov subspaces are given by: $$\mathcal K_n(A,v) := \text{span}(v, Av, \cdots, A^{n-1} v)$$ These spaces are known to help solve numerical linear algebra problems like approximating the eigenvalues $\lambda$ given by $Ax = \lambda x$. Can someone explain the core idea behind this?


There are 2 best solutions below


The Krylov methods are separated into four different algorithm groups. (figure: table of the four Krylov algorithm groups)

The way Krylov methods were taught to me was via the Arnoldi iteration; it presupposes, however, that you know about the $QR$ decomposition. First, the matrix $A$ is reduced to Hessenberg form. The point of the Hessenberg reduction is to make every subsequent step much cheaper (a $QR$ step on a Hessenberg matrix costs $O(n^2)$ rather than $O(n^3)$). We want a factorization of the form

$$ A =QHQ^{*} \tag{1}$$
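For concreteness, a factorization of the form (1) can be computed numerically; here is a minimal sketch using SciPy's `hessenberg` routine (the random test matrix is just for illustration):

```python
import numpy as np
from scipy.linalg import hessenberg

rng = np.random.default_rng(0)
A = rng.standard_normal((5, 5))

# H is upper Hessenberg, Q is unitary, and A = Q H Q^*
H, Q = hessenberg(A, calc_q=True)

assert np.allclose(A, Q @ H @ Q.conj().T)   # the factorization (1) holds
assert np.allclose(np.tril(H, -2), 0)       # zeros below the first subdiagonal
```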

A Hessenberg matrix is an upper triangular matrix plus one nonzero subdiagonal, i.e.

$$ H = \begin{bmatrix} 1 & 2 & 3 & 4\\ 1 & 2 & 3 & 4 \\ 0 & 3 & 4 & 9\\ 0 & 0 & 4 & 9\end{bmatrix} \tag{2}$$

is an example. From the Arnoldi process we form an $(n+1) \times n$ matrix $\hat{H}_{n}$, the leading $(n+1) \times n$ block of such a Hessenberg matrix. We get

$$ A Q_{n} = Q_{n+1} \hat{H}_{n} \tag{3}$$

which we can write as

$$ Aq_{n} = h_{1n}q_{1} + \cdots + h_{nn}q_{n} + h_{n+1,n}q_{n+1} \tag{4} $$

Now the Krylov subspaces are formed when you do the following

$$ \mathcal{K}_{n} = \langle b , Ab ,\cdots, A^{n-1}b \rangle = \langle q_{1} , q_{2} ,\cdots ,q_{n} \rangle \tag{5}$$

Equivalently, the Arnoldi process computes a reduced $QR$ factorization of the Krylov matrix $K_{n} = [\, b \;\; Ab \;\; \cdots \;\; A^{n-1}b \,]$ whose columns span $\mathcal{K}_{n}$:

$$ K_{n} = Q_{n}R_{n} \tag{6} $$

An effective way to look at this is as the computation of projections onto successive Krylov subspaces using the $QR$ decomposition:

$$ H_{n} = Q_{n}^{*} A Q_{n} \tag{7} $$

The matrix $H_{n}$ can be interpreted as the representation, in the basis $\{q_{1}, q_{2}, \cdots , q_{n}\}$, of the orthogonal projection of $A$ onto $\mathcal{K}_{n}$. This is mainly useful when the matrix (or operator) $A$ is very large but well approximated by something of low rank: the whole procedure is nothing more than an iterative low-rank approximation of $A$.
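Equations (6) and (7) can be checked directly in a few lines of NumPy. (Caveat: explicitly forming the Krylov matrix like this is numerically poor, since its columns become nearly parallel; that is precisely why Arnoldi builds $Q_{n}$ one orthonormal column at a time. The diagonal test matrix is made up for illustration.)

```python
import numpy as np

rng = np.random.default_rng(1)
m, n = 200, 20
A = np.diag(np.linspace(0.0, 1.0, m))
A[0, 0] = 2.0                          # one well-separated dominant eigenvalue
b = rng.standard_normal(m)

# Krylov matrix [b, Ab, ..., A^{n-1} b] and its QR factorization, eq. (6)
K = np.column_stack([np.linalg.matrix_power(A, j) @ b for j in range(n)])
Q, R = np.linalg.qr(K)

# Projection onto the Krylov subspace, eq. (7); its eigenvalues
# (the Ritz values) approximate the extremal eigenvalues of A
H = Q.T @ A @ Q
ritz = np.linalg.eigvals(H)
print(np.max(np.abs(ritz)))            # close to the dominant eigenvalue 2.0
```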

This doesn't exactly answer your question, but look at the Arnoldi iteration itself. (figure: pseudocode of the Arnoldi iteration)

It works like this.

  • Create a normally distributed random vector, i.e. $ b \sim \mathcal{N}(0,1)$.
  • Normalize it: $ q_{1} = \frac{b}{\| b\|} $.
  • Apply the matrix: $v = Aq_{n}$.
  • Create a Hessenberg entry through a dot product: $h_{jn} = q_{j}^{*}v$.
  • Subtract off the projection: $v = v - h_{jn}q_{j}$ (for $j = 1, \dots, n$).
  • Normalize the remainder: $h_{n+1,n} = \|v\|$ and $q_{n+1} = v / h_{n+1,n}$; these last two steps are exactly as in the $QR$ decomposition.
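The steps above can be sketched as a minimal NumPy implementation (assuming $A$ fits in memory; `arnoldi` is an illustrative helper name):

```python
import numpy as np

def arnoldi(A, n, rng=None):
    """Arnoldi iteration: returns Q (m x (n+1)) with orthonormal columns
    and the (n+1) x n Hessenberg matrix H with A @ Q[:, :n] = Q @ H."""
    if rng is None:
        rng = np.random.default_rng(0)
    m = A.shape[0]
    b = rng.standard_normal(m)            # random start vector b ~ N(0, 1)
    Q = np.zeros((m, n + 1))
    H = np.zeros((n + 1, n))
    Q[:, 0] = b / np.linalg.norm(b)       # q_1 = b / ||b||
    for k in range(n):
        v = A @ Q[:, k]                   # apply the matrix: v = A q_k
        for j in range(k + 1):
            H[j, k] = Q[:, j] @ v         # Hessenberg entry h_{jk} = q_j^* v
            v -= H[j, k] * Q[:, j]        # subtract off the projection
        H[k + 1, k] = np.linalg.norm(v)   # normalize the remainder ...
        Q[:, k + 1] = v / H[k + 1, k]     # ... to get the next basis vector
    return Q, H

A = np.random.default_rng(1).standard_normal((6, 6))
Q, H = arnoldi(A, 3)
print(np.allclose(A @ Q[:, :3], Q @ H))   # the Arnoldi relation, eq. (3)
```

The eigenvalues of the square part `H[:n, :n]` (the Ritz values) then approximate the extremal eigenvalues of $A$.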

You have a good general answer by Rhowe, but please bear in mind that the various categories can be transformed into each other. For example, we can make the left-hand side symmetric by solving the normal equations ${\bf A}^*{\bf Ax=A}^*{\bf b}$ instead of $\bf Ax=b$, so a non-symmetric system becomes solvable by CG (conjugate gradients). It is not always a good idea, since it squares the condition number and depends on the matrix structure, but it is always possible.
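As a sketch of that remark (assuming SciPy; the deliberately well-conditioned test matrix is made up for illustration):

```python
import numpy as np
from scipy.sparse.linalg import cg

rng = np.random.default_rng(0)
# a well-conditioned, non-symmetric test matrix: U diag(1..2) V^T
U, _ = np.linalg.qr(rng.standard_normal((30, 30)))
V, _ = np.linalg.qr(rng.standard_normal((30, 30)))
A = U @ np.diag(np.linspace(1.0, 2.0, 30)) @ V.T
b = rng.standard_normal(30)

# CG needs a symmetric positive definite system, so solve the normal
# equations A^* A x = A^* b instead of A x = b
x, info = cg(A.T @ A, A.T @ b)
print(info, np.linalg.norm(A @ x - b))   # info == 0 means CG converged
```

For genuinely ill-conditioned systems, squaring the condition number like this hurts, and methods such as GMRES or LSQR are usually preferred.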


That being said, I think for pedagogical purposes it can be a good idea to visit the power method, as it in practice explores the paths which vectors take when thrown into a sequence of Krylov subspaces.

The Krylov subspaces are spanned by $\{{\bf v,Av,A}^2{\bf v},\cdots\}$, i.e. in essence by the powers of the matrix applied to a vector of interest ($\bf v$). This is exactly what the power method does: it explores this subspace by calculating ${\bf A}^{k+1}{\bf v}$ and ${\bf A}^{k}{\bf v}$ for increasing $k$. Since the component of $\bf v$ along each eigenvector is scaled by $\lambda^{k}$ after $k$ applications of the matrix, the component belonging to the eigenvalue of largest magnitude grows fastest (exponentially). So after a few iterations (relatively small $k$) we have an approximation of the largest eigenvalue.

Then we start over with a new $\bf v$: we first project away the eigenvectors already found, so that the new exploration of the Krylov space cannot follow the same path as the one that led to the previous eigenvalue. We then get an approximation of the second-largest eigenvalue, and so on...
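A NumPy sketch of this idea (the deflation here subtracts the found eigenpair from the matrix, a variant that for symmetric $A$ is equivalent to projecting it out of the start vector; `power_method` is an illustrative name):

```python
import numpy as np

def power_method(A, n_iter=200, rng=None):
    """Power method sketch: returns (dominant eigenvalue, eigenvector)."""
    if rng is None:
        rng = np.random.default_rng(0)
    v = rng.standard_normal(A.shape[0])
    v /= np.linalg.norm(v)
    for _ in range(n_iter):
        w = A @ v                    # step along the Krylov sequence
        v = w / np.linalg.norm(w)
    lam = v @ A @ v                  # Rayleigh-quotient eigenvalue estimate
    return lam, v

# symmetric test matrix with known eigenvalues 4, 2, 1
rng = np.random.default_rng(1)
Q, _ = np.linalg.qr(rng.standard_normal((3, 3)))
A = Q @ np.diag([4.0, 2.0, 1.0]) @ Q.T

lam1, v1 = power_method(A)                  # largest eigenvalue, ~4.0
A_defl = A - lam1 * np.outer(v1, v1)        # deflate the found eigenpair
lam2, _ = power_method(A_defl)              # second largest, ~2.0
print(lam1, lam2)
```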


Another way to view this is the following: unless $\bf A$ is degenerate (has some eigenvalue $0$) or nilpotent, the eigenvalue equation can be rewritten as the equivalence

$${\bf Ax}=\lambda {\bf x} \Leftrightarrow {\bf A}^{k+1}{\bf x} = \lambda {\bf A}^k {\bf x}$$

The second equation here is basically what the power method investigates.
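A quick numerical check of this equivalence for a concrete eigenpair (NumPy sketch):

```python
import numpy as np

A = np.array([[2.0, 1.0],
              [1.0, 2.0]])                  # eigenvalues 1 and 3
lam, V = np.linalg.eigh(A)                  # ascending: lam = [1., 3.]
x = V[:, 1]                                 # eigenvector for lambda = 3
k = 4

# A x = lambda x  implies  A^{k+1} x = lambda * A^k x
lhs = np.linalg.matrix_power(A, k + 1) @ x
rhs = lam[1] * np.linalg.matrix_power(A, k) @ x
print(np.allclose(lhs, rhs))
```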