Intuition behind this construction regarding eigenvalues?


In "Methods of Nonlinear Analysis: Applications to Differential Equations" (P. Drabek, J. Milota), they present the following construction:

Let $A\in L(X)$, and choose $\lambda\in\sigma(A)$; then set $N_k = \text{ker}(\lambda I-A)^k$. It is obvious that $N_k\subset N_{k+1}$, and they cannot all be distinct. If $N_k = N_{k+1}$, then $N_k = N_i$ for all $i>k$. Denote by $n(\lambda)$ the least such $k$ and set $$ N(\lambda) = N_{n(\lambda)},\qquad R(\lambda)=\text{Im}(\lambda I - A)^{n(\lambda)} $$ Then both $N(\lambda)$ and $R(\lambda)$ are $A$-invariant and the decomposition $X = N(\lambda)\oplus R(\lambda)$ holds.

Here $X$ is a vector space, $L(X)$ is the set of linear transformations on $X$ to itself, $\sigma(A)$ is the spectrum of $A$ (the set of eigenvalues of $A$), and an $A$-invariant subspace $S$ of $X$ is one such that $A(S)\subset S$.

What is the intuition behind this construction, if any? I thought at first maybe $n(\lambda)$ corresponded to the multiplicity of the eigenvalue $\lambda$, but really it appears the multiplicity of $\lambda$ only corresponds to the dimension of $\text{ker}(\lambda I - A)$, and has nothing really to do with the $k$th power of $\lambda I - A$. How did they come up with this construction, and what is the intuitive idea to $n(\lambda)$, $N(\lambda)$, and $R(\lambda)$?

Edit: Reading on in the text, the $n(\lambda)$ is the multiplicity of $\lambda$, meaning my interpretation of multiplicity was completely wrong. I thought the multiplicity of an eigenvalue was the number of linearly independent eigenvectors corresponding to that eigenvalue, but apparently not. Instead, it appears to be only the exponent of $(t-\lambda)$ in the characteristic polynomial for $A$. Then my confusion lies in my understanding of multiplicity.
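The stabilisation index can also be checked numerically: since $\dim N_k = \dim X - \text{rank}(\lambda I - A)^k$, $n(\lambda)$ is the first $k$ at which the rank of the powers of $\lambda I - A$ stops dropping. A minimal pure-Python sketch (the helper names `rank`, `matmul`, `n_lambda` and the sample matrix are mine, not from the text):

```python
from fractions import Fraction

def rank(M):
    """Rank of a rational matrix, by Gaussian elimination."""
    M = [[Fraction(x) for x in row] for row in M]
    rows, cols, r = len(M), len(M[0]), 0
    for c in range(cols):
        pivot = next((i for i in range(r, rows) if M[i][c] != 0), None)
        if pivot is None:
            continue
        M[r], M[pivot] = M[pivot], M[r]
        for i in range(rows):
            if i != r and M[i][c] != 0:
                f = M[i][c] / M[r][c]
                M[i] = [a - f * b for a, b in zip(M[i], M[r])]
        r += 1
    return r

def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)]
            for row in A]

def n_lambda(A, lam):
    """Least k with ker(lam*I - A)^k = ker(lam*I - A)^(k+1).

    Since dim ker = n - rank, we watch for the rank of the
    successive powers to stabilise."""
    n = len(A)
    B = [[(lam if i == j else 0) - A[i][j] for j in range(n)]
         for i in range(n)]
    P = [[1 if i == j else 0 for j in range(n)] for i in range(n)]  # B^0 = I
    prev = rank(P)
    for k in range(1, n + 2):
        P = matmul(P, B)
        r = rank(P)
        if r == prev:          # N_{k-1} = N_k, so n(lambda) = k - 1
            return k - 1
        prev = r

# A 2x2 Jordan block plus a 1x1 block, all at eigenvalue 3
A = [[3, 1, 0],
     [0, 3, 0],
     [0, 0, 3]]
print(n_lambda(A, 3))  # 2, even though the algebraic multiplicity of 3 is 3
```

Note that for the diagonal matrix $\text{diag}(3,3)$ this returns $1$, so $n(\lambda)$ is in general smaller than the algebraic multiplicity.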

Best answer:

I wanted to type a really long answer, but it's getting late here in Europe and I'm falling asleep, so I'm just giving you a pointer to where to find more information.

First, take the theorem for granted. Now let $\mu$ be a different eigenvalue of $A$. It is easy to see that $R(\lambda)$ is mapped to itself by all powers of $A - \mu I$, so we can apply the theorem again with $X$ replaced by $R(\lambda)$. Doing this over and over, we end up with a decomposition

$$X = N(\lambda) \oplus N(\mu) \oplus N(\nu) \oplus \cdots = \bigoplus_{\lambda' \in \sigma(A)} N(\lambda')$$

The case where $X$ is finite dimensional is especially pleasant to work with, because then $\sigma(A)$ is finite and we have no trouble interpreting the big direct sum symbol.

The space $N(\lambda)$ is called the **generalized eigenspace** at the eigenvalue $\lambda$. What is important to realize is that each of the generalized eigenspaces is mapped into itself by $A$, so we can study $A$ by studying its action on each of these spaces separately.

The decomposition of $X$ into generalized eigenspaces is called the **Jordan decomposition** (at least in the finite-dimensional case). If you pick a basis of $X$ in which each basis vector lies in a generalized eigenspace (and successive basis elements lie in the same eigenspace), then the matrix representing $A$ becomes a block matrix in which only the 'diagonal' blocks are non-zero. (This is a restatement of the fact that $A$ maps the generalized eigenspaces to themselves.) If you then pick the basis so that these diagonal blocks look really nice, you get the **Jordan Canonical Form** of $A$.

I guess that if you google any of the bold terms you will find some explanation of how people came up with this and what the intuition is. If that fails, I will maybe write something about it myself tomorrow.

EDITED IN AFTER READING YOUR EDIT: two more terms to google: **geometric multiplicity** and **algebraic multiplicity**. The first is what you thought multiplicity was, and the second is what you now found it to be. The fact that people invented words to make discussing the difference easier indicates that your idea was not 'completely wrong', just incomplete.

Second answer:

This is maybe best understood by seeing examples. The simplest cases are the nilpotent matrices, whose sole eigenvalue is zero, with algebraic multiplicity equal to the size of the matrix.

$$ A = \begin{pmatrix} 0 & 0 \\ 0 & 0 \end{pmatrix} \qquad n(0) = 1$$

$$ B = \begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix} \qquad n(0) = 2$$

$$ C = \begin{pmatrix} 0 & 1 & 0 \\ 0 & 0 & 1 \\ 0 & 0 & 0 \end{pmatrix} \qquad n(0) = 3$$

$$ D = \begin{pmatrix} 0 & 1 & 0 \\ 0 & 0 & 0 \\ 0 & 0 & 0 \end{pmatrix} \qquad n(0) = 2$$

$$ E = \begin{pmatrix} 0 & 0 & 0 \\ 0 & 0 & 0 \\ 0 & 0 & 0 \end{pmatrix} \qquad n(0) = 1$$

In all cases, $n(0)$ is the nilpotency index: the smallest exponent to which the matrix must be raised to give zero. E.g. $C^2 \neq 0$ but $C^3 = 0$.
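This is easy to check by machine; the claim is just that $n(0)$ equals the nilpotency index. A pure-Python sketch (matrices as lists of lists; the helper names are mine):

```python
def matmul(A, B):
    """Matrix product of two square matrices given as lists of lists."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)]
            for row in A]

def nilpotency_index(M):
    """Smallest k >= 1 with M^k = 0; assumes M is nilpotent."""
    P = M
    for k in range(1, len(M) + 1):   # a nilpotent n x n matrix has index <= n
        if all(x == 0 for row in P for x in row):
            return k
        P = matmul(P, M)

C = [[0, 1, 0],
     [0, 0, 1],
     [0, 0, 0]]
D = [[0, 1, 0],
     [0, 0, 0],
     [0, 0, 0]]
print(nilpotency_index(C))  # 3
print(nilpotency_index(D))  # 2
```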

In each example, every vector is a (generalized) nullvector, since repeatedly applying the matrix to any vector will eventually produce the zero vector.

However, there is more refined information that is useful to know: for each vector we can ask how many times we need to apply the matrix to produce zero. The $N_k(0)$ are precisely that filtration.

$D$ is probably the most interesting example. The filtration on its generalized nullspace is

$$ N_0(0) = \begin{pmatrix} 0 \\ 0 \\ 0 \end{pmatrix} \qquad N_1(0) = \begin{pmatrix} * \\ 0 \\ * \end{pmatrix} \qquad N(0) = N_2(0) = \begin{pmatrix} * \\ * \\ * \end{pmatrix} $$

Then, we can understand the action of $D$ as progressively advancing a generalized nullvector through these subspaces: applying $D$ to an element of $N_2(0)$ returns an element of $N_1(0)$. Applying $D$ to an element of $N_1(0)$ returns an element of $N_0(0)$ (i.e. zero).
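One can watch this advancement directly. A pure-Python sketch (`apply` is just matrix-vector multiplication; the names are mine):

```python
def apply(M, v):
    """Matrix-vector product M @ v."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]

D = [[0, 1, 0],
     [0, 0, 0],
     [0, 0, 0]]

v = [1, 2, 3]        # a generic vector: lies in N_2(0) but not in N_1(0)
v1 = apply(D, v)     # [2, 0, 0] -- killed by one more application of D,
                     # so v1 lies in N_1(0) = ker D
v2 = apply(D, v1)    # [0, 0, 0] -- in N_0(0)
print(v1, v2)
```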


Incidentally, you've made an error in your question — $n(\lambda)$ is the exponent on $(t - \lambda)$ in the minimal polynomial of the matrix. That need not be the same as the exponent in the characteristic polynomial.

For example, in all of the $3 \times 3$ examples above, the characteristic polynomials are all $t^3$. However, the minimal polynomials are

$$ m_C(t) = t^3 \qquad m_D(t) = t^2 \qquad m_E(t) = t $$