What is wrong with my derivation of Jordan normal form?


EDIT: A key error in my working is that it allows $M$ to be singular! I tried to solve $AM=MJ$, not realising that $M^{-1}AM=J$ requires $M^{-1}$ to exist, i.e. the columns of $M$ must be linearly independent!

OP: Reading Wikipedia, I saw examples of JNF but no clear explanation of why it works, so I attempted to show it to myself. I thought I had proven it, until I read this post on MSE, which claimed that the JNF of a matrix has a very specific block count and block sizes: the number of blocks with a given eigenvalue $\lambda$ must be the nullity of $A-\lambda I$. This clashes entirely with my "understanding" of why JNF works.

$A\in M_{n\times n}$ can be written in JNF by $M^{-1}AM=J\implies AM=MJ$, where $M$ is the "modal" matrix, and $J$ is the JNF of $A$. Wikipedia's example constructed the JNF by calculating the eigenvectors and generalised eigenvectors, creating Jordan chains of such vectors, and grouping the chains in correspondence with the Jordan blocks. My erroneous "proof" went as follows:

For convenience, let the matrix $B=A-\lambda I$. Let the Jordan chain of some such eigenvalue $\lambda$, for which the generalised eigenvector is of rank $m$, be written $\mathfrak{Ch}(\lambda)=\{x_m,x_{m-1},\dots,x_1\}$, and let $\mathfrak{Ch}_i(\lambda)=x_i$ in the chain, where all the $x_i$ are related by $B^mx_m=0;x_1=B^{m-1}x_m\neq0;x_{i}=Bx_{i+1}$. $J=\bigoplus_{i=1}^n J_k(\lambda_i)$, where $\lambda_i$ is the relevant eigenvalue depending on your chosen order of constructing $J$, and $k$ is the size of the relevant Jordan block, depending on $i$. $M$ is constructed (I thought) by using, as its column vectors, any series of vectors from $\mathfrak{Ch}(\lambda_i)$, such that the eigenvalue $\lambda_i$ in that section of $M$ corresponds with the Jordan block in $J$.

The product $MJ$ is expressible in the form of a sequence of column vectors $c_i$. Letting the Jordan block corresponding to that column $i$ have a relative index $k$ to represent the $k$th column of the Jordan block and also to represent the $k$th column in the series of chain-vectors in $M$ (as they must correspond - I think!):

Since the Jordan block at some column $k$ has the form $\begin{pmatrix}\lambda\\0\\0\\\vdots\end{pmatrix}$, $\begin{pmatrix}1\\\lambda\\0\\\vdots\end{pmatrix}$ or $\begin{pmatrix}\vdots\\0\\1\\\lambda\\\vdots\end{pmatrix}$, the product for $c_i$ evaluates to $\mathfrak{Ch}_{k-1}(\lambda)+\lambda\cdot\mathfrak{Ch}_k(\lambda)$, defining "$\mathfrak{Ch}_0(\lambda)$" to be zero to account for the case $k=1$, where there is no trailing one in the Jordan block to catch the previous column of $M$ in the product. By the chain identities above, $\mathfrak{Ch}_{k-1}(\lambda)+\lambda\cdot\mathfrak{Ch}_k(\lambda)=B\times\mathfrak{Ch}_k(\lambda)+\lambda I\times\mathfrak{Ch}_k(\lambda)=(B+\lambda I)\times\mathfrak{Ch}_k(\lambda)=A\times\mathfrak{Ch}_k(\lambda)$, which I believe is precisely the column $i$ of $AM$, therefore asserting $MJ=AM$ to be true when (and I'd have thought only when) $M, J$ are constructed as I've described.
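To see concretely where this goes wrong (the point of the EDIT above): solving $AM=MJ$ alone does not force $M$ to be invertible. A minimal sketch in Python/sympy, using a made-up $2\times 2$ example of my own: a wrong guess for $J$ still satisfies $AM=MJ$, but only with a singular $M$.

```python
from sympy import Matrix, eye

# Hypothetical example (not from the post): A is a single Jordan block J_2(2).
A = Matrix([[2, 1],
            [0, 2]])

# Wrongly guess a diagonal "Jordan form" J = 2I, and stack the (repeated)
# eigenvector (1, 0) into M.  The equation AM = MJ still holds...
J = 2 * eye(2)
M = Matrix([[1, 1],
            [0, 0]])

print(A * M == M * J)   # True: AM = MJ is satisfied
print(M.det())          # 0: but M is singular, so M^{-1}AM = J is meaningless
```

So $AM=MJ$ is a necessary condition, not a sufficient one; the real content of the theorem is that an *invertible* $M$ exists, and only for the correct block structure.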

Evidently (since my method comprehensively failed to solve the question that I linked, and clashes with what the answerer said) my idea of why JNF works and how we construct it is wrong - but why?

Many thanks!


2 Answers

BEST ANSWER

Your argument seems to go wrong as soon as you define $\mathfrak{Ch}(\lambda)$ to mean "the Jordan chain of some eigenvalue $\lambda$". If the Jordan form has more than one block with the same eigenvalue, it will be because there are several independent Jordan chains for the same $\lambda$ -- in other words, a single chain might not be enough to span the entire generalized eigenspace.


If you want to think of just one chain, it needs to be a chain of subspaces, rather than single vectors: $$ V_{n+1} = V_n \supseteq V_{n-1} \supseteq \cdots \supseteq V_1 \supseteq V_0 = \{0\} $$ where each $V_k$ is the preimage of $V_{k-1}$ under $A-\lambda I$. Or, in other words, $V_k$ is the null space of $(A-\lambda I)^k$.
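As a quick illustration (my own sample $3\times 3$ matrix with the single eigenvalue $2$), the subspaces $V_k$ can be computed directly as null spaces of powers of $A-\lambda I$:

```python
from sympy import Matrix, eye

# A minimal sketch: V_k is the null space of (A - 2I)^k, giving the
# increasing chain V_1 ⊆ V_2 ⊆ V_3 (here V_3 is the whole space).
A = Matrix([[2, 7, 9],
            [0, 2, 8],
            [0, 0, 2]])
B = A - 2 * eye(3)

dims = [len((B**k).nullspace()) for k in range(1, 4)]
print(dims)  # [1, 2, 3]: dim V_1 = 1, dim V_2 = 2, dim V_3 = 3
```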

If we just choose a basis for $V_1$ and extend it successively to $V_2, V_3, \ldots V_n$, the matrix for $A$ (assuming for simplicity that $\lambda$ is its only eigenvalue) under the resulting basis will be something you might call a "generalized Jordan block"

$$\begin{pmatrix} \lambda I_{d_1\times d_1} & F_1 & 0 & \cdots & 0 \\ 0 & \lambda I_{d_2\times d_2} & F_2 & \cdots & 0 \\ \vdots & \vdots & \ddots & \ddots & \vdots \\ 0 & 0 & \cdots & \lambda I_{d_{n-1}\times d_{n-1}} & F_{n-1} \\ 0 & 0 & \cdots & 0 & \lambda I_{d_n\times d_n} \end{pmatrix} $$ where $d_k = \dim V_k/V_{k-1}$.

This is not quite nice -- we have $\lambda$s on the diagonal like we want, but the mediating blocks $F_k$ can be arbitrary (almost: it turns out each of them has full rank), and they're not even necessarily square. By constructing our basis with a bit more care, we can achieve a nicer form, namely where each $F_k$ is an identity block, possibly extended with some zero rows on the bottom. Then a suitable permutation of the basis will produce a proper Jordan form consisting of $d_1$ Jordan blocks.

First set $\mathfrak B_{n+1}=\varnothing$. It will be an invariant that $\mathfrak B_k$ is a "basis for $V_k/V_{k-1}$", by which slight abuse of terminology I mean that $\mathfrak B_k$ together with any basis for $V_{k-1}$ will constitute a basis for $V_k$. (More precisely, $\mathfrak B_k$ consists of representatives of the cosets chosen as basis vectors for the quotient space $V_k/V_{k-1}$). In particular, $\mathfrak B_k \subseteq V_k \setminus V_{k-1}$.

Now, to construct $\mathfrak B_k$ for $k=n, n-1, n-2, \ldots, 1$ (in that order), start by considering the vectors $$ \mathfrak B'_{k+1} = (A-\lambda I)\mathfrak B_{k+1} = \{ (A-\lambda I)v \mid v\in \mathfrak B_{k+1} \} \subseteq V_k. $$ No nontrivial linear combination of these vectors can be in $V_{k-1}$, because the corresponding combination of vectors in $\mathfrak B_{k+1}$ would then be in $V_{k}$, which contradicts $\mathfrak B_{k+1}$ being a basis for $V_{k+1}/V_k$. In particular, this means that $\mathfrak B'_{k+1}$ is linearly independent, and it remains linearly independent together with a basis for $V_{k-1}$. So we can extend this set, together with a temporary basis for $V_{k-1}$, to a basis for $V_k$, and then forget the temporary basis for $V_{k-1}$. Let $\mathfrak B_k$ be the resulting set.

At the end of this, $\mathfrak B_1 \cup \cdots \cup \mathfrak B_n$ is a basis for the generalized eigenspace. And more: Whenever $|\mathfrak B_k|>|\mathfrak B_{k+1}|$ -- that is, when we needed to add additional vectors to $\mathfrak B'_{k+1}$ -- each of those additional vectors is the beginning of a Jordan chain whose elements are in $\mathfrak B_k, \mathfrak B_{k-1}, \ldots, \mathfrak B_1$, and this chain produces exactly a Jordan block of its own.
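The construction above can be sketched in code. This is a rough illustration using sympy, assuming a single eigenvalue $\lambda$; the function name and the greedy rank-based extension step are my own choices, not taken from the answer.

```python
from sympy import Matrix, eye

def rank_of(vecs):
    # Rank of the matrix whose columns are the given vectors.
    return Matrix.hstack(*vecs).rank() if vecs else 0

def jordan_basis_sets(A, lam):
    """Sketch of the construction: returns [B_1, ..., B_n], where B_k is a
    'basis for V_k/V_{k-1}' (single eigenvalue lam assumed)."""
    n = A.shape[0]
    B = A - lam * eye(n)
    # V[k] is a basis of V_k = null space of B^k; V[0] spans {0}.
    V = [[]] + [(B**k).nullspace() for k in range(1, n + 1)]

    sets = [None] * (n + 2)
    sets[n + 1] = []                       # start with B_{n+1} empty
    for k in range(n, 0, -1):
        # Start from the images B'_{k+1} of the previous level's set ...
        Bk = [B * v for v in sets[k + 1]]
        # ... and extend, together with V_{k-1}, to a basis of V_k,
        # keeping only the vectors outside V_{k-1}.
        for v in V[k]:
            if rank_of(V[k - 1] + Bk + [v]) > rank_of(V[k - 1] + Bk):
                Bk.append(v)
        sets[k] = Bk
    return sets[1:n + 1]

# Demo on a sample matrix with the single eigenvalue 2:
A = Matrix([[2, 7, 9],
            [0, 2, 8],
            [0, 0, 2]])
sets = jordan_basis_sets(A, 2)
print([len(s) for s in sets])  # [1, 1, 1]: one chain, hence one 3x3 block
```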

Note that the number and sizes of Jordan blocks we get this way depend only on the dimensions of the subspaces $V_k$, which had a completely basis-free definition. Namely, there are $\dim V_k/V_{k-1} - \dim V_{k+1}/V_{k}$ Jordan blocks of size $k$. It is simple to verify that this formula holds in particular if $A$ was already in Jordan form. So it is true not only for the Jordan form we constructed by the above, but for every Jordan form of a matrix.

The total number of blocks of any size (for the eigenvalue $\lambda$) is $$ \sum_{k=1}^n \bigl[\dim V_{k}/V_{k-1} - \dim V_{k+1}/V_{k}\bigr] = \dim V_1 / V_0 - \dim V_{n+1}/V_n = \dim V_1 $$ that is, just the nullity of $A-\lambda I$.
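The counting formulas can be checked numerically. In this sketch (a made-up $4\times 4$ matrix built as $J_2(5)\oplus J_1(5)\oplus J_1(5)$), the formula recovers one block of size 2 and two of size 1, and the total equals the nullity of $A-\lambda I$:

```python
from sympy import Matrix, eye

# Hypothetical example: J_2(5) ⊕ J_1(5) ⊕ J_1(5), so for λ = 5 we expect
# nullity(A - 5I) = 3 blocks in total: one of size 2 and two of size 1.
A = Matrix([[5, 1, 0, 0],
            [0, 5, 0, 0],
            [0, 0, 5, 0],
            [0, 0, 0, 5]])
lam, n = 5, 4
B = A - lam * eye(n)

# d[k] = dim V_k = nullity of B^k, with d[0] = 0 and V_{n+1} = V_n.
d = [0] + [n - (B**k).rank() for k in range(1, n + 1)] + [n - (B**n).rank()]

# Blocks of size k: (dim V_k/V_{k-1}) - (dim V_{k+1}/V_k)
blocks = {k: (d[k] - d[k-1]) - (d[k+1] - d[k]) for k in range(1, n + 1)}
print(blocks)                       # {1: 2, 2: 1, 3: 0, 4: 0}
print(sum(blocks.values()), d[1])   # total blocks = nullity of B = 3
```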

SECOND ANSWER

I suspect you have not worked through many matrices where the Jordan form is not diagonal. As with continued fractions, the development is alphabet soup; better to do examples.

These are from a book by Evar D. Nering, second edition. I will put both examples here; check for typing errors. The eigenvalues are integers.

I made up this one:

$$ C_{jagy} = \left( \begin{array}{ccc} 2&7&9 \\ 0&2&8 \\ 0&0&2 \\ \end{array} \right) $$

When everything is an integer, I like a sort of backwards order: the only denominator appears in finding the inverse of a square matrix. Here, $(C-2I)^3 = 0.$ To get the 1's above the diagonal, find a column $p_3$ with $(C-2I)^2 p_3 \neq 0.$ Then take column $p_2 = (C-2I)p_3.$ Finally, $p_1 = (C-2I)p_2$ is a genuine eigenvector. Make the matrix $P$ from the columns $p_1, p_2, p_3.$ The determinant of your choice of $P$ is the denominator in finding $P^{-1}.$ Finally, $J=P^{-1}C P$ is in Jordan form.
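Here is the recipe carried out on $C_{jagy}$ with sympy (exact arithmetic; the choice $p_3 = e_3$ is one valid option among several):

```python
from sympy import Matrix, eye

C = Matrix([[2, 7, 9],
            [0, 2, 8],
            [0, 0, 2]])
B = C - 2 * eye(3)

p3 = Matrix([0, 0, 1])   # one choice with (C-2I)^2 p3 != 0
assert (B**2) * p3 != Matrix([0, 0, 0])
p2 = B * p3              # -> (9, 8, 0)
p1 = B * p2              # -> (56, 0, 0), a genuine eigenvector

P = Matrix.hstack(p1, p2, p3)
print(P.det())           # 448: the only denominator needed for P^{-1}
print(P.inv() * C * P)   # the Jordan block J_3(2)
```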

$$ \bigcirc \bigcirc \bigcirc \bigcirc \bigcirc \bigcirc \bigcirc \bigcirc \bigcirc \bigcirc \bigcirc \bigcirc \bigcirc \bigcirc \bigcirc \bigcirc \bigcirc \bigcirc \bigcirc \bigcirc \bigcirc \bigcirc \bigcirc \bigcirc $$

$$ A_{nering} = \left( \begin{array}{ccccc} 1&0&-1&1&0 \\ -4&1&-3&2&1 \\ -2&-1&0&1&1 \\ -3&-1&-3&4&1 \\ -8&-2&-7&5&4 \\ \end{array} \right) $$


$$ B_{nering} = \left( \begin{array}{ccccc} 5&-1&-3&2&-5 \\ 0&2&0&0&0 \\ 1&0&1&1&-2 \\ 0&-1&0&3&1 \\ 1&-1&-1&1&1 \\ \end{array} \right) $$
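For the Nering examples, sympy can do the grinding: `jordan_form()` returns $P$ and $J$ with $A = PJP^{-1}$, in exact rational arithmetic. A sketch for $A_{nering}$ (the same call works for $B_{nering}$):

```python
from sympy import Matrix

A = Matrix([[ 1,  0, -1, 1, 0],
            [-4,  1, -3, 2, 1],
            [-2, -1,  0, 1, 1],
            [-3, -1, -3, 4, 1],
            [-8, -2, -7, 5, 4]])

P, J = A.jordan_form()
print(J)                      # the Jordan form (integer eigenvalues)
print(A == P * J * P.inv())   # True: A = P J P^{-1}
```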