On the Cayley-Hamilton theorem


1.3k Views Asked by user403337 At 11 May 2026 - 12:14

One of the nicest theorems in linear algebra is the one that a matrix satisfies its own characteristic polynomial, the so-called Cayley-Hamilton theorem.

What is a good way to prove it? In particular, does the following elegant proof go through?

I am hopeful that it is quite trivial. Namely, since the characteristic polynomial is $\det(A-\lambda I)$, if we plug in $A$ for $\lambda$, we of course get $\det 0=0$.

If so, this seems like one of the easiest times a couple of mathematicians got away with a major theorem.

To be precise, is there any problem with replacing $\lambda$, which usually denotes a scalar, with the matrix $A$ in question?


Bumbble Comm On 29 May 2020 - 5:34

"The" proof of the Cayley-Hamilton Theorem involves invariant subspaces, that is, subspaces that are mapped into themselves by a linear operator. If $T$ is a linear operator on a vector space $V$, then a subspace $W\subseteq V$ is called a $T$-invariant subspace of $V$ if $T(W)\subseteq W$, i.e. if $T(v)\in W$ for every $v\in W$. Some examples of $T$-invariant subspaces you might be familiar with are $\{0\}$, $N(T)$, $R(T)$, $V$, and $E_\lambda$ for any eigenvalue $\lambda$ of $T$.

For a linear operator $T$ and any nonzero $x\in V$, the subspace $$ W=\textrm{span}(\{x,T(x),T^2(x),\dots\})$$ is called the $T$-cyclic subspace of $V$ generated by $x$, and one can show that $W$ is the smallest $T$-invariant subspace containing $x$. Cyclic subspaces can be used to establish the Cayley-Hamilton Theorem.

The existence of a $T$-invariant subspace $W$ allows us to define a new linear operator whose domain is this subspace: the restriction $T_W$ of $T$ to $W$ is a linear operator from $W$ to $W$. When $V$ is finite-dimensional, the two operators are linked in the sense that the characteristic polynomial of $T_W$ divides the characteristic polynomial of $T$. To show this, choose your favorite ordered basis for $W$ and extend it to an ordered basis for $V$. With respect to this basis the matrix of $T$ is block upper triangular, with the matrix of $T_W$ as its upper-left block, so the characteristic polynomial of $T$ factors as the characteristic polynomial of $T_W$ times another polynomial.

The last tool we need is a way to gain information about the characteristic polynomial of $T$ from that of $T_W$. Cyclic subspaces are useful here because the characteristic polynomial of the restriction of $T$ to a cyclic subspace can be computed explicitly. Specifically, let $T$ be a linear operator on a finite-dimensional vector space $V$, let $W$ be the $T$-cyclic subspace of $V$ generated by a nonzero $v\in V$, and let $k=\textrm{dim}(W)$. Then:

(a) $\{v,T(v),T^2(v),\dots,T^{k-1}(v)\}$ is a basis for $W$.
(b) If $a_0v+a_1T(v)+\cdots+a_{k-1}T^{k-1}(v)+T^k(v)=0$, then the characteristic polynomial of $T_W$ is $f(t)=(-1)^k(a_0+a_1t+\cdots+a_{k-1}t^{k-1}+t^k)$.

I will omit the proof for the above theorem unless requested, since the main goal is the proof of the Cayley-Hamilton Theorem, which states that:

Let $T$ be a linear operator on a finite-dimensional vector space $V$, and let $f(t)$ be the characteristic polynomial of $T$. Then $f(T)=T_0$, the zero transformation. That is, $T$ "satisfies" its characteristic equation.

Proof: We show that $f(T)(v)=0$ for all $v\in V$. If $v=0$, we are done since $f(T)$ is linear, so suppose $v\neq 0$, and let $W$ be the $T$-cyclic subspace generated by $v$, with $k=\textrm{dim}(W)$. By the theorem above, there exist scalars $a_0,\dots,a_{k-1}$ such that $$a_0v+a_1T(v)+\cdots+a_{k-1}T^{k-1}(v)+T^k(v)=0 $$ and the characteristic polynomial of $T_W$ is: $$ g(t)=(-1)^k(a_0+a_1t+\cdots+a_{k-1}t^{k-1}+t^k)$$ Combining these two equations yields: $$g(T)(v)=(-1)^k(a_0I+a_1T+\cdots+a_{k-1}T^{k-1}+T^k)(v)=0 $$ Since $g(t)$ divides the characteristic polynomial $f(t)$ of $T$, there exists a polynomial $q(t)$ such that $f(t)=q(t)g(t)$, so: $$ f(T)(v)=q(T)g(T)(v)=q(T)(g(T)(v))=q(T)(0)=0$$ The Cayley-Hamilton Theorem for matrices is then a corollary of the Cayley-Hamilton Theorem stated above.
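To make the steps of the proof concrete, here is a minimal Python sketch on one example, using exact arithmetic. The matrix `A`, the vector `v`, and the helper names (`coords_in_span`, `cyclic_relation`, `apply_poly`, `poly_mul`) are illustrative assumptions and not part of the answer above. The polynomials `f` and `q` were worked out by hand for this particular diagonal matrix, using the convention $f(t)=\det(A-tI)$ that matches the sign $(-1)^k$ in the theorem.

```python
from fractions import Fraction


def mat_vec(A, v):
    """Apply the matrix A to the vector v."""
    return [sum(a * x for a, x in zip(row, v)) for row in A]


def coords_in_span(basis, target):
    """Coefficients c with sum(c[j] * basis[j]) == target, or None if
    target lies outside the span. Assumes basis is linearly independent."""
    n, m = len(target), len(basis)
    rows = [[Fraction(basis[j][i]) for j in range(m)] + [Fraction(target[i])]
            for i in range(n)]
    for c in range(m):
        # A pivot exists in rows c..n-1 because the basis is independent.
        p = next(i for i in range(c, n) if rows[i][c] != 0)
        rows[c], rows[p] = rows[p], rows[c]
        piv = rows[c][c]
        rows[c] = [x / piv for x in rows[c]]
        for i in range(n):
            if i != c:
                factor = rows[i][c]
                rows[i] = [x - factor * y for x, y in zip(rows[i], rows[c])]
    if any(rows[i][m] != 0 for i in range(m, n)):
        return None  # inconsistent system: target is not in the span
    return [rows[j][m] for j in range(m)]


def cyclic_relation(A, v):
    """Return (k, [a_0, ..., a_{k-1}]) with
    a_0 v + a_1 A v + ... + a_{k-1} A^{k-1} v + A^k v = 0."""
    basis = [list(v)]
    while True:
        nxt = mat_vec(A, basis[-1])
        c = coords_in_span(basis, nxt)
        if c is not None:
            return len(basis), [-x for x in c]
        basis.append(nxt)


def apply_poly(coeffs, A, v):
    """Evaluate (coeffs[0] I + coeffs[1] A + ...) applied to v."""
    result = [0] * len(v)
    power = list(v)
    for c in coeffs:
        result = [r + c * x for r, x in zip(result, power)]
        power = mat_vec(A, power)
    return result


def poly_mul(p, q):
    """Multiply two polynomials given as coefficient lists (low degree first)."""
    out = [0] * (len(p) + len(q) - 1)
    for i, x in enumerate(p):
        for j, y in enumerate(q):
            out[i + j] += x * y
    return out


# Example data (an assumption chosen so that W is a proper subspace).
A = [[2, 0, 0], [0, 2, 0], [0, 0, 3]]
v = [1, 0, 1]

k, a = cyclic_relation(A, v)               # dimension of W and the relation
g = [(-1) ** k * x for x in a + [1]]       # characteristic polynomial of T_W
f = [12, -16, 7, -1]                       # det(A - tI) = (2 - t)^2 (3 - t)
q = [2, -1]                                # f = q * g with q(t) = 2 - t

print(k, a)                  # dimension and coefficients of the relation
print(apply_poly(g, A, v))   # g(A) v
print(apply_poly(f, A, v))   # f(A) v
```

Here the cyclic subspace generated by `v` has dimension 2, smaller than the dimension of $V$, so the example exercises the divisibility step $f=qg$ rather than the trivial case $W=V$.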

Bumbble Comm On 17 Apr 2022 - 10:47

The cleanest proof uses the adjugate of the matrix $A$, that is, the transpose of its cofactor matrix.

See it in https://en.wikipedia.org/wiki/Cayley–Hamilton_theorem under "A direct algebraic proof".

Recall the definition of the adjugate. Replace every entry $a_{ij}$ of the $n\times n$ matrix $A$ with the determinant of the matrix obtained from $A$ by deleting row $i$ and column $j$, multiplied by the chessboard sign $(-1)^{i+j}$. Then transpose the resulting matrix to obtain the adjugate $\text{adj}(A)$. Then $A \ \text{adj}(A) = (\det A)\, I$. This is immediate from the expansion of $\det A$ along a row or a column.
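The definition can be checked directly on a small example. The following is a minimal Python sketch, assuming integer entries and using cofactor expansion (which is only practical for small matrices); the names `minor`, `det`, `adjugate`, and `matmul` and the sample matrix are illustrative choices, not taken from the answer.

```python
def minor(A, i, j):
    """Matrix A with row i and column j deleted."""
    return [row[:j] + row[j + 1:] for r, row in enumerate(A) if r != i]


def det(A):
    """Determinant by cofactor expansion along the first row."""
    if len(A) == 1:
        return A[0][0]
    return sum((-1) ** j * A[0][j] * det(minor(A, 0, j))
               for j in range(len(A)))


def adjugate(A):
    """Transpose of the matrix of signed minors (cofactors)."""
    n = len(A)
    if n == 1:
        return [[1]]
    cof = [[(-1) ** (i + j) * det(minor(A, i, j)) for j in range(n)]
           for i in range(n)]
    return [list(col) for col in zip(*cof)]  # transpose


def matmul(X, Y):
    """Product of two square matrices given as lists of rows."""
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


A = [[1, 2, 0], [0, 1, 3], [4, 0, 1]]   # sample matrix (an assumption)
d = det(A)
scaled_identity = [[d if i == j else 0 for j in range(3)] for i in range(3)]

print(matmul(A, adjugate(A)) == scaled_identity)   # A adj(A) = (det A) I
print(matmul(adjugate(A), A) == scaled_identity)   # adj(A) A = (det A) I
```

The sample matrix is deliberately not symmetric, so the check would fail if the transpose step were omitted.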

Let the characteristic polynomial be $p(t) = \det(tI-A)= \sum_{i=0}^{n} c_i t^i$, where $c_n=1$.

Take now $B = \text{adj}(tI - A)$ so $(tI - A)B = p(t) I$.

Expand $B$ in powers of $t$ as $B = \sum_{i=0}^{n-1} t^i B_i$ and compare the coefficients of $t^i$ on both sides of $(tI-A)B=p(t)I$ to get $B_{i-1}-AB_i = c_i I$ for $i=0,\dots,n$, where $B_{n}=B_{-1}=0$.

Multiply from the left by $A^i$ to get $c_i A^i = A^i B_{i-1}-A^{i+1}B_i$.

Sum over $i=0,\dots,n$. The right-hand side telescopes to $A^0B_{-1}-A^{n+1}B_n=0$, so $p(A)=\sum c_i A^i = 0$.
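The recursion $B_{i-1}=c_iI+AB_i$ can be run numerically. Below is a minimal Python sketch for a $3\times 3$ integer matrix, under the assumption that the coefficients $c_i$ are obtained from the standard $3\times 3$ formula (trace, sum of principal $2\times 2$ minors, determinant). All function names and the sample matrix are illustrative. Starting from $B_n=0$, the sketch builds $B_{n-1},\dots,B_0$ and then forms the quantity that plays the role of $B_{-1}$; the theorem is equivalent to this leftover being zero.

```python
def matmul(X, Y):
    """Product of two square matrices given as lists of rows."""
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def scalar_matrix(n, s):
    """The n x n matrix s * I."""
    return [[s if i == j else 0 for j in range(n)] for i in range(n)]


def add(X, Y):
    """Entrywise sum of two matrices."""
    return [[x + y for x, y in zip(rx, ry)] for rx, ry in zip(X, Y)]


def charpoly_3x3(A):
    """[c_0, c_1, c_2, c_3] with det(tI - A) = sum c_i t^i, for 3x3 A."""
    (a, b, c), (d, e, f), (g, h, i) = A
    trace = a + e + i
    minors = (a * e - b * d) + (a * i - c * g) + (e * i - f * h)
    determinant = a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)
    return [-determinant, minors, -trace, 1]


def adjugate_coefficients(A, c):
    """B_0..B_{n-1} from B_{i-1} = c_i I + A B_i with B_n = 0,
    together with the leftover c_0 I + A B_0 (the would-be B_{-1})."""
    n = len(A)
    B = [None] * n
    current = scalar_matrix(n, 0)                 # B_n = 0
    for i in range(n, 0, -1):
        B[i - 1] = add(scalar_matrix(n, c[i]), matmul(A, current))
        current = B[i - 1]
    leftover = add(scalar_matrix(n, c[0]), matmul(A, B[0]))
    return B, leftover


A = [[1, 2, 0], [0, 1, 3], [4, 0, 1]]             # sample matrix (an assumption)
c = charpoly_3x3(A)
B, leftover = adjugate_coefficients(A, c)

# Spot-check (tI - A) B(t) = p(t) I at one sample value of t.
t = 2
p_t = sum(ci * t ** i for i, ci in enumerate(c))
B_t = scalar_matrix(3, 0)
for i, Bi in enumerate(B):
    B_t = add(B_t, [[t ** i * x for x in row] for row in Bi])
tI_minus_A = add(scalar_matrix(3, t), [[-x for x in row] for row in A])
lhs = matmul(tI_minus_A, B_t)

print(leftover)                        # equals p(A), so it should be the zero matrix
print(lhs == scalar_matrix(3, p_t))
```

Unrolling the recursion shows that the leftover equals $c_0I+c_1A+\cdots+c_nA^n=p(A)$, so the check is a Horner-style evaluation of $p$ at $A$ in disguise.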

The nice part of this proof is that it gives an explicit expression for each term $c_i A^i$ using the determinant expressions for $B_i$, so it is amenable to generalizations, for instance to noncommutative coefficients.

My high school linear algebra book, in Eastern Europe, was 1/4 in thick and printed on rough paper, but had this proof in it. Most of today's 2 in thick undergrad books are very colorful, but skip the proof as too hard.

**Bumbble Comm** · Accepted Answer

In older textbooks, the usual proof is to substitute $A$ into the characteristic polynomial $p(x)=\det(xI-A)$ in a correct manner. The proof is basically a one-liner: since $\operatorname{adj}(Ix-A)(Ix-A)=p(x)I$, we have $p(A)=0$ by the factor theorem. It probably offers the best explanation why the theorem holds (it's because substitution works if you do it correctly), but it involves many subtle points that beginners in linear algebra may find difficult to understand.
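A small numerical experiment can illustrate what "in a correct manner" means. In the minimal Python sketch below, a polynomial is evaluated at a matrix by replacing each power $x^i$ with $A^i$ and the constant term $c_0$ with $c_0I$, so the result is a matrix rather than a scalar. The sample matrix, the helper names, and the comparison with the trace polynomial $\operatorname{tr}(xI-A)=nx-\operatorname{tr}(A)$ are illustrative assumptions, not part of the answer. The comparison shows that the naive recipe "substitute $A$ inside the expression" cannot be a valid argument on its own, since it would equally "prove" that the trace polynomial vanishes at $A$.

```python
def matmul(X, Y):
    """Product of two square matrices given as lists of rows."""
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def poly_at_matrix(coeffs, A):
    """Evaluate sum coeffs[i] * A^i by Horner's rule (coeffs low degree first)."""
    n = len(A)
    result = [[0] * n for _ in range(n)]
    for c in reversed(coeffs):
        result = matmul(result, A)
        for i in range(n):
            result[i][i] += c          # add c * I
    return result


A = [[1, 2], [3, 4]]                    # sample matrix (an assumption)
trace = A[0][0] + A[1][1]
determinant = A[0][0] * A[1][1] - A[0][1] * A[1][0]

# p(x) = det(xI - A) = x^2 - tr(A) x + det(A) for a 2x2 matrix.
p = [determinant, -trace, 1]
# q(x) = tr(xI - A) = 2x - tr(A); naively "plugging in A" gives tr(0) = 0.
q = [-trace, 2]

print(poly_at_matrix(p, A))   # the zero matrix, as Cayley-Hamilton asserts
print(poly_at_matrix(q, A))   # 2A - tr(A) I, which is not zero here
```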

Among all the linear algebra textbooks that I have read, MacDuffee's Vectors and Matrices offers the clearest explanation of the above proof (see chapter IV). The following resources are also useful:
