One of the nicest theorems in linear algebra is the Cayley-Hamilton theorem: every square matrix satisfies its own characteristic polynomial.
What is a good way to prove it? In particular, does the following elegant proof go through?
I am hopeful that it is quite trivial. Namely, since the characteristic polynomial is $\det(A-\lambda I)$, if we plug in $A$ for $\lambda$, we of course get $\det 0=0$.
If so, this seems like one of the easiest times a couple of mathematicians got away with a major theorem.
To be precise, is there any problem with replacing $\lambda$, which usually denotes a scalar, with the matrix $A$ in question?
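To make the two readings of "plug in $A$" concrete, here is a minimal numerical sketch. The $2\times 2$ matrix `A` and the helper names `matmul` and `det2` are illustrative assumptions, not part of any library. It uses the fact that for a $2\times 2$ matrix $\det(A-\lambda I)=\lambda^2-\operatorname{tr}(A)\lambda+\det A$.

```python
# Illustrative 2x2 example; A and the helper names are assumptions.
A = [[1, 2], [3, 4]]
I = [[1, 0], [0, 1]]

def matmul(X, Y):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(X[i][k] * Y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def det2(X):
    """Determinant of a 2x2 matrix."""
    return X[0][0] * X[1][1] - X[0][1] * X[1][0]

trace = A[0][0] + A[1][1]
det_A = det2(A)

# Reading 1 (naive substitution): det(A - A*I) is the determinant
# of the zero matrix, which is a scalar.
zero = [[A[i][j] - A[i][j] for j in range(2)] for i in range(2)]
naive = det2(zero)

# Reading 2 (what the theorem asserts): p(A) = A^2 - tr(A) A + det(A) I,
# which is a matrix.
A2 = matmul(A, A)
p_of_A = [[A2[i][j] - trace * A[i][j] + det_A * I[i][j] for j in range(2)]
          for i in range(2)]

print(naive)    # a scalar
print(p_of_A)   # a 2x2 matrix
```

Both results vanish for this matrix, but the first is the number $0$ and the second is the $2\times 2$ zero matrix, so the naive computation is not evaluating the same object that the theorem is about.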
In older textbooks, the usual proof is to substitute $A$ into the characteristic polynomial $p(x)=\det(xI-A)$ in a correct manner. The proof is basically a one-liner: since $\operatorname{adj}(xI-A)\,(xI-A)=p(x)I$, we have $p(A)=0$ by the factor theorem, applied to polynomials with matrix coefficients. It probably offers the best explanation of why the theorem holds (it's because substitution works if you do it correctly), but it involves many subtle points that beginners in linear algebra may find difficult to understand.
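For readers who want the one-liner unpacked, here is a sketch of the usual coefficient-comparison argument. The symbols $B_k$ and $c_k$ are introduced here for illustration and are not taken from any particular textbook. Write $\operatorname{adj}(xI-A)=\sum_{k=0}^{n-1}B_kx^k$ with constant matrices $B_k$, and $p(x)=\sum_{k=0}^{n}c_kx^k$ with $c_n=1$. With the conventions $B_{-1}=B_n=0$, comparing the coefficients of $x^k$ on both sides of $\operatorname{adj}(xI-A)\,(xI-A)=p(x)I$ gives

$$B_{k-1}-B_kA=c_kI,\qquad k=0,1,\dots,n.$$

Multiplying the $k$-th identity on the right by $A^k$ and summing over $k$ makes the left-hand side telescope:

$$p(A)=\sum_{k=0}^{n}c_kA^k=\sum_{k=0}^{n}\left(B_{k-1}A^k-B_kA^{k+1}\right)=B_{-1}-B_nA^{n+1}=0.$$

The order of the factors matters here: the powers of $A$ are always multiplied on the right, which is what "substituting in a correct manner" amounts to in this sketch.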
Among all the linear algebra textbooks that I have read, MacDuffee's Vectors and Matrices offers the clearest explanation of the above proof (see Chapter IV). The following resources are also useful: