This is given as part of showing the general Cayley-Hamilton theorem (so you shouldn't refer to Cayley-Hamilton in the proof).
Let $R$ be a ring, and $M$ a finitely-generated free $R$-module. Let $\phi:M\rightarrow M$ be an $R$-linear map, and $P_\phi(X)$ the characteristic polynomial of $\phi$. Suppose $R$ is an algebraically closed field.
Show that $P_\phi(\phi)=0$ by writing $\phi$ as a matrix in Jordan normal form.
Well, since $R$ is an algebraically closed field, all eigenvalues of $\phi$ lie in $R$, so the Jordan normal form exists. But what's the point of it then? I don't quite see.
Let $\lambda_1, \cdots, \lambda_k$ be the distinct eigenvalues of $\phi$, i.e. the distinct roots of $P_\phi(X)$ (by distinct, I mean that repeated roots are listed once, without multiplicity). For instance, if a $4 \times 4$ matrix has characteristic polynomial $(x-2)^2(x-1)(x-3)$, then I would write $\lambda_1 = 1$, $\lambda_2 = 2$, $\lambda_3 = 3$.
Choose a basis for $M$, say $\{m_1,\cdots,m_n\}$ over which $\phi$ has Jordan normal form. (So two vectors for $\lambda_2$, one for $\lambda_1$ and one for $\lambda_3$.)
To each eigenvalue $\lambda_i$ there corresponds a block (possibly made up of several smaller Jordan blocks) with $\lambda_i$'s on the diagonal and (possibly some) ones on the superdiagonal. (For instance, the blocks corresponding to $\lambda_1$ and $\lambda_3$ in my example have to be $(1)$ and $(3)$, but the block for $2$ could be either $\begin{bmatrix} 2 & 0 \\ 0 & 2 \end{bmatrix}$ or $\begin{bmatrix} 2 & 1 \\ 0 & 2 \end{bmatrix}$.)
In particular, if $m_{i_1}, \cdots, m_{i_{\ell}}$ are the basis vectors that span the subspace on which $\phi$ has Jordan block form with eigenvalue $\lambda_i$, and $\lambda_i$ has multiplicity $n_i$ (so $\ell = n_i$), then $$ (\phi - \lambda_i I)^{n_i}(m_{i_j}) = 0 \quad \Rightarrow \quad P_{\phi}(\phi)(m_{i_j}) = 0. $$ (I implicitly used the fact that an $n_i \times n_i$ matrix $A$ with $0$'s on the main diagonal and only $0$'s or $1$'s on the superdiagonal satisfies $A^{n_i} = 0$; restricted to this subspace, $\phi - \lambda_i I$ is such a matrix, which gives the $(\phi - \lambda_i I)^{n_i}(m_{i_j}) = 0$ part.) The implication follows because $n_i$ is precisely the power of $(X - \lambda_i)$ that divides $P_{\phi}(X)$: since polynomials in $\phi$ commute with each other, we can write $P_{\phi}(\phi)$ as the product of the remaining factors composed with $(\phi - \lambda_i I)^{n_i}$, and the latter already kills $m_{i_j}$.
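To make this concrete in the running $4 \times 4$ example, here is the computation for $\lambda_2 = 2$, assuming the block for $2$ is the non-diagonal one (the diagonal case is even easier). The names $J_2$ and $N$ are just labels introduced here for the block and its nilpotent part: $$ J_2 = \begin{bmatrix} 2 & 1 \\ 0 & 2 \end{bmatrix}, \qquad N = J_2 - 2I = \begin{bmatrix} 0 & 1 \\ 0 & 0 \end{bmatrix}, \qquad N^2 = \begin{bmatrix} 0 & 0 \\ 0 & 0 \end{bmatrix}. $$ So for either of the two basis vectors $m$ belonging to $\lambda_2$ we get $(\phi - 2I)^2(m) = 0$, and therefore $$ P_{\phi}(\phi)(m) = (\phi - I)(\phi - 3I)(\phi - 2I)^2(m) = (\phi - I)(\phi - 3I)(0) = 0. $$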
This means that $P_{\phi}(\phi) (m_i) = 0$ for all $1 \le i \le n$, hence $P_{\phi}(\phi) = 0$.
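If you want to sanity-check the argument numerically, here is a minimal sketch in plain Python for the $4 \times 4$ example above, assuming the block for $2$ is $\begin{bmatrix} 2 & 1 \\ 0 & 2 \end{bmatrix}$. The helper names (`matmul`, `shift`, `poly_at`) are made up for this illustration. It also evaluates the polynomial with the factor $(X-2)$ taken only once, to show why the full multiplicity $n_i$ is needed.

```python
def matmul(a, b):
    """Multiply two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def shift(a, lam):
    """Return a - lam * I."""
    n = len(a)
    return [[a[i][j] - (lam if i == j else 0) for j in range(n)]
            for i in range(n)]


def poly_at(a, roots_with_mult):
    """Evaluate prod (X - lam)^mult at the matrix a."""
    n = len(a)
    result = [[1 if i == j else 0 for j in range(n)] for i in range(n)]
    for lam, mult in roots_with_mult:
        for _ in range(mult):
            result = matmul(result, shift(a, lam))
    return result


# Jordan normal form from the example: blocks (1), [[2, 1], [0, 2]], (3).
J = [
    [1, 0, 0, 0],
    [0, 2, 1, 0],
    [0, 0, 2, 0],
    [0, 0, 0, 3],
]

ZERO = [[0] * 4 for _ in range(4)]

# Characteristic polynomial (X - 1)(X - 2)^2(X - 3) evaluated at J.
P_OF_J = poly_at(J, [(1, 1), (2, 2), (3, 1)])

# Same roots, but (X - 2) only to the first power.
TOO_SMALL = poly_at(J, [(1, 1), (2, 1), (3, 1)])

print("P(J) is zero:", P_OF_J == ZERO)
print("(J - I)(J - 2I)(J - 3I) is zero:", TOO_SMALL == ZERO)
```

The first check comes out zero, as the proof predicts, while the second does not, because the $2 \times 2$ block for $\lambda_2$ needs $(\phi - 2I)^2$ and not just $(\phi - 2I)$ to be annihilated.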