Proof for Diagonal Matrices from Page 2 of 7:
Let $A \in M_{n}(\mathbb{C})$ be diagonal, that is, $A_{ii}=\lambda_{i}$.
Then $$
p_{A}(t) = \det(tI-A)= \det \begin{bmatrix}
t - \lambda_1 & & \\
 & \ddots & \\
 & & t - \lambda_n \\
\end{bmatrix}
=\prod_{i=1}^{n}(t-\lambda_{i})
\quad (♦)$$
and $p_A(A)= \prod_{i=1}^{n}(A- \color{forestgreen}{ \lambda_iI })$, a product of diagonal matrices.
$1.$ How does $p_A(A)= \prod_{i=1}^{n}(A-\lambda_{i}I)$ follow? $(♦)$ contains $\lambda_i$ and NOT $\color{forestgreen}{ \lambda_iI }$.
What legitimates this substitution? $t$ is a scalar variable but $A$ is a matrix, so they can't be equal. Does the proof repeat this technique in its last line, denoted with $\color{ orangered }{ ( \yen ) }$?
As in the previous examples (on the linked PDF in the first sentence), since $A$ is diagonal, $$ p_{A}(A) \overset{\color{ red }{\clubsuit}}{=} \begin{bmatrix} p_A(\lambda_1) & ~ & ~ \\ ~ & \ddots & ~ \\ ~ & ~ & p_A(\lambda_n) \\ \end{bmatrix} = \begin{bmatrix} \prod_{i=1}^{n}(\lambda_1 -\lambda_{i}) & ~ & ~ \\ ~ & \ddots & ~ \\ ~ & ~ & \prod_{i=1}^{n}(\lambda_n -\lambda_{i}) \\ \end{bmatrix} = \text{zero matrix},$$ because $\prod_{i=1}^{n}(\lambda_n -\lambda_{i})$ contains the factor $(\lambda_n -\lambda_{n})= 0$, and the same holds for all the other diagonal entries.
$2.$ How does $p_{A}(A)$ equal that diagonal matrix, as denoted with $\color{ red }{ ( \clubsuit )} $ ?
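As a numerical aside (a minimal sketch using made-up eigenvalues $3, 5, 7$, not values from the linked PDF), the claim that $p_A(A)$ is the zero matrix for a diagonal $A$ can be checked by multiplying out the factors $(A-\lambda_i I)$ directly:

```python
# Sanity check: for a diagonal matrix A with (hypothetical) diagonal entries
# 3, 5, 7, the product (A - 3I)(A - 5I)(A - 7I) should be the zero matrix,
# matching p_A(A) = prod_i (A - lambda_i I).

def matmul(X, Y):
    """Multiply two square matrices given as lists of rows."""
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def diag(entries):
    """Build a diagonal matrix from its diagonal entries."""
    n = len(entries)
    return [[entries[i] if i == j else 0 for j in range(n)] for i in range(n)]

eigenvalues = [3, 5, 7]          # arbitrary example values
A = diag(eigenvalues)
I = diag([1, 1, 1])

# p_A(A) = (A - 3I)(A - 5I)(A - 7I), built up one factor at a time
result = I
for lam in eigenvalues:
    factor = [[A[i][j] - lam * I[i][j] for j in range(3)] for i in range(3)]
    result = matmul(result, factor)

print(result)  # [[0, 0, 0], [0, 0, 0], [0, 0, 0]]
```

Each factor $(A-\lambda_i I)$ has a zero in a different diagonal position, so the product of all $n$ factors wipes out every diagonal entry.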
Proof for Diagonalisable Matrices: Similar matrices have the same eigenvalues (and thus the same characteristic polynomial), so suppose for similar matrices $A$ and $B$ (now $A$ may NOT be diagonal): $ p_{A}(z)=p_{B}(z)=\displaystyle \sum_{i=0}^{n}c_{i}z^{i} \implies p_{A}(A)=\sum_{i=0}^{n}c_{i}A^{i} \quad \color{ orangered }{ ( \yen ) } $ (I omit the rest of the proof.)
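To make the diagonalisable case concrete (a sketch with a hypothetical matrix, not one from the proof): the upper-triangular matrix $A=\begin{bmatrix}1&1\\0&2\end{bmatrix}$ is diagonalisable, since its eigenvalues $1$ and $2$ are distinct, so $p_A(t)=(t-1)(t-2)$, and substituting $A$ into its own characteristic polynomial should again give the zero matrix:

```python
# Sanity check (hypothetical example, not from the linked PDF): a non-diagonal
# but diagonalisable matrix A still satisfies p_A(A) = 0. Here
# A = [[1, 1], [0, 2]] has eigenvalues 1 and 2, and we verify that
# p_A(A) = (A - 1*I)(A - 2*I) is the zero matrix.

def matmul2(X, Y):
    """Multiply two 2x2 matrices given as lists of rows."""
    return [[sum(X[i][k] * Y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

A = [[1, 1], [0, 2]]             # upper triangular: eigenvalues on the diagonal
I = [[1, 0], [0, 1]]

factor1 = [[A[i][j] - 1 * I[i][j] for j in range(2)] for i in range(2)]
factor2 = [[A[i][j] - 2 * I[i][j] for j in range(2)] for i in range(2)]

p_A_of_A = matmul2(factor1, factor2)
print(p_A_of_A)  # [[0, 0], [0, 0]]
```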
I believe what is happening is that the two "functions" $p_A(t)$ and $p_A(A)$ are each defined by a product: $\prod_{i=1}^n(t-\lambda_i)$ and $\prod_{i=1}^n(A-\lambda_iI)$, respectively.
In one dimension, $p_A(t)=\det(tI-A)=t-\lambda_1$, i.e. in this case the product and the determinant are the same.
Clearly this is not the case in higher dimensions.
Now $p_A(A)$ is itself a diagonal matrix (since it is the product of diagonal matrices).
This is because in $p_A(A)=\prod_{i=1}^n(A-\lambda_iI)$, each factor $(A-\lambda_iI)$ is a diagonal matrix (both $A$ and $I$ are).
Thus $p_A(A)=\prod_{i=1}^n(A-\lambda_iI)=(A-\lambda_1I)\cdots(A-\lambda_nI)$ is a product of diagonal matrices.
We see that the $j$-th diagonal entry is: $\prod_{i=1}^n(A_{jj}-\lambda_i)=\prod_{i=1}^n(\lambda_j-\lambda_i)=p_A(\lambda_j)$.
This is because:
$p_A(A)=\prod_{i=1}^n(A-\lambda_iI)=(A-\lambda_1I)...(A-\lambda_nI)$
$=\begin{bmatrix} a_{11}-\lambda_1 & 0 & \cdots & 0\\ 0 & a_{22}-\lambda_1 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & a_{nn}-\lambda_1 \end{bmatrix}\cdots\begin{bmatrix} a_{11}-\lambda_n & 0 & \cdots & 0\\ 0 & a_{22}-\lambda_n & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & a_{nn}-\lambda_n \end{bmatrix}$
$=\begin{bmatrix} (a_{11}-\lambda_1)\cdots(a_{11}-\lambda_n) & 0 & \cdots & 0\\ 0 & (a_{22}-\lambda_1)\cdots(a_{22}-\lambda_n) & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & (a_{nn}-\lambda_1)\cdots(a_{nn}-\lambda_n) \end{bmatrix}$
$=\begin{bmatrix} \prod_{j=1}^n(a_{11}-\lambda_j) & 0 & \cdots & 0\\ 0 & \prod_{j=1}^n(a_{22}-\lambda_j) & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & \prod_{j=1}^n(a_{nn}-\lambda_j) \end{bmatrix}$
$=\begin{bmatrix} \prod_{j=1}^n(\lambda_1-\lambda_j) & 0 & \cdots & 0\\ 0 & \prod_{j=1}^n(\lambda_2-\lambda_j) & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & \prod_{j=1}^n(\lambda_n-\lambda_j) \end{bmatrix}$
$=\begin{bmatrix} p_A(\lambda_1) & 0 & \cdots & 0\\ 0 & p_A(\lambda_2) & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & p_A(\lambda_n) \end{bmatrix}$
From this we see that each diagonal entry of the matrix corresponds to what you have posted.
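The entry-wise identity above, that the $j$-th diagonal entry is $\prod_{i=1}^n(\lambda_j-\lambda_i)=p_A(\lambda_j)$, can be sketched numerically with hypothetical diagonal entries $2, 4, 9$ (the factor with $i=j$ vanishes, so each entry is $0$):

```python
# Entry-by-entry check of the derivation above, using hypothetical diagonal
# entries 2, 4, 9: the j-th diagonal entry of p_A(A) equals
# p_A(lambda_j) = prod_i (lambda_j - lambda_i), which is 0 because the
# i = j factor is (lambda_j - lambda_j) = 0.

lambdas = [2, 4, 9]              # hypothetical diagonal entries of A

def p_A(x, lambdas):
    """Characteristic polynomial p_A(x) = prod_i (x - lambda_i)."""
    result = 1
    for lam in lambdas:
        result *= x - lam
    return result

# Diagonal of p_A(A): evaluate p_A at each eigenvalue
diagonal_of_pAA = [p_A(lam_j, lambdas) for lam_j in lambdas]
print(diagonal_of_pAA)  # [0, 0, 0]

# At a non-eigenvalue the polynomial is generally nonzero:
print(p_A(5, lambdas))  # (5-2)(5-4)(5-9) = -12
```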
Please ask if anything is unclear.
Edit:
They are related in the sense that substituting $A$ for $t$ in $p_A(t)$ gives $p_A(A)$, but the latter is an $n\times n$ matrix, whereas $p_A(t)$ is a scalar, so they only truly coincide if $A$ is a $1\times 1$ matrix.
The reason that substituting $A$ for $t$ in $p_A(t)$ gives us $p_A(A)$, is as follows,
Let $t=A$ (and in the formula change $\lambda_i$ to $\lambda_i I$ so that we can add, subtract and multiply with the correct dimensions), and we get $p_A(A)=\prod_{i=1}^n(A-\lambda_i I)$.
Edit:
You are correct here: because the dimensions don't agree, we need to let $t$ be the matrix $A$ and let $\lambda_i$ be $\lambda_i I$, so really we should have:
$t\to A$, $\lambda_i\to \lambda_i I$
$p_A(t)=\prod_{i=1}^n(t-\lambda_i)\to\prod_{i=1}^n(A-\lambda_i I)=p_A(A)$.
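A sketch of this substitution on a hypothetical $A=\operatorname{diag}(3,5)$: here $p_A(t)=(t-3)(t-5)=t^2-8t+15$, and evaluating it at $A$ via the product form $\prod_i(A-\lambda_i I)$ and via the expanded coefficient form $A^2-8A+15I$ (with the constant term multiplied by $I$, as in the substitution $\lambda_i\to\lambda_i I$) gives the same zero matrix:

```python
# Two ways to evaluate p_A at the matrix A, for the hypothetical A = diag(3, 5):
# the product form (A - 3I)(A - 5I) and the expanded coefficient form
# A^2 - 8A + 15I (since p_A(t) = (t-3)(t-5) = t^2 - 8t + 15).
# Both substitutions agree, and both give the zero matrix.

def matmul(X, Y):
    """Multiply two square matrices given as lists of rows."""
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

A = [[3, 0], [0, 5]]
I = [[1, 0], [0, 1]]

# Product form: (A - 3I)(A - 5I)
f1 = [[A[i][j] - 3 * I[i][j] for j in range(2)] for i in range(2)]
f2 = [[A[i][j] - 5 * I[i][j] for j in range(2)] for i in range(2)]
product_form = matmul(f1, f2)

# Coefficient form: A^2 - 8A + 15I, i.e. the constant term becomes 15I
A2 = matmul(A, A)
coeff_form = [[A2[i][j] - 8 * A[i][j] + 15 * I[i][j] for j in range(2)]
              for i in range(2)]

print(product_form == coeff_form)  # True
print(product_form)                # [[0, 0], [0, 0]]
```

This is exactly why the substitution in $\color{orangered}{(\yen)}$ is legitimate for a polynomial: powers, scalar multiples, and sums of a fixed matrix $A$ are all well-defined, so $\sum_i c_i A^i$ makes sense even though $t$ itself was a scalar variable.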