It is known that any non-diagonalizable matrix $A$ can be approximated by a sequence of diagonalizable matrices, i.e. $A = \lim_{k \rightarrow \infty} A_k$ with every $A_k$ diagonalizable. Why is this true?
Note: I first came across this in a note giving a simple proof of the Cayley-Hamilton theorem, but I was not able to find it in my books or on the internet, as far as I've searched.
If a matrix has distinct eigenvalues, it can be diagonalized.
You can perturb a matrix by an arbitrarily small amount so that all eigenvalues are distinct.
Given any $A$, let $J$ be its Jordan form (over $\mathbb{C}$). That is, for some invertible $U$, you have $A=U J U^{-1}$.
Let $\Delta= \operatorname{diag}(1,2,...,n)$, where $n$ is the dimension of the matrix, and consider the sequence $A_k = A+ \frac{1}{k} U\Delta U^{-1}$ (thanks to p.s. for catching my mistake here).
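Since $A_k = U \left(J + \frac{1}{k} \Delta\right) U^{-1}$ and $J + \frac{1}{k} \Delta$ is upper triangular, the eigenvalues of $A_k$ are its diagonal entries $[J]_{ii} + \frac{1}{k} i$. Entries with $[J]_{ii} = [J]_{jj}$, $i \neq j$, now differ by $\frac{i-j}{k} \neq 0$, and entries that were already different stay different once $\frac{n}{k}$ is smaller than the smallest gap between distinct eigenvalues of $A$. Hence for $k$ large enough, the eigenvalues are distinct, and hence $A_k$ is diagonalizable.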
Clearly $A_k \to A$, hence the result.
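As a small worked example (my own choice of matrix, purely for illustration), take the $2 \times 2$ nilpotent Jordan block, which is already in Jordan form, so $U = I$:

$$A = \begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix}, \qquad \Delta = \begin{pmatrix} 1 & 0 \\ 0 & 2 \end{pmatrix}, \qquad A_k = A + \frac{1}{k}\Delta = \begin{pmatrix} \frac{1}{k} & 1 \\ 0 & \frac{2}{k} \end{pmatrix}.$$

Here $A$ is not diagonalizable, but $A_k$ has the two distinct eigenvalues $\frac{1}{k}$ and $\frac{2}{k}$ for every $k \geq 1$, so each $A_k$ is diagonalizable, and $A_k - A = \frac{1}{k}\Delta \to 0$.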
Note: If we use the Schur form instead (so that $U$ is unitary and $J$ is merely upper triangular), we can explicitly compute the distance $\|A-A_k\|_2 = \frac{1}{k}\|\Delta\|_2 = \frac{n}{k}$.
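If you want to check this numerically, here is a minimal NumPy sketch. The specific matrices `J` and `U` and the helper name `perturbed` are my own choices for illustration, not part of the argument. `U` is taken to be a rotation, so it is orthogonal and $A = U J U^{T}$ is a (real) Schur decomposition.

```python
import numpy as np

def perturbed(A, U, k):
    """Return A_k = A + (1/k) * U @ diag(1, ..., n) @ U^{-1}."""
    n = A.shape[0]
    delta = np.diag(np.arange(1, n + 1, dtype=float))
    return A + U @ delta @ np.linalg.inv(U) / k

# Upper triangular J with a repeated eigenvalue 2 and a nontrivial
# Jordan block, so A below is not diagonalizable (illustrative choice).
J = np.array([[2.0, 1.0, 0.0],
              [0.0, 2.0, 0.0],
              [0.0, 0.0, 3.0]])

# Orthogonal U (a rotation about the third axis), again just an example.
theta = 0.3
U = np.array([[np.cos(theta), -np.sin(theta), 0.0],
              [np.sin(theta),  np.cos(theta), 0.0],
              [0.0,            0.0,           1.0]])

A = U @ J @ U.T  # U.T equals the inverse of U because U is orthogonal

for k in (1, 10, 100):
    A_k = perturbed(A, U, k)
    eigs = np.sort(np.linalg.eigvals(A_k).real)  # should be J_ii + i/k
    dist = np.linalg.norm(A - A_k, 2)            # spectral norm, should be n/k
    print(k, eigs, dist)
```

For this example the eigenvalues of $A_k$ should come out as $2 + \frac{1}{k}$, $2 + \frac{2}{k}$, $3 + \frac{3}{k}$ up to rounding, and the distance as $\frac{3}{k}$.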