A matrix whose characteristic polynomial has $\lambda^3 - 1$ as a factor will be diagonalizable over $\mathbb C$ (the roots are the three cube roots of unity, which are distinct):
$$\left[\begin{array}{ccc} -\frac 1 2 +i\frac {\sqrt{3}}{2} &0&0\\ 0&-\frac 1 2 -i\frac {\sqrt{3}}{2} &0\\ 0&0&1 \end{array}\right]$$ Block-diagonalizable over $\mathbb R$ with two blocks (one $1\times 1$ and one $2\times 2$):
$$\left[\begin{array}{cc|c} -\frac 1 2 &\frac {\sqrt{3}}{2} &0\\ -\frac {\sqrt{3}}{2}&-\frac 1 2 &0\\ \hline 0&0&1 \end{array}\right]$$
And block-diagonal over $\mathbb Z$ (one $3\times 3$ block):
$$\left[\begin{array}{ccc}0&1&0\\0&0&1\\1&0&0\end{array}\right]$$
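As a quick numerical sanity check (a NumPy sketch using the two real-valued forms above), all three representations are similar, so they share the spectrum $\{1,\,-\tfrac12 \pm i\tfrac{\sqrt3}{2}\}$:

```python
import numpy as np

# The integer form and the real block-diagonal form from the question
C = np.array([[0, 1, 0],
              [0, 0, 1],
              [1, 0, 0]], dtype=float)
B = np.array([[-0.5,            np.sqrt(3) / 2, 0.0],
              [-np.sqrt(3) / 2, -0.5,           0.0],
              [0.0,             0.0,            1.0]])

# Both should have the cube roots of unity as eigenvalues
ev_C = np.sort_complex(np.linalg.eigvals(C))
ev_B = np.sort_complex(np.linalg.eigvals(B))
print(ev_C)
print(ev_B)
```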
But in general: is there a computational way to determine and quantify which of these representations would be most efficient, in storage and in computation, on a computer?
EDIT: Suppose we consider only the storage and computational costs of evaluating truncated power series of matrix functions:
$$f({\bf A}) = \sum_{k=0}^{N}c_k{\bf A}^k = {\bf S}\left(\sum_{k=0}^{N}c_k{\bf D}^k\right){\bf S}^{-1},\quad\text{where } {\bf A} = {\bf SDS}^{-1}.$$
So that the ${\bf D}^k$ calculations applied to vectors are fast.
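To make the speed-up concrete, here is a minimal NumPy sketch (the coefficients $c_k$, a truncated exponential series, are just an illustrative choice): diagonalization lets the scalar series act on the eigenvalues elementwise, instead of forming $N$ matrix-matrix products.

```python
import math
import numpy as np

# The integer form from above; eigenvalues are the cube roots of unity
A = np.array([[0., 1., 0.],
              [0., 0., 1.],
              [1., 0., 0.]])

# Illustrative coefficients: truncated exponential, f(x) ~ sum_k x^k / k!
N = 20
c = [1.0 / math.factorial(k) for k in range(N + 1)]

# Direct evaluation: N matrix-matrix products
fA_direct = np.zeros_like(A)
Ak = np.eye(3)
for ck in c:
    fA_direct += ck * Ak
    Ak = Ak @ A

# Via A = S D S^{-1}: apply the scalar series to each eigenvalue
w, S = np.linalg.eig(A)
fw = sum(ck * w**k for k, ck in enumerate(c))
fA_diag = (S * fw) @ np.linalg.inv(S)   # S diag(f(w)) S^{-1}

print(np.allclose(fA_direct, fA_diag.real, atol=1e-10))
```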