I have read the following definition:
Diagonal matrices $\mathbf{D}$ can efficiently be raised to a power. Therefore, we can find a matrix power for a matrix $\mathbf{A} \in \mathbb{R}^{n \times n}$ via the eigenvalue decomposition (if it exists) so that $\mathbf{A}^k = (\mathbf{P}\mathbf{D}\mathbf{P}^{-1})^k = \mathbf{P}\mathbf{D}^k\mathbf{P}^{-1}$.
Why do we only need to apply the exponent $k$ to $\mathbf{D}$? I assume there is some rule I am missing, but I could not figure out which one.
Because the adjacent $P^{-1}P$ factors cancel. Take for example $k = 2$: $$A^2 = (PDP^{-1})^2 = P D P^{-1} P D P^{-1} = PD I DP^{-1} = PD^2 P^{-1}.$$ The same telescoping works for every $k > 2$ (formally, by induction on $k$).
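You can also check this numerically. Here is a small sketch using NumPy (the matrix $A$ below is just a hypothetical example, chosen symmetric so it is guaranteed to be diagonalizable):

```python
import numpy as np

# Hypothetical example matrix: symmetric, hence diagonalizable.
A = np.array([[2.0, 1.0],
              [1.0, 2.0]])
k = 5

eigvals, P = np.linalg.eig(A)           # columns of P are eigenvectors of A
Dk = np.diag(eigvals ** k)              # raise only the diagonal entries to k
Ak_via_eig = P @ Dk @ np.linalg.inv(P)  # P D^k P^{-1}

Ak_direct = np.linalg.matrix_power(A, k)  # A multiplied by itself k times
print(np.allclose(Ak_via_eig, Ak_direct))  # True
```

Raising the diagonal entries elementwise costs $O(n)$ per power, versus $O(n^3)$ for each dense matrix multiplication, which is exactly why the decomposition is useful here.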