Why can I use this "shortcut" to compute the $k$th power of a matrix times a vector?


This question is taken from a past exam for Gilbert Strang's linear algebra course (Spring 1998).

Find a complete set of eigenvalues and eigenvectors for $\mathbf{A}=\begin{bmatrix}2&1&1\\1&2&1\\1&1&2\end{bmatrix}$. Write $\mathbf{b}=\begin{bmatrix}2&0&1\end{bmatrix}^\intercal$ as a linear combination of the eigenvectors and solve for $\mathbf{A}^{100}\mathbf{b}$.

The first two parts are easy enough; I find $\lambda_i=1,1,4$ with corresponding eigenvectors $$\mathbf{x}_i=\begin{bmatrix}-1\\1\\0\end{bmatrix},\begin{bmatrix}-1\\0\\1\end{bmatrix},\begin{bmatrix}1\\1\\1\end{bmatrix}$$ Then $\mathbf{b}$ can be obtained by the combination $\mathbf{x}_3-\mathbf{x}_1$.
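(These eigenpairs and the decomposition of $\mathbf{b}$ are easy to double-check with a few lines of plain Python; exact integer arithmetic, and the `matvec` helper is just an ad hoc 3×3 matrix-vector product, not from the course.)

```python
# Check that each (lambda, x) pair satisfies A x = lambda x, and that b = x3 - x1.
A = [[2, 1, 1],
     [1, 2, 1],
     [1, 1, 2]]

def matvec(M, v):
    # Ad hoc 3x3 matrix-vector product over the integers.
    return [sum(M[i][j] * v[j] for j in range(3)) for i in range(3)]

pairs = [
    (1, [-1, 1, 0]),   # lambda_1 = 1, x1
    (1, [-1, 0, 1]),   # lambda_2 = 1, x2
    (4, [1, 1, 1]),    # lambda_3 = 4, x3
]

for lam, x in pairs:
    assert matvec(A, x) == [lam * xi for xi in x]

x1, x3 = [-1, 1, 0], [1, 1, 1]
b = [x3[i] - x1[i] for i in range(3)]
assert b == [2, 0, 1]   # b = x3 - x1, as claimed
```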

The last part is what I'm not sure about. Letting $\mathbf{X}=\begin{bmatrix}\mathbf{x}_1&\mathbf{x}_2&\mathbf{x}_3\end{bmatrix}$, I can write $$\mathbf{A}=\mathbf{X}\begin{bmatrix}1&0&0\\0&1&0\\0&0&4\end{bmatrix}\mathbf{X}^{-1}$$ and so $$\mathbf{A}^{100}=\mathbf{X}\begin{bmatrix}1&0&0\\0&1&0\\0&0&4^{100}\end{bmatrix}\mathbf{X}^{-1}$$ Now, this being a time-sensitive exam question, I get a little suspicious that carrying out the multiplication of this resulting matrix by $\mathbf{b}$ will take too long, so I refer to the solution. (Note: There's a slight sign difference relative to the eigenvectors I use; the solution instead uses $\mathbf{y}_1=-\mathbf{x}_1$ so that $\mathbf{b}=\mathbf{x}_3\color{red}+\mathbf{y}_1$.) It simply says

$$\mathbf{A}^{100}\mathbf{b}=4^{100}\mathbf{x}_3+1^{100}\mathbf{y}_1=\begin{bmatrix}4^{100}+1\\4^{100}-1\\4^{100}\end{bmatrix}$$

which seems to be completely circumventing multiplication by $\mathbf{X}$ and its inverse.

It's not immediately clear to me why I can do something like this: if $\mathbf{x}$ is an eigenvector of $\mathbf{A}=\mathbf{X\Lambda X}^{-1}$ with eigenvalue $\lambda$, then for $k\in\mathbb{N}$, $$\mathbf{A}^k\mathbf{x}=\mathbf{X}\mathbf{\Lambda}^k\mathbf{X}^{-1}\mathbf{x}=\lambda^k\mathbf{x}$$ Does it have something directly to do with $\mathbf{x}$ being in the column space of $\mathbf{X}$?

4 Answers

Best answer:

If $\lambda$ is an eigenvalue of $A$ corresponding to eigenvector $x$, then $\lambda^n$ is an eigenvalue of $A^n$ corresponding to eigenvector $x$. This can be proved by induction. The case $n=1$ is just the original assumption. Now assume $A^{n-1}x=\lambda^{n-1}x$. Then $A^nx=A(A^{n-1}x)=A(\lambda^{n-1}x)=\lambda^{n-1}Ax=\lambda^{n-1}\lambda x=\lambda^n x$.

Applying this to your problem \begin{align*} A^{100}b&=A^{100}(x_3-x_1)\\ &=A^{100}x_3-A^{100}x_1\\ &=4^{100}x_3-1^{100}x_1\\ &=\begin{bmatrix}4^{100}+1\\4^{100}-1\\4^{100} \end{bmatrix}. \end{align*}
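(A quick sanity check of this shortcut, not part of the original answer: Python integers are arbitrary-precision, so we can apply $A$ to $b$ one hundred times exactly and compare against $4^{100}x_3 - x_1$.)

```python
# Compute A^100 b by repeated matrix-vector multiplication (exact integers)
# and compare with the closed form 4^100 * x3 - x1.
A = [[2, 1, 1],
     [1, 2, 1],
     [1, 1, 2]]

def matvec(M, v):
    return [sum(M[i][j] * v[j] for j in range(3)) for i in range(3)]

v = [2, 0, 1]            # v starts as b
for _ in range(100):
    v = matvec(A, v)     # after the loop, v = A^100 b

expected = [4**100 + 1, 4**100 - 1, 4**100]
assert v == expected
```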

Answer:

You have $b=x_3+y_1$, where $Ax_3=4x_3$ and $Ay_1=1y_1$. Thus, you have: $$A^{100}b=A^{100}(x_3+y_1)=A^{100}x_3+A^{100}y_1=4^{100}x_3+1^{100}y_1$$ It's important to remember that for any eigenvector $x$ with $Ax=\lambda x$, applying $A$ repeatedly is the same as multiplying by $\lambda$ repeatedly, so $A^kx=\lambda^kx$.

Answer:

You don't need to explicitly use the diagonalization $A= X \Lambda X^{-1}$. You know $A x_1 = x_1$ and $A x_3 = 4 x_3$, so \begin{align} A(x_3-x_1) &= 4x_3 - x_1\\ A^{100}(x_3-x_1) &= 4^{100} x_3 - x_1 \end{align}

Answer:

One reason to compute the eigenvectors of a matrix is that they don't change direction when they are multiplied by the matrix. So, if an $n \times n$ matrix $A$ has a basis $x_1,\ldots,x_n$ of eigenvectors, then $A$ takes each $x_i$ to a scalar multiple $\lambda_i x_i$ of $x_i$ (the direction of an eigenvector doesn't change upon multiplication by $A$, but its magnitude can). With respect to a basis of eigenvectors, the linear transformation represented by $A$ is represented by a diagonal matrix (and diagonal matrices are simpler to understand and easier to work with).

This makes it easy to calculate $Ab$ when $b$ is expressed as a linear combination of the eigenvectors $x_i$. For example, if $b=\sum_{i=1}^{n} c_i x_i$, then $Ab = \sum_{i=1}^n c_i(Ax_i)=\sum_{i=1}^n c_i \lambda_i x_i$, where the first equality follows from the linearity of $A$ and the second holds because each $x_i$ is an eigenvector with eigenvalue $\lambda_i$. Notice that once $b$ is expressed in terms of the eigenvectors $x_1,\ldots,x_n$, we can compute $Ab$ immediately, without ever multiplying $A$ by $b$.
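(The identity $Ab = \sum_i c_i \lambda_i x_i$ can be illustrated on the matrix from the question, using the expansion $b = -x_1 + 0\cdot x_2 + x_3$, i.e. coefficients $c = (-1, 0, 1)$; the names below are mine, not from the answer.)

```python
# Compute Ab two ways: from the eigen-expansion (no matrix product at all)
# and directly, then check they agree.
A = [[2, 1, 1],
     [1, 2, 1],
     [1, 1, 2]]

def matvec(M, v):
    return [sum(M[i][j] * v[j] for j in range(3)) for i in range(3)]

eigvals = [1, 1, 4]
eigvecs = [[-1, 1, 0], [-1, 0, 1], [1, 1, 1]]   # x1, x2, x3
coeffs  = [-1, 0, 1]                            # b = -x1 + 0*x2 + x3

# Ab from the eigen-expansion: sum_i c_i * lambda_i * x_i
Ab_fast = [sum(c * lam * x[i] for c, lam, x in zip(coeffs, eigvals, eigvecs))
           for i in range(3)]

b = [2, 0, 1]
assert Ab_fast == matvec(A, b)   # both give (5, 3, 4)
```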

In your example, observe that $A$ takes the eigenvector $x_i$ to $\lambda_i x_i$ for $i=1,2,3$. Hence, $A^2$ takes $x_i$ to $A(Ax_i)=A(\lambda_i x_i)=\lambda_i (Ax_i) = \lambda_i^2 x_i$, and $A^{100}$ takes $x_i$ to $\lambda_i^{100} x_i$. Since $b=x_3-x_1$, $A^{100}$ takes $b$ to $$A^{100}b = A^{100}x_3 - A^{100} x_1 = \lambda_3^{100} x_3 - \lambda_1^{100} x_1 = 4^{100}(1,1,1)-1^{100}(-1,1,0) = (4^{100}+1,\,4^{100}-1,\,4^{100}).$$