Taylor Expansion of Eigenvector Perturbation


Consider symmetric matrices $A,B \in \mathbb{R}^{n \times n}$ and let the difference matrix $D = B - A.$ If $\bf{\hat{a}}_i$ and $\bf{\hat{b}}_i$ are the $i^{\text{th}}$ eigenvectors corresponding to $A$ and $B$ respectively, and $\lambda_i$ is the $i^{\text{th}}$ eigenvalue of $A$, then the first-order Taylor expansion gives the following approximation for the difference between the primary eigenvectors: \begin{align} {\bf{\hat{b}}_1 - \bf{\hat{a}}_1} &= \sum_{i = 2}^n \frac{{{\bf{\hat{a}}_i}^T} D\; {\bf{\hat{a}}_1}}{\lambda_1 - \lambda_i}{\bf{\hat{a}}_i} + O(D^2)\\ &\approx \sum_{i = 2}^n \frac{{{\bf{\hat{a}}_i}^T} D\; {\bf{\hat{a}}_1}}{\lambda_1 - \lambda_i}{\bf{\hat{a}}_i} \end{align} My question is, where does this Taylor expansion come from?


Best Answer

This is essentially the same process as what is called time-independent perturbation theory in quantum mechanics. Provided the $a_i$ form an orthonormal basis, so $ a_i^T a_j = \delta_{ij} $, we can always write $$ b_1 = \alpha\left( a_1 + \sum_{j \neq 1} \beta_j a_j \right), \\ \mu_1 = \lambda_1 + \nu, $$ where $\mu_1$ is the eigenvalue of $B$ with eigenvector $b_1$, and $\alpha$ is chosen so that $b_1^Tb_1=1$. Then the eigenvalue equation for $B$ is $$ Bb_1 = \mu_1 b_1, $$ which, after cancelling the common factor of $\alpha$, becomes $$ (A+D)\Big(a_1 + \sum_{j \neq 1} \beta_j a_j \Big) = (\lambda_1+\nu)\Big(a_1 + \sum_{j \neq 1} \beta_j a_j \Big). $$ Using the eigenvalue equation $Aa_j = \lambda_j a_j$ and rearranging gives $$ Da_1 + \sum_{j \neq 1} \beta_j Da_j = \sum_{j \neq 1} (\lambda_1-\lambda_j)\beta_j a_j + \nu \Big(a_1 + \sum_{j \neq 1} \beta_j a_j \Big). $$

Now applying $a_1^T$ and using orthonormality gives $$ \nu = a_1^T D a_1 + \sum_{j \neq 1} \beta_j a_1^T Da_j, \tag{1} $$ while applying $a_k^T$ for $k \neq 1$ gives $$ (\lambda_1-\lambda_k+\nu)\beta_k = a_k^T Da_1 + \sum_{j \neq 1} \beta_j a_k^T Da_j. \tag{2} $$
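Since $(1)$ and $(2)$ hold exactly (not just to first order), they can be checked numerically. A minimal NumPy sketch; the random test matrices and perturbation size are my own assumptions, not from the question:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 5

# Assumed test case: symmetric A and a small symmetric perturbation D.
A = rng.standard_normal((n, n)); A = (A + A.T) / 2
D = 1e-2 * rng.standard_normal((n, n)); D = (D + D.T) / 2
B = A + D

lam, a = np.linalg.eigh(A)   # columns a[:, i] are orthonormal eigenvectors of A
mu, b = np.linalg.eigh(B)

# Coefficients of b_1 in the basis {a_i}: b_1 = alpha (a_1 + sum_{j!=1} beta_j a_j)
c = a.T @ b[:, 0]            # c_j = a_j^T b_1, so alpha = c_1
beta = c / c[0]              # beta_j = a_j^T b_1 / (a_1^T b_1); beta[0] = 1
nu = mu[0] - lam[0]

# Identity (1): nu = a_1^T D a_1 + sum_{j != 1} beta_j a_1^T D a_j
rhs1 = a[:, 0] @ D @ (a @ beta)    # a @ beta = a_1 + sum_{j != 1} beta_j a_j
print(abs(nu - rhs1))              # should be ~ machine precision

# Identity (2), for each k != 1:
# (lam_1 - lam_k + nu) beta_k = a_k^T D a_1 + sum_{j != 1} beta_j a_k^T D a_j
for k in range(1, n):
    lhs = (lam[0] - lam[k] + nu) * beta[k]
    rhs = a[:, k] @ D @ (a @ beta)
    print(abs(lhs - rhs))          # should be ~ machine precision
```

Note that the identities hold for any eigenpair of $B$ expanded in the eigenbasis of $A$, as long as $a_1^T b_1 \neq 0$, which is guaranteed for small enough $D$.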

So far, these expressions are exact. We suppose that the eigenvalues and eigenvectors vary smoothly with the matrix, so we can expand as follows: $$ \nu = 0+\epsilon \nu^{(1)} + O(\epsilon^2) \\ \alpha = 1 + \epsilon \alpha^{(1)} + O(\epsilon^2) \\ \beta_j = 0+\epsilon \beta_j^{(1)} + O(\epsilon^2), $$ which reflect the correct behaviour as $\epsilon=\lVert D \rVert \to 0$. Writing $D=\epsilon E$ and substituting these into $(1)$ and $(2)$ gives the first-order equations $$ \nu^{(1)} = a_1^T E a_1, \\ \beta_k^{(1)} = \frac{a_k^T Ea_1}{\lambda_1-\lambda_k}, $$ which is almost the expression you want, but we still need to find $\alpha^{(1)}$. The normalisation $b_1^Tb_1=1$ gives, using orthonormality, $$ 1 = \alpha^2 \Big( 1 + \sum_{j \neq 1} \beta_j^2 \Big) = \big(1+2\epsilon\alpha^{(1)}+O(\epsilon^2)\big)\big(1+O(\epsilon^2)\big) = 1+2\epsilon\alpha^{(1)}+O(\epsilon^2), $$ since each $\beta_j = O(\epsilon)$. Hence $\alpha^{(1)}=0$, and $\alpha$ does not affect the first-order term. Therefore $$ b_1 = a_1 + \sum_{j \neq 1} \frac{a_j^TDa_1}{\lambda_1-\lambda_j}a_j + O(\lVert D \rVert^2). $$ NB: this requires that $\lambda_1 \neq \lambda_j$ for all $j \neq 1$. If there are other eigenvectors with the same eigenvalue, the expansion is more complicated. The Wikipedia article on perturbation theory explains this, although it does use quantum mechanics notation.
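The first-order result can also be sanity-checked numerically: the residual between $b_1 - a_1$ and the first-order sum should shrink like $\lVert D \rVert^2$. A minimal NumPy sketch, again with an assumed random test case (note the sign of a computed eigenvector is arbitrary and must be aligned first):

```python
import numpy as np

rng = np.random.default_rng(1)
n = 5

# Assumed test case: symmetric A and a small symmetric perturbation D.
A = rng.standard_normal((n, n)); A = (A + A.T) / 2
eps = 1e-4
D = eps * rng.standard_normal((n, n)); D = (D + D.T) / 2

lam, a = np.linalg.eigh(A)
mu, b = np.linalg.eigh(A + D)

b1 = b[:, 0]
if b1 @ a[:, 0] < 0:        # align the arbitrary sign of the computed eigenvector
    b1 = -b1

# First-order prediction: b_1 - a_1 ~ sum_{j != 1} (a_j^T D a_1 / (lam_1 - lam_j)) a_j
pred = sum((a[:, j] @ D @ a[:, 0]) / (lam[0] - lam[j]) * a[:, j]
           for j in range(1, n))

err = np.linalg.norm((b1 - a[:, 0]) - pred)
print(err, np.linalg.norm(b1 - a[:, 0]))   # residual is much smaller than the shift itself
```

Halving $\epsilon$ should roughly quarter `err`, consistent with the $O(\lVert D \rVert^2)$ remainder.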