Partial derivative of matrix w.r.t. its eigenvalue and eigenvector


Standard algorithms have been proposed to compute the partial derivative of eigenvalue and eigenvector w.r.t. the matrix, e.g. http://www.win.tue.nl/casa/meetings/seminar/previous/_abstract051019_files/Presentation.pdf.

However, as far as I know, no solution has been derived for the partial derivative of the matrix w.r.t. its eigenvalue and eigenvector. Suppose the matrix $A\in \mathbb{R}^{n \times n}$ has eigenvalue $\lambda\in \mathbb{R}$ and eigenvector $X\in \mathbb{R}^{n}$, i.e. $AX = \lambda X$. The problem is to compute $\frac{\partial A}{\partial \lambda}$ and $\frac{\partial A}{\partial X}$. Does anyone have some ideas? Thanks for your help!


BEST ANSWER

@bbl, you are mixing these concepts. First, an eigenvalue or a unit eigenvector of $A=[a_{i,j}]$ is a function of the entries $a_{i,j}$:

$f:A \rightarrow \lambda$ and $g:A\rightarrow x_{\lambda}$

and $Df_A$ or $Dg_A$ (when they exist!) are the derivatives of these functions, and absolutely not PARTIAL derivatives.

On the other hand, when $A$ is diagonalizable, then (cf. Victor's post) $A$ is a function of $\Lambda$ and $X$ (the columns of $X$ are assumed to be unit-norm and linearly independent; the pair $(\Lambda,X)$ therefore depends on $n^2$ parameters). Finally, under some hypotheses and precautions, we can consider PARTIAL derivatives with respect to $\Lambda$ or $X$.

Victor gives the first partial derivative in a pretty formula; I prefer the following form:

$ \dfrac{\partial A}{\partial \lambda_i} = X_i [(X^{-T})_i]^T$.


Assuming the matrix has a full set of eigenvalues and eigenvectors, you can write $A = X\Lambda X^{-1}$, where $X$ is the (columnwise) matrix of eigenvectors and $\Lambda$ is the diagonal matrix of eigenvalues. In this form, the derivative with respect to an eigenvalue is easy to see: $$ \frac{\partial A}{\partial \lambda_i} = X_i [(X^{-1})^T_i]^T$$ which is the rank-1 outer product of the $i$-th right eigenvector (the $i$-th column of $X$) with the corresponding (suitably normalized) left eigenvector (the $i$-th row of $X^{-1}$).
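The rank-1 formula is easy to check numerically: perturb $\lambda_i$, rebuild $A$ from the decomposition, and compare the finite difference against the outer product. A minimal NumPy sketch (using a symmetric matrix so the eigendecomposition is real; the variable names are my own):

```python
import numpy as np

rng = np.random.default_rng(0)
n = 4
A = rng.standard_normal((n, n))
A = A + A.T  # symmetric => real eigenvalues/eigenvectors

# Eigendecomposition A = X Lam X^{-1}
lam, X = np.linalg.eigh(A)
Xinv = np.linalg.inv(X)

i = 1  # which eigenvalue to perturb

# Analytic formula: dA/dlam_i = outer product of the i-th column of X
# with the i-th row of X^{-1}
analytic = np.outer(X[:, i], Xinv[i, :])

# Finite-difference check: rebuild A from a perturbed eigenvalue
eps = 1e-6
lam_pert = lam.copy()
lam_pert[i] += eps
A_pert = X @ np.diag(lam_pert) @ Xinv
numeric = (A_pert - A) / eps

print(np.max(np.abs(analytic - numeric)))  # agrees to roundoff
```

Since $A$ is linear in each $\lambda_i$, the finite difference here is exact up to floating-point error, whatever the step size.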

With regards to the derivative with respect to the eigenvector, what do you expect the format of the result to be in, a rank-3 tensor?


$\def\L{\left}\def\R{\right}$Since no one has yet addressed it, here is the second of the requested gradients.

Using the same assumptions as Victor, calculate the differential with respect to $X$ $$\eqalign{ A &= X\Lambda X^{-1} \\ dA &= dX\Lambda X^{-1} + X\Lambda\, dX^{-1} \\ &= dX\Lambda X^{-1} - X\Lambda X^{-1}\,dX\,X^{-1} \\ &= I_n\,dX\Lambda X^{-1} - A\,dX\,X^{-1} \\ }$$ where the second line uses $dX^{-1} = -X^{-1}\,dX\,X^{-1}$. Vectorizing both sides of this expression yields the desired gradient $$\eqalign{ da &= \Big((\Lambda X^{-1})^T\otimes I_n - X^{-T}\otimes A\Big)\,dx \\ &= \L(X^{-1}\otimes I_n\R)^T\L(\Lambda\otimes I_n-I_n\otimes A\R)dx \\ \frac{\partial a}{\partial x} &= \L(X^{-1}\otimes I_n\R)^T \L(\Lambda\otimes I_n - I_n\otimes A\R) \\ }$$
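This Kronecker-product Jacobian can also be checked by finite differences. Note that the vectorization above uses the column-major convention $\mathrm{vec}(ABC) = (C^T\otimes A)\,\mathrm{vec}(B)$, so NumPy arrays must be flattened with `order="F"`. A sketch under the same diagonalizability assumptions (all names are mine):

```python
import numpy as np

rng = np.random.default_rng(1)
n = 3
A0 = rng.standard_normal((n, n))
A0 = A0 + A0.T  # symmetric => real eigendecomposition
lam, X = np.linalg.eigh(A0)
Lam = np.diag(lam)
Xinv = np.linalg.inv(X)
A = X @ Lam @ Xinv
I = np.eye(n)

# Analytic Jacobian: (X^{-1} (x) I)^T (Lam (x) I - I (x) A)
J = np.kron(Xinv, I).T @ (np.kron(Lam, I) - np.kron(I, A))

# Finite-difference Jacobian, column by column of vec(X)
eps = 1e-6
J_fd = np.zeros((n * n, n * n))
for k in range(n * n):
    dx = np.zeros(n * n)
    dx[k] = eps
    Xp = X + dx.reshape(n, n, order="F")
    Ap = Xp @ Lam @ np.linalg.inv(Xp)
    J_fd[:, k] = (Ap - A).flatten(order="F") / eps

print(np.max(np.abs(J - J_fd)))  # small: forward differences are O(eps)-accurate
```

The map $X \mapsto X\Lambda X^{-1}$ is nonlinear in $X$, so the finite-difference check only matches to first order in the step size, unlike the eigenvalue case.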