Derivative of matrices involving transpose and inverse


I have an equation which looks like this: $$Z = [{A(A^TA + \lambda I)^{-1}A^TB - B}]^T[{A(A^TA + \lambda I)^{-1}A^TB - B}]$$ Here, $\lambda$ is a scalar ($\lambda > 0$) and $I$ is the identity matrix, so that
$$\mathbf{\lambda I}=\begin{bmatrix} \lambda & 0\\ 0 & \lambda\\ \end{bmatrix}$$ I want to find $\frac{\partial Z}{\partial \lambda}$ in order to prove that if $\lambda_1 \geq \lambda_2$ then $Z_1 \geq Z_2$.

Here is what I tried:
$\frac{\partial Z}{\partial \lambda} = \frac{\partial}{\partial \lambda} [{A(A^TA + \lambda I)^{-1}A^TB - B}]^T[{A(A^TA + \lambda I)^{-1}A^TB - B}]$

$\frac{\partial Z}{\partial \lambda} = 2\frac{\partial}{\partial \lambda} [{A(A^TA + \lambda I)^{-1}A^TB - B}]$ ...($\frac{\partial X^TX}{\partial X}$ = 2X)

$\frac{\partial Z}{\partial \lambda} = 2[\frac{\partial}{\partial \lambda} (A) * [(A^TA + \lambda I)^{-1}A^TB] + A[\frac{\partial}{\partial \lambda} (A^TA + \lambda I)^{-1} * (A^TB) + (A^TA + \lambda I)^{-1} * \frac{\partial}{\partial \lambda}A^TB] - \frac{\partial}{\partial \lambda}B]$

...($\frac{\partial MN}{\partial X}$ = MN' + M'N)

$\frac{\partial Z}{\partial \lambda} = 2[0 * [(A^TA + \lambda I)^{-1}A^TB] + A[\frac{\partial}{\partial \lambda} (A^TA + \lambda I)^{-1} * (A^TB) + (A^TA + \lambda I)^{-1} * 0]- 0]$ ...($\frac{\partial A}{\partial \lambda}$ = 0, $\frac{\partial B}{\partial \lambda}$ = 0, $\frac{\partial A^TB}{\partial \lambda}$ = 0)

$\frac{\partial Z}{\partial \lambda} = 2A\frac{\partial}{\partial \lambda} (A^TA + \lambda I)^{-1} * (A^TB)$

$\frac{\partial Z}{\partial \lambda} = -2A(A^TA + \lambda I)^{-1}\frac{\partial}{\partial \lambda} (A^TA + \lambda I) * (A^TA + \lambda I)^{-1}*(A^TB)$ ...($\frac{\partial M^{-1}}{\partial X} = -M^{-1}M'M^{-1}$)

$\frac{\partial Z}{\partial \lambda} = -2A(A^TA + \lambda I)^{-1}I(A^TA + \lambda I)^{-1}(A^TB)$ ...($\frac{\partial (A^TA + \lambda I)}{\partial \lambda} = I$)

I am unsure of this result: I expected the derivative to be positive, which is what the statement above requires.

There are 1 best solutions below

The only thing that you need to know here is that

$$\dfrac{d}{d\lambda}(A^TA+\lambda I)^{-1}=-(A^TA+\lambda I)^{-2}.$$
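For completeness, this identity follows from the general rule for differentiating an inverse, $\frac{d}{d\lambda}M^{-1} = -M^{-1}\frac{dM}{d\lambda}M^{-1}$. A short sketch, where $M(\lambda) = A^TA + \lambda I$ is shorthand introduced only for this note:

$$\dfrac{d}{d\lambda}M^{-1} = -M^{-1}\,\dfrac{dM}{d\lambda}\,M^{-1} = -M^{-1}\,I\,M^{-1} = -M^{-2}.$$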

Therefore,

$$\dfrac{dZ}{d\lambda}=-[A(A^TA + \lambda I)^{-2}A^TB]^T[A(A^TA + \lambda I)^{-1}A^TB - B]-[A(A^TA + \lambda I)^{-1}A^TB - B]^TA(A^TA + \lambda I)^{-2}A^TB.$$
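The expression above is the product rule applied to $Z = P^TP$. As a sketch, with $P$ used as shorthand (not in the original post) for the residual term:

$$P(\lambda) = A(A^TA + \lambda I)^{-1}A^TB - B,\qquad \dfrac{dP}{d\lambda} = -A(A^TA + \lambda I)^{-2}A^TB,\qquad \dfrac{dZ}{d\lambda} = \left(\dfrac{dP}{d\lambda}\right)^TP + P^T\,\dfrac{dP}{d\lambda}.$$

In particular, the derivative of $P^TP$ with respect to the scalar $\lambda$ is not $2\,\frac{dP}{d\lambda}$; the rule $\frac{\partial X^TX}{\partial X} = 2X$ does not apply here, which is where the attempt in the question goes astray.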

Expanding that yields

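$$\begin{array}{rcl}\dfrac{dZ}{d\lambda}&=&2B^TA(A^TA + \lambda I)^{-2}A^TB-B^TA(A^TA + \lambda I)^{-2}A^TA(A^TA + \lambda I)^{-1}A^TB-B^TA(A^TA + \lambda I)^{-1}A^TA(A^TA + \lambda I)^{-2}A^TB\\
&=&2B^TA(A^TA + \lambda I)^{-2}A^TB-2B^TA(A^TA + \lambda I)^{-2}A^TB+2\lambda B^TA(A^TA + \lambda I)^{-3}A^TB\\
&=&2\lambda B^TA(A^TA + \lambda I)^{-3}A^TB
\end{array}$$

which is positive semidefinite, since $\lambda > 0$ and $(A^TA + \lambda I)^{-3}$ is positive definite. The second equality uses $A^TA = (A^TA + \lambda I) - \lambda I$. Hence $Z$ is nondecreasing in $\lambda$ in the positive semidefinite order.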
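As a sanity check, the closed form $2\lambda B^TA(A^TA + \lambda I)^{-3}A^TB$ can be compared against a finite-difference derivative of $Z$. The following is only a minimal NumPy sketch: the matrix sizes, the random test data, the value of $\lambda$, and the helper names `residual_energy` and `dZ_closed_form` are all assumptions made for illustration, not part of the original post.

```python
import numpy as np

def residual_energy(A, B, lam):
    """Z(lam) = P^T P with P = A (A^T A + lam I)^{-1} A^T B - B."""
    n = A.shape[1]
    M = A.T @ A + lam * np.eye(n)
    P = A @ np.linalg.solve(M, A.T @ B) - B
    return P.T @ P

def dZ_closed_form(A, B, lam):
    """Closed form 2 * lam * B^T A (A^T A + lam I)^{-3} A^T B."""
    n = A.shape[1]
    M = A.T @ A + lam * np.eye(n)
    v = A.T @ B
    # Apply M^{-1} three times with linear solves instead of forming an inverse.
    w = np.linalg.solve(M, np.linalg.solve(M, np.linalg.solve(M, v)))
    return 2.0 * lam * (v.T @ w)

# Hypothetical test data: A has 2 columns to match the 2x2 lambda*I in the question.
rng = np.random.default_rng(0)
A = rng.standard_normal((6, 2))
B = rng.standard_normal((6, 3))
lam, h = 0.7, 1e-6

# Central finite difference of Z with respect to lambda.
fd = (residual_energy(A, B, lam + h) - residual_energy(A, B, lam - h)) / (2 * h)
exact = dZ_closed_form(A, B, lam)
max_abs_error = np.max(np.abs(fd - exact))

# Smallest eigenvalue of the (symmetrized) derivative; it should not be
# meaningfully negative if the derivative is positive semidefinite.
min_eig = np.linalg.eigvalsh((exact + exact.T) / 2).min()

# Monotonicity check: Z(lam1) - Z(lam2) for lam1 >= lam2.
diff = residual_energy(A, B, 2.0) - residual_energy(A, B, 0.5)
min_eig_diff = np.linalg.eigvalsh((diff + diff.T) / 2).min()

print(max_abs_error, min_eig, min_eig_diff)
```

If the derivation holds, `max_abs_error` should be tiny and both smallest eigenvalues should be nonnegative up to floating-point roundoff. The linear solves are used in place of an explicit inverse purely as a numerical-stability choice.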