About the partial derivative with respect to a scalar


I have a function $L(\alpha)=\frac{1}{2}\log |\Phi(\alpha I)\Phi^{\rm T}+\beta I|$, where $\Phi$ is a matrix and $\alpha$ and $\beta$ are scalars. I want to compute $\frac{\partial L(\alpha)}{\partial \alpha}$, so I tried applying the chain rule as follows.

First, set $X(\alpha)=\Phi(\alpha I)\Phi^{\rm T}+\beta I$, so that $$ \frac{\partial L(\alpha)}{\partial \alpha}=\frac{\partial L(\alpha)}{\partial |X(\alpha)|}\frac {\partial |X(\alpha)|}{\partial X(\alpha)}\frac{\partial X(\alpha)}{\partial \alpha} $$ Next, I use the result from the Matrix Cookbook (page 9), $$ \frac{\partial \det(X)}{\partial X}=\det(X)(X^{-1})^{\rm T} $$ which gives $$ \frac{\partial L(\alpha)}{\partial \alpha}=\frac{1}{2}[(\Phi(\alpha I)\Phi^{\rm T}+\beta I)^{-1}]^{\rm T}\Phi\Phi^{\rm T} $$ However, this contradicts the earlier result in the Cookbook (page 8), which is

$$ \frac{\partial \det(Y)}{\partial x}=\det(Y)\,{\rm Tr}\left[Y^{-1}\frac{\partial Y}{\partial x}\right] $$

My classmate told me to use the second rule, but I still don't understand why it is the right one to use, or why the chain rule above does not lead to the same result.

Could anybody help me with this?


There is 1 best solution below

On BEST ANSWER

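For ease of typing, let $$\eqalign{ X &= \alpha \Phi\Phi^T + \beta I \,\,\implies dX = \Phi\Phi^T\,d\alpha \cr \lambda &= 2L \cr }$$ and let a colon denote the trace (Frobenius) product, $A:B = {\rm tr}(A^TB) = \sum_{ij}A_{ij}B_{ij}$.

Then we calculate the function, differential, and gradient as $$\eqalign{ \lambda &= \log(\det(X)) \cr\cr d\lambda &= d\log(\det(X)) \cr &= X^{-T}:dX \cr &= X^{-T}:\Phi\Phi^T\,d\alpha \cr\cr \frac{\partial\lambda}{\partial\alpha} &= X^{-T}:\Phi\Phi^T \cr &= {\rm tr}(X^{-1}\Phi\Phi^T) \cr }$$ This contraction is what the chain rule in the question is missing: the factors $\frac{\partial |X|}{\partial X}$ and $\frac{\partial X}{\partial\alpha}$ must be summed over all matrix entries, not multiplied as matrices, so the result is a scalar rather than a matrix.

Or, in terms of the original variables $$\eqalign{ \frac{\partial L}{\partial\alpha} &= \frac{1}{2}{\rm tr}\Big((\alpha \Phi\Phi^T + \beta I)^{-1}\Phi\Phi^T\Big) \cr }$$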
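The trace formula can be sanity-checked numerically. The following is a minimal sketch, not part of the original answer: it assumes NumPy, a randomly generated $\Phi$, and arbitrarily chosen values of $\alpha$ and $\beta$, and the function names `L` and `dL_dalpha` are made up for illustration. It compares the analytic derivative against a central finite difference of $L(\alpha)$.

```python
import numpy as np

def L(alpha, Phi, beta):
    """L(alpha) = 0.5 * log det(alpha * Phi Phi^T + beta * I)."""
    n = Phi.shape[0]
    X = alpha * Phi @ Phi.T + beta * np.eye(n)
    # slogdet is numerically safer than log(det(X)); X is positive definite here
    _, logdet = np.linalg.slogdet(X)
    return 0.5 * logdet

def dL_dalpha(alpha, Phi, beta):
    """Analytic derivative: 0.5 * tr(X^{-1} Phi Phi^T)."""
    n = Phi.shape[0]
    G = Phi @ Phi.T
    X = alpha * G + beta * np.eye(n)
    # solve(X, G) computes X^{-1} G without forming the inverse explicitly
    return 0.5 * np.trace(np.linalg.solve(X, G))

# Illustrative values only (assumptions, not taken from the question)
rng = np.random.default_rng(0)
Phi = rng.standard_normal((4, 6))
alpha, beta, h = 0.7, 1.3, 1e-6

# Central finite difference approximation of dL/dalpha
numeric = (L(alpha + h, Phi, beta) - L(alpha - h, Phi, beta)) / (2 * h)
analytic = dL_dalpha(alpha, Phi, beta)

print("finite difference:", numeric)
print("trace formula:    ", analytic)
```

The two printed values should agree to several significant digits. Note that both are scalars, as expected for the derivative of a scalar function with respect to a scalar, whereas the expression obtained in the question is a matrix.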