Question about derivative of scalar function of matrix wrt scalar

Can someone help me understand the meaning of this derivative, where I have a scalar that is a function of a matrix and I differentiate it with respect to another scalar? \begin{align} y = \mathrm{ln}|At+B|, \end{align} where $A$ and $B$ are matrices, $B$ is invertible, and $t$ and $y$ are scalars.

With $X = At+B$ the function becomes \begin{align} y = \mathrm{ln}|X|. \end{align} The differentials are \begin{align} dy &= d\ \mathrm{ln}|X|\cr &= \mathrm{Tr}(X^{-1} dX)\cr &= X^{-T}:dX\cr dX &= dA\,t+A\,dt+dB, \end{align} where the colon denotes the Frobenius inner product, $P:Q = \mathrm{Tr}(P^TQ)$.

Since I am only asking about $t$, we have $dA = 0$ and $dB = 0$, so $dX = A\,dt$. Substituting $dX$ into $dy$ leads to \begin{align} dy &= X^{-T}:dX\cr &= X^{-T}:A\,dt\cr &= A^TX^{-T}:dt. \end{align}

Therefore the gradient is \begin{align} \frac{dy}{dt} &= A^TX^{-T}\cr &= A^T(At+B)^{-T}, \end{align} where the answer is a matrix for any $t$. I was expecting a scalar as a result.

How can this answer be interpreted? Is there something wrong with my derivation?

Any help would be appreciated.

There is 1 best solution below

On BEST ANSWER

Thanks to @greg's comment, which pointed out my mistake, the correct answer is \begin{align} dy &= d\ \mathrm{ln}|X|\cr &= \mathrm{Tr}(X^{-1} dX)\cr &= \mathrm{Tr}(X^{-1} A\,dt)\cr &= \mathrm{Tr}(X^{-1} A)\,dt\cr \frac{dy}{dt} &= \mathrm{Tr}(X^{-1} A)\cr &= \mathrm{Tr}((At+B)^{-1} A). \end{align} Because $dt$ is a scalar, it factors out of the trace, so the derivative is a scalar, as expected.
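As a sanity check, the closed form $\mathrm{Tr}((At+B)^{-1}A)$ can be compared numerically against a central finite difference of $\mathrm{ln}|At+B|$. The sketch below is only an illustration: it assumes NumPy, and the matrix size, the random test matrices, the value of $t$, and the step size `h` are arbitrary choices that are not part of the question.

```python
import numpy as np

def log_abs_det(A, B, t):
    """y(t) = ln|det(A t + B)|, evaluated via slogdet to avoid overflow."""
    _, logabsdet = np.linalg.slogdet(A * t + B)
    return logabsdet

def dy_dt(A, B, t):
    """Closed form from the derivation: Tr((A t + B)^{-1} A)."""
    X = A * t + B
    # solve(X, A) gives X^{-1} A without forming the inverse explicitly
    return np.trace(np.linalg.solve(X, A))

# Arbitrary test data (assumption): B is built to be symmetric positive
# definite so that X = A t + B stays invertible for small t.
rng = np.random.default_rng(0)
n = 4
A = rng.standard_normal((n, n))
M = rng.standard_normal((n, n))
B = M @ M.T + np.eye(n)
t, h = 0.1, 1e-6

analytic = dy_dt(A, B, t)
numeric = (log_abs_det(A, B, t + h) - log_abs_det(A, B, t - h)) / (2 * h)

# The Frobenius product X^{-T} : A is the same scalar as Tr(X^{-1} A)
X = A * t + B
frobenius = np.sum(np.linalg.inv(X).T * A)

print(analytic, numeric, frobenius)
```

All three printed numbers should agree up to finite-difference and rounding error, which matches the point of the answer: $X^{-T}:A$ is already a scalar, so no matrix-valued derivative appears.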