I am encountering matrix calculus for the first time, and I'm completely lost on the following problem:
I'm trying to find the derivative with respect to $\lambda$ of the left-hand side of $$ (x-\mu)^T F (x - \mu) -1 =0 $$ where $x = (I+2\lambda F )^{-1} (y + 2 \lambda F \mu)$. Substituting $x$ turns the equation into the following:
$$((I+2\lambda F )^{-1} (y + 2 \lambda F \mu) - \mu ) ^T F ((I+2\lambda F )^{-1} (y + 2 \lambda F \mu) - \mu ) -1 = 0$$
My intuition tells me that the gradient is just the following (extrapolating from the derivative of the quadratic form with respect to $x$):
$$ \nabla f(\lambda) = 2 ((I+2\lambda F )^{-1} (y + 2 \lambda F \mu) - \mu ) ^T F $$
Here $\lambda$ is a scalar, $\mu$ and $y$ are vectors, and $F \succcurlyeq 0 $.
This gradient seems to be incorrect. Can someone shed light on how to approach this problem?
$ \def\x{(x-\mu)} \def\o{{\tt1}} \def\d{\dot} \def\A{A^{-\o}} \def\AD{{\d A}^{-\o}} \def\a{\alpha}\def\b{\beta}\def\l{\lambda} \def\qiq{\quad\implies\quad} \def\g#1#2{\frac{d #1}{d #2}} $Use a dot to denote derivatives with respect to $\l$ and note the following rules
$$\eqalign{ \g{(Ab)}{\l} &= \d Ab + A\d b \qquad&\big({\rm derivative\:of\:a\:product}\big) \\ \d c &= 0 \qquad&\big({\rm derivative\:of\:a\:constant}\big) \\ \d\l &= \o \\ }$$

The derivative of a matrix inverse is tricky, but follows directly from these rules
$$\eqalign{ I &= A\A \qquad&\big({\rm a\:matrix\:product}\big) \\ 0 &= \d A\A + A\,\AD \qquad&\big(I\:{\rm is\:a\:constant}\big) \\ \AD &= -\A\d A\A \qquad&\big({\rm solve\:for\:}\AD\big) \\ }$$

For typing convenience, define the variables
$$\eqalign{ A &= I+2\l F,\qquad &\d A = 2F \\ b &= y+2\l F\mu,\qquad &\d b = 2F\mu \qquad\qquad\quad \\ }$$

Now we can differentiate $x$
$$\eqalign{ x &= {\A b} \\ \d x &= \A\d b - \A\d A\A b \\ &= \A(2F\mu) - \A(2F){\A b} \qquad\qquad \\ &= 2\A F(\mu-x) \\ }$$

Differentiating the main function yields
$$\eqalign{ f &= \x^T F\x - \o \\ \d f &= \d x^T F\x + \x^T F\d x \qquad\qquad\quad \\ }$$

Assuming that $F$ is symmetric, this can be simplified to
$$\eqalign{ \d f &= 2\,\x^T F\d x \\ &= 4\,\x^T F\A F(\mu-x) \qquad\qquad\quad \\ }$$
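A quick way to gain confidence in a result like this is to compare it against a finite-difference approximation. The following is only a minimal sketch in Python with NumPy, not part of the derivation itself: the function names (`x_of`, `f`, `df_analytic`), the random test data, the value of $\lambda$, and the step size `h` are all illustrative assumptions, and $F$ is built as $MM^T$ so that it is symmetric positive semidefinite.

```python
import numpy as np

def x_of(lam, F, y, mu):
    """x(lambda) = (I + 2*lambda*F)^{-1} (y + 2*lambda*F*mu)."""
    A = np.eye(F.shape[0]) + 2.0 * lam * F
    b = y + 2.0 * lam * (F @ mu)
    return np.linalg.solve(A, b)  # solve A x = b instead of forming the inverse

def f(lam, F, y, mu):
    """f(lambda) = (x - mu)^T F (x - mu) - 1."""
    r = x_of(lam, F, y, mu) - mu
    return r @ F @ r - 1.0

def df_analytic(lam, F, y, mu):
    """Derivative from the answer: 4 (x - mu)^T F A^{-1} F (mu - x)."""
    A = np.eye(F.shape[0]) + 2.0 * lam * F
    r = x_of(lam, F, y, mu) - mu
    return 4.0 * (r @ F @ np.linalg.solve(A, F @ (-r)))

# Hypothetical test data (assumption: any symmetric PSD F would do).
rng = np.random.default_rng(0)
n = 4
M = rng.standard_normal((n, n))
F = M @ M.T                      # symmetric positive semidefinite
y = rng.standard_normal(n)
mu = rng.standard_normal(n)
lam = 0.3

# Central finite difference in lambda.
h = 1e-6
df_numeric = (f(lam + h, F, y, mu) - f(lam - h, F, y, mu)) / (2.0 * h)

print("analytic:", df_analytic(lam, F, y, mu))
print("numeric: ", df_numeric)
```

The two printed values should agree to several significant digits. For $\lambda \ge 0$ and symmetric $F \succcurlyeq 0$, the analytic value is also never positive, since it equals $-4\,(F(x-\mu))^T A^{-1} (F(x-\mu))$ with $A^{-1}$ positive definite.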