This seems like a trivial question but I am currently stuck and cannot see what I am doing wrong.
Consider a function $f : \mathbb{R}^d \rightarrow \mathbb{R}^d$.
I want to compute the derivative w.r.t. $x \in \mathbb{R}^d$ of an expression that contains a quadratic form of $f(x)$
$$I = f(x)^{\top} C f(x) . $$
Here $C$ is a $d\times d$ matrix.
Taking the derivative w.r.t. the vector $x$, I get
$$ \frac{\partial I}{\partial x} = 2C f(x) \cdot \nabla f(x), $$
where $\nabla f(x)$ denotes the Jacobian of $f$, which is a $d \times d$ matrix.
My problem is that the dimensions of the factors in this expression do not match. We have
- $C: d\times d$,
- $f(x): d\times 1$, and
- $\nabla f(x): d \times d$.
So the dimensions of the last two factors are incompatible: a $d \times 1$ vector cannot be multiplied on the right by a $d \times d$ matrix. What am I doing wrong? Is the correct derivative $$ \frac{\partial I}{\partial x} = \nabla f(x) \, 2 C f(x) , $$ or $$ \frac{\partial I}{\partial x} = ( 2 C f(x) )^{\top} \cdot \nabla f(x) \, ? $$
I found a related answer here: https://math.stackexchange.com/a/3128040/527323
Given a differentiable vector field $\mathrm f : \mathbb R^d \to \mathbb R^d$ and a matrix $\mathrm C \in \mathbb R^{d \times d}$, let the function $F : \mathbb R^d \to \mathbb R$ be defined by
$$F (\mathrm x) := \langle \mathrm f (\mathrm x), \mathrm C \mathrm f (\mathrm x) \rangle$$
whose directional derivative in the direction of $\mathrm y \in \mathbb R^d$ at $\mathrm x \in \mathbb R^d$ is
$$D_{\mathrm y} F (\mathrm x) := \lim_{h \to 0} \frac{F (\mathrm x + h \mathrm y) - F (\mathrm x)}{h} = \cdots = \langle \mathrm y, \mathrm J_{\mathrm f}^\top (\mathrm x) \, \mathrm C \, \mathrm f (\mathrm x) \rangle + \langle \mathrm J_{\mathrm f}^\top (\mathrm x) \, \mathrm C^\top \mathrm f (\mathrm x) , \mathrm y \rangle$$
where the matrix $\mathrm J_{\mathrm f} (\mathrm x)$ is the Jacobian of the vector field $\mathrm f$ at $\mathrm x \in \mathbb R^d$.
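One way to see where the two inner products come from, as a sketch that assumes only that $\mathrm f$ is differentiable at $\mathrm x$, so that $\mathrm f (\mathrm x + h \mathrm y) = \mathrm f (\mathrm x) + h \, \mathrm J_{\mathrm f} (\mathrm x) \, \mathrm y + o(h)$ (arguments $(\mathrm x)$ are dropped on the right-hand side for brevity):

$$\begin{aligned} F (\mathrm x + h \mathrm y) &= \langle \mathrm f + h \, \mathrm J_{\mathrm f} \mathrm y, \mathrm C \left( \mathrm f + h \, \mathrm J_{\mathrm f} \mathrm y \right) \rangle + o(h) \\ &= F (\mathrm x) + h \, \langle \mathrm J_{\mathrm f} \mathrm y, \mathrm C \mathrm f \rangle + h \, \langle \mathrm f, \mathrm C \mathrm J_{\mathrm f} \mathrm y \rangle + o(h) \\ &= F (\mathrm x) + h \, \langle \mathrm y, \mathrm J_{\mathrm f}^\top \mathrm C \, \mathrm f \rangle + h \, \langle \mathrm J_{\mathrm f}^\top \mathrm C^\top \mathrm f, \mathrm y \rangle + o(h) \end{aligned}$$

Subtracting $F (\mathrm x)$, dividing by $h$, and letting $h \to 0$ leaves exactly the two inner products in the directional derivative.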
Thus, the gradient of $F$ is
$$\nabla_{\mathrm x} F (\mathrm x) = \mathrm J_{\mathrm f}^\top (\mathrm x) \left( \mathrm C + \mathrm C^\top \right) \mathrm f (\mathrm x)$$
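A quick numerical sanity check of this gradient formula can be sketched with NumPy. Everything concrete here is an assumption chosen for illustration: the test map `f(x) = sin(x) + A x`, the random matrices `A` and `C`, the dimension `d = 4`, and the helper names. The check compares the closed-form gradient against central finite differences.

```python
import numpy as np

rng = np.random.default_rng(0)
d = 4
A = rng.standard_normal((d, d))  # hypothetical matrix, only used to build a test f
C = rng.standard_normal((d, d))  # deliberately not symmetric

def f(x):
    # Hypothetical smooth vector field R^d -> R^d, chosen for illustration.
    return np.sin(x) + A @ x

def jacobian_f(x):
    # Analytic Jacobian of the test f: J[i, j] = d f_i / d x_j.
    return np.diag(np.cos(x)) + A

def F(x):
    # Quadratic form F(x) = <f(x), C f(x)>.
    fx = f(x)
    return fx @ C @ fx

def grad_F(x):
    # Closed-form gradient: J_f(x)^T (C + C^T) f(x), a vector of length d.
    return jacobian_f(x).T @ (C + C.T) @ f(x)

def grad_F_fd(x, eps=1e-6):
    # Central finite-difference approximation of the gradient of F.
    g = np.zeros_like(x)
    for i in range(x.size):
        e = np.zeros_like(x)
        e[i] = eps
        g[i] = (F(x + e) - F(x - e)) / (2 * eps)
    return g

x0 = rng.standard_normal(d)
g_formula = grad_F(x0)
g_numeric = grad_F_fd(x0)
print(np.max(np.abs(g_formula - g_numeric)))  # should be small
```

The shapes also answer the dimension question: $\mathrm J_{\mathrm f}^\top$ is $d \times d$ and $(\mathrm C + \mathrm C^\top) \mathrm f (\mathrm x)$ is $d \times 1$, so the product is a $d \times 1$ column vector. When $\mathrm C$ is symmetric, the formula reduces to $2 \, \mathrm J_{\mathrm f}^\top (\mathrm x) \, \mathrm C \, \mathrm f (\mathrm x)$, which is the transpose of the row-vector expression $( 2 \mathrm C \mathrm f (\mathrm x) )^{\top} \mathrm J_{\mathrm f} (\mathrm x)$.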