Let $A=A^T\in \mathbb R^{k\times k}$ be a nonzero symmetric matrix and define $f:\mathbb R^k\to\mathbb R$ by $$f(x):=x^TAx.$$ Why is $df(x)\xi=2x^TA\xi$ for $x,\xi\in\mathbb R^k$?
derivative of a symmetric bilinear form (quadratic form version)
Asked by Bumbble Comm (https://math.techqa.club/user/bumbble-comm/detail). There are 2 answers below.
There's a nice answer linked by alexjo using coordinates. Here's an answer without coordinates, using the fact that we know the derivative of a linear map:
Consider $h(x,y) = x^T A y$, with $h: \mathbb{R}^k \times \mathbb{R}^k \rightarrow \mathbb{R}$.
We can restrict $h$ to each factor with $h(x,-) : \{x\} \times \mathbb{R}^k \rightarrow \mathbb{R}$ and $h(-,y): \mathbb{R}^k \times \{y\} \rightarrow \mathbb{R}$.
Then, by linearity of $dh_{x,y}$, $dh_{x,y} (\xi_x\oplus \xi_y) = dh_{x,y}(\xi_x \oplus 0) + dh_{x,y}(0\oplus \xi_y) = d(h(-,y))_x (\xi_x) + d(h(x,-))_y (\xi_y)$.
Because $A$ is symmetric, $h(-,y)(x) = x^T A y = y^T A x$, which is linear in $x$, so we get $d(h(-,y))_x(\xi_x) = y^T A \xi_x$.
Similarly, because $h(x,-)(y) = x^T A y$ is linear in $y$, we get $d(h(x,-))_y(\xi_y) = x^T A \xi_y$.
Thus $dh_{x,y}(\xi_x \oplus \xi_y) = y^T A \xi_x + x^T A \xi_y$.
Finally, we have $f = h \circ \Delta$, that is, $f(x) = h(x,x)$, for $\Delta: \mathbb{R}^k \rightarrow \mathbb{R}^k \times \mathbb{R}^k$ given by $\Delta(x) = (x,x)$.
Since $\Delta$ is linear, we have $d\Delta_x(\xi) = \xi \oplus \xi$.
Thus, by the chain rule, $df_x(\xi) = (dh_{x,x} \circ d\Delta_x)(\xi) = dh_{x,x}(\xi \oplus \xi) = x^T A \xi + x^T A \xi = 2 x^T A \xi$.
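The two formulas derived above can also be checked numerically. The following is only a sanity-check sketch, assuming NumPy is available; the helper names (`f`, `h`, `derivative_at_zero`) and the random test data are illustrative choices, not part of the answer. It compares central-difference estimates of the directional derivatives with $y^T A \xi_x + x^T A \xi_y$ and $2x^T A \xi$.

```python
import numpy as np

def f(A, x):
    """Quadratic form f(x) = x^T A x."""
    return x @ A @ x

def h(A, x, y):
    """Bilinear form h(x, y) = x^T A y."""
    return x @ A @ y

def derivative_at_zero(phi, eps=1e-6):
    """Central-difference estimate of phi'(0) for a scalar function phi(t)."""
    return (phi(eps) - phi(-eps)) / (2 * eps)

rng = np.random.default_rng(0)
k = 4  # arbitrary dimension chosen for the check
M = rng.standard_normal((k, k))
A = M + M.T  # symmetrize so that A = A^T, as in the question

# Random base points and directions (illustrative data only).
x, y, xi_x, xi_y, xi = rng.standard_normal((5, k))

# dh_{x,y}(xi_x ⊕ xi_y) versus y^T A xi_x + x^T A xi_y
dh_numeric = derivative_at_zero(lambda t: h(A, x + t * xi_x, y + t * xi_y))
dh_formula = y @ A @ xi_x + x @ A @ xi_y

# df_x(xi) versus 2 x^T A xi
df_numeric = derivative_at_zero(lambda t: f(A, x + t * xi))
df_formula = 2 * (x @ A @ xi)

print("dh:", dh_numeric, dh_formula)
print("df:", df_numeric, df_formula)
```

Because both maps are quadratic in $t$ along a line, the central difference agrees with the exact derivative up to floating-point rounding, so the printed pairs should match to many digits.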
It's because, using $\xi^TAx = (\xi^TAx)^T = x^TA^T\xi = x^TA\xi$ (symmetry of $A$) in the last step,
\begin{align}df(x)\xi &= \frac{d}{dt}|_{t = 0} f(x + t\xi)\\ &= \frac{d}{dt}|_{t = 0} (x + t\xi)^T A(x + t\xi)\\ &= \frac{d}{dt}|_{t = 0} (x^T + t\xi^T) A(x + t\xi)\\ &= \frac{d}{dt}|_{t = 0} (x^T + t\xi^T)(Ax + tA\xi)\\ &= \frac{d}{dt}|_{t = 0} (x^TAx + t(\xi^T Ax + x^TA\xi) + t^2\xi^TA\xi)\\ &= \xi^T Ax + x^TA\xi\\ &= 2x^TA\xi \end{align}
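As a concrete check, here is a small worked example; the matrix is an arbitrary illustrative choice, not taken from the question. Take $k=2$ and $A=\begin{pmatrix}1&2\\2&3\end{pmatrix}$, so that $x^TA = (x_1+2x_2,\ 2x_1+3x_2)$. Then
\begin{align}f(x) &= x_1^2 + 4x_1x_2 + 3x_2^2\\ df(x)\xi &= \frac{\partial f}{\partial x_1}\xi_1 + \frac{\partial f}{\partial x_2}\xi_2 = (2x_1+4x_2)\xi_1 + (4x_1+6x_2)\xi_2\\ 2x^TA\xi &= 2(x_1+2x_2)\xi_1 + 2(2x_1+3x_2)\xi_2 \end{align}
and the last two lines agree term by term.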