derivative of a symmetric bilinear form (quadratic form version)

Let $A=A^T\in \mathbb R^{k\times k}$ be a nonzero symmetric matrix and define $f:\mathbb R^k\to\mathbb R$ by $$f(x):=x^TAx.$$ Why is $df(x)\xi=2x^TA\xi$ for all $x,\xi\in\mathbb R^k$?


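It's because the directional derivative can be expanded directly:

\begin{align}df(x)\xi &= \frac{d}{dt}|_{t = 0} f(x + t\xi)\\ &= \frac{d}{dt}|_{t = 0} (x + t\xi)^T A(x + t\xi)\\ &= \frac{d}{dt}|_{t = 0} (x^T + t\xi^T) A(x + t\xi)\\ &= \frac{d}{dt}|_{t = 0} (x^T + t\xi^T)(Ax + tA\xi)\\ &= \frac{d}{dt}|_{t = 0} (x^TAx + t(\xi^T Ax + x^TA\xi) + t^2\xi^TA\xi)\\ &= \xi^T Ax + x^TA\xi\\ &= 2x^TA\xi \end{align}

The last step uses $\xi^TAx = (\xi^TAx)^T = x^TA^T\xi = x^TA\xi$, which holds because $\xi^TAx$ is a scalar and $A$ is symmetric.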
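If you want to sanity-check the formula numerically, here is a minimal sketch in plain Python. The matrix `A`, the vectors `x` and `xi`, the step size `t`, and all helper names are illustrative choices of mine, not part of the question. It compares a central-difference estimate of $\frac{d}{dt}|_{t=0} f(x+t\xi)$ with $2x^TA\xi$.

```python
def bilinear(A, x, y):
    """Return x^T A y for a square matrix A given as nested lists."""
    n = len(x)
    return sum(x[i] * A[i][j] * y[j] for i in range(n) for j in range(n))


def quad_form(A, x):
    """Return f(x) = x^T A x."""
    return bilinear(A, x, x)


def directional_derivative(A, x, xi, t=1e-3):
    """Central-difference estimate of d/dt f(x + t*xi) at t = 0."""
    x_plus = [a + t * b for a, b in zip(x, xi)]
    x_minus = [a - t * b for a, b in zip(x, xi)]
    return (quad_form(A, x_plus) - quad_form(A, x_minus)) / (2 * t)


# Illustrative data (assumed for this example): A is symmetric.
A = [[2.0, 1.0, 0.0],
     [1.0, 3.0, -1.0],
     [0.0, -1.0, 4.0]]
x = [1.0, -2.0, 0.5]
xi = [0.3, 0.1, -0.7]

numeric = directional_derivative(A, x, xi)
closed_form = 2 * bilinear(A, x, xi)

# The two values should agree up to floating-point rounding.
print(numeric, closed_form)
```

Because $f$ is quadratic in $t$ along the line $x+t\xi$, the $t^2$ terms cancel in the central difference, so the two numbers should match up to rounding rather than only approximately.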


There's a nice answer linked by alexjo using coordinates. Here's an answer without coordinates, using the fact that we know the derivative of a linear map:

Consider $h(x,y) = x^T A y$, with $h: \mathbb{R}^k \times \mathbb{R}^k \rightarrow \mathbb{R}$.

We can restrict $h$ to each factor with $h(x,-) : \{x\} \times \mathbb{R}^k \rightarrow \mathbb{R}$ and $h(-,y): \mathbb{R}^k \times \{y\} \rightarrow \mathbb{R}$.

Then, by linearity of $dh_{x,y}$, $dh_{x,y} (\xi_x\oplus \xi_y) = dh_{x,y}(\xi_x \oplus 0) + dh_{x,y}(0\oplus \xi_y) = d(h(-,y))_x (\xi_x) + d(h(x,-))_y (\xi_y)$.


Because $h(-,y)(x) = x^T A y = y^T A x$ is linear in $x$, we get $d(h(-,y))_x(\xi_x) = y^T A \xi_x$.

Similarly, because $h(x,-)(y) = x^T A y$ is linear in $y$, we get $d(h(x,-))_y(\xi_y) = x^T A \xi_y$.

Thus $dh_{x,y}(\xi_x \oplus \xi_y) = y^T A \xi_x + x^T A \xi_y$.


Finally, we have $f(x) = h(x,x)$, that is, $f = h \circ \Delta$ for $\Delta: \mathbb{R}^k \rightarrow \mathbb{R}^k \times \mathbb{R}^k$ given by $\Delta(x) = (x,x)$.

Since $\Delta$ is linear, we have $d\Delta_x(\xi) = \xi \oplus \xi$.

Thus, by the chain rule, $df_x(\xi) = (dh_{x,x} \circ d\Delta_x)(\xi) = dh_{x,x}(\xi \oplus \xi) = x^T A \xi + x^T A \xi = 2 x^T A \xi$.
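The coordinate-free argument can also be checked numerically. The sketch below is only an illustration: the symmetric matrix, the test vectors, and the helper names `dh_numeric` and `dh_formula` are assumptions of mine. It compares a central-difference estimate of $dh_{x,y}(\xi_x \oplus \xi_y)$ with $y^T A \xi_x + x^T A \xi_y$, then evaluates that formula on the diagonal to recover $2x^TA\xi$.

```python
def bilinear(A, x, y):
    """Return h(x, y) = x^T A y for a square matrix A given as nested lists."""
    n = len(x)
    return sum(x[i] * A[i][j] * y[j] for i in range(n) for j in range(n))


def dh_numeric(A, x, y, xi_x, xi_y, t=1e-3):
    """Central-difference estimate of dh at (x, y) in the direction (xi_x, xi_y)."""
    xp = [a + t * b for a, b in zip(x, xi_x)]
    yp = [a + t * b for a, b in zip(y, xi_y)]
    xm = [a - t * b for a, b in zip(x, xi_x)]
    ym = [a - t * b for a, b in zip(y, xi_y)]
    return (bilinear(A, xp, yp) - bilinear(A, xm, ym)) / (2 * t)


def dh_formula(A, x, y, xi_x, xi_y):
    """y^T A xi_x + x^T A xi_y; the first term relies on A being symmetric."""
    return bilinear(A, y, xi_x) + bilinear(A, x, xi_y)


# Illustrative data (assumed for this example): A is symmetric.
A = [[1.0, 2.0],
     [2.0, -3.0]]
x = [1.0, 2.0]
y = [-1.0, 0.5]
xi_x = [0.4, -0.2]
xi_y = [1.5, 0.25]

dh_estimate = dh_numeric(A, x, y, xi_x, xi_y)
dh_exact = dh_formula(A, x, y, xi_x, xi_y)

# Chain rule on the diagonal: df_x(xi) = dh_{x,x}(xi, xi).
xi = [0.7, -1.1]
df_chain = dh_formula(A, x, x, xi, xi)
df_closed = 2 * bilinear(A, x, xi)

print(dh_estimate, dh_exact)
print(df_chain, df_closed)
```

Setting `y = x` and `xi_x = xi_y = xi` is exactly the composition with $d\Delta_x$, which is why the two terms of `dh_formula` collapse into $2x^TA\xi$.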