derivative of square root of quadratic form with respect to matrix


How would I go about calculating $\frac{d\alpha}{d\boldsymbol{A}}$ for $$\alpha = \sqrt{\boldsymbol{x}^{\intercal}\boldsymbol{A}\boldsymbol{x}}$$ where $\alpha$ is a scalar, $\boldsymbol{x}\in\mathbb{R}^{n}$, and $\boldsymbol{A}\in\mathbb{R}^{n\times n}$?

Sorry if this question is straightforward. I'm trying to implement an algorithm and came across this equation. I'm not familiar with matrix and vector derivatives. Also, any links to a comprehensive introduction to matrix/vector calculus would be appreciated.


There are 2 best solutions below

0
On BEST ANSWER

$x^\top A x$ can be written as $\sum_i \sum_j a_{ij} x_i x_j$. In this form, it is not hard to see what the partial derivative with respect to $a_{ij}$ is for any $i,j$. Then $\frac{d}{dA}(x^\top A x)$ can be viewed as a matrix consisting of all such partial derivatives.

To deal with $\sqrt{x^\top A x}$ you can just use the usual chain rule with the map $z \mapsto \sqrt{z}$.
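
Spelled out as a sketch, treating the entries of $A$ as independent variables and assuming $x^\top A x > 0$ so that the square root is differentiable: $$\frac{\partial}{\partial a_{ij}} \sum_k \sum_l a_{kl} x_k x_l = x_i x_j \quad\Longrightarrow\quad \frac{d}{dA}\left(x^\top A x\right) = x x^\top,$$ and the chain rule with $\frac{d}{dz}\sqrt{z} = \frac{1}{2\sqrt{z}}$ then gives $$\frac{d}{dA}\sqrt{x^\top A x} = \frac{1}{2\sqrt{x^\top A x}}\, x x^\top.$$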

2
On

Let us denote the Frobenius product by a colon, $A : B := {\rm Tr}\left( A^T B \right)$, and use its cyclic property \begin{align} A: BC &= {\rm Tr}\left( A^T B C \right) \\ &= AC^T: B \end{align}

So, \begin{align} \alpha = \sqrt{x^T A x} \Longleftrightarrow \quad \alpha^2 = x^T A x \equiv x: Ax. \end{align}

Now we can take differentials and then read off the gradient (assuming $x^T A x > 0$, so that $\alpha \neq 0$). \begin{align} 2 \alpha \, d\alpha &= x: (dA)\, x \\ &= xx^T:dA \\ \Longleftrightarrow d\alpha &= \frac{xx^T}{2 \alpha} :dA = \frac{xx^T}{2 \sqrt{x^T A x}} :dA \end{align}

The gradient is \begin{align} \frac{\partial \alpha}{\partial A} = \frac{xx^T}{2 \sqrt{x^T A x}}. \end{align}
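
Since the question mentions implementing this in an algorithm, one way to sanity-check the formula is to compare it against central finite differences. The following is only a minimal NumPy sketch: the function names (`alpha`, `grad_alpha`, `finite_difference_grad`), the random test data, and the step size `eps` are illustrative assumptions, and $A$ is built to be positive definite so that $x^T A x > 0$.

```python
import numpy as np

def alpha(A, x):
    """alpha = sqrt(x^T A x); assumes x^T A x > 0."""
    return np.sqrt(x @ A @ x)

def grad_alpha(A, x):
    """Closed-form gradient x x^T / (2 sqrt(x^T A x)), entries of A treated as independent."""
    return np.outer(x, x) / (2.0 * alpha(A, x))

def finite_difference_grad(A, x, eps=1e-6):
    """Central-difference approximation of d alpha / d a_ij, one entry at a time."""
    G = np.zeros_like(A)
    for i in range(A.shape[0]):
        for j in range(A.shape[1]):
            E = np.zeros_like(A)
            E[i, j] = eps
            G[i, j] = (alpha(A + E, x) - alpha(A - E, x)) / (2.0 * eps)
    return G

# Illustrative test data (an assumption, not from the question).
rng = np.random.default_rng(0)
n = 4
x = rng.normal(size=n)
M = rng.normal(size=(n, n))
A = M @ M.T + n * np.eye(n)  # symmetric positive definite, so x^T A x > 0

max_err = np.max(np.abs(grad_alpha(A, x) - finite_difference_grad(A, x)))
print("max abs difference:", max_err)
```

The two gradients should agree up to the finite-difference error. Note that the check perturbs each $a_{ij}$ independently; if $A$ is constrained to be symmetric in your algorithm, the appropriate derivative convention may differ.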