Gradient of a quadratic form — row or column?

I am taking the derivative of

$$X^TAX, \quad X \in \mathbb{R}^n$$

using the Fréchet derivative, where $$f(X + h) = f(X) + \langle \nabla f(X), h \rangle + o(\|h\|).$$

So I have

$$f(X + h) = X^TAX + X^TAh + h^TAX + h^TAh$$

and from the two middle terms, which are linear in $h$, I get

$$\langle X^T(A+A^T), h \rangle$$

and I think this $X^T(A+A^T)$ is the derivative of $X^TAX$. However, $X$ is an $n \times 1$ vector, while $X^T(A+A^T)$ is a $1 \times n$ vector. Am I doing anything wrong here? I have seen some matrix calculus references give this answer as well, so I don't understand what is happening.

The problem is that if I do gradient descent, I have to compute $X - \nabla f(X)$, but the dimensions don't match, so I think something must be wrong.

There is 1 best solution below

Let $f : \mathbb R^n \to \mathbb R$ be defined by $f (\mathrm x) := \mathrm x^{\top} \mathrm A \, \mathrm x$, where $\mathrm A \in \mathbb R^{n \times n}$ is given. Hence,

$$f (\mathrm x + h \mathrm v) = (\mathrm x + h \mathrm v)^{\top} \mathrm A \, (\mathrm x + h \mathrm v) = f (\mathrm x) + h \langle \mathrm v, \mathrm A \, \mathrm x \rangle + h \langle \mathrm A^{\top} \mathrm x, \mathrm v \rangle + h^2 \mathrm v^{\top} \mathrm A \, \mathrm v$$
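The intermediate step may be worth spelling out. Multiplying out the product gives

$$(\mathrm x + h \mathrm v)^{\top} \mathrm A \, (\mathrm x + h \mathrm v) = \mathrm x^{\top} \mathrm A \, \mathrm x + h \, \mathrm x^{\top} \mathrm A \, \mathrm v + h \, \mathrm v^{\top} \mathrm A \, \mathrm x + h^2 \, \mathrm v^{\top} \mathrm A \, \mathrm v$$

and the two terms linear in $h$ are inner products, because

$$\mathrm x^{\top} \mathrm A \, \mathrm v = (\mathrm A^{\top} \mathrm x)^{\top} \mathrm v = \langle \mathrm A^{\top} \mathrm x, \mathrm v \rangle, \qquad \mathrm v^{\top} \mathrm A \, \mathrm x = \langle \mathrm v, \mathrm A \, \mathrm x \rangle$$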

Subtracting $f (\mathrm x)$, dividing by $h$, and letting $h \to 0$, the directional derivative of $f$ in the direction of $\mathrm v$ at $\mathrm x$ is

$$D_{\mathrm v} f (\mathrm x) = \langle \mathrm v, \mathrm A \, \mathrm x \rangle + \langle \mathrm A^{\top} \mathrm x, \mathrm v \rangle = \langle \mathrm v, (\mathrm A + \mathrm A^{\top}) \, \mathrm x \rangle$$
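As a sanity check, the formula can be compared against a finite difference. This is only a sketch in plain Python lists; the matrix `A`, the point `x`, the direction `v`, the step `h`, and all helper names are illustrative choices, not part of the original answer.

```python
def matvec(M, x):
    """Matrix-vector product M x, with M stored as a list of rows."""
    return [sum(m_ij * x_j for m_ij, x_j in zip(row, x)) for row in M]


def transpose(M):
    """Transpose of a list-of-rows matrix."""
    return [list(col) for col in zip(*M)]


def dot(u, v):
    """Standard inner product <u, v>."""
    return sum(a * b for a, b in zip(u, v))


def f(A, x):
    """Quadratic form f(x) = x^T A x."""
    return dot(x, matvec(A, x))


def directional_derivative(A, x, v):
    """Closed form <v, (A + A^T) x>."""
    Ax = matvec(A, x)
    ATx = matvec(transpose(A), x)
    return dot(v, [a + b for a, b in zip(Ax, ATx)])


# Hypothetical test data; A is deliberately non-symmetric.
A = [[1.0, 2.0], [0.0, 3.0]]
x = [1.0, -1.0]
v = [0.5, 2.0]
h = 1e-6

# Forward difference (f(x + h v) - f(x)) / h, which differs from the
# closed form by the remaining term h * v^T A v plus rounding error.
x_shifted = [xi + h * vi for xi, vi in zip(x, v)]
finite_diff = (f(A, x_shifted) - f(A, x)) / h
exact = directional_derivative(A, x, v)

print(finite_diff, exact)
```

The two printed numbers should agree up to an error of order $h$, since the only term dropped is $h \, \mathrm v^{\top} \mathrm A \, \mathrm v$.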

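Since $D_{\mathrm v} f (\mathrm x) = \langle \mathrm v, \nabla f (\mathrm x) \rangle$ for every $\mathrm v$, the gradient of $f$ is the column vector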

$$\boxed{\quad \nabla f (\mathrm x) = (\mathrm A + \mathrm A^{\top}) \, \mathrm x = 2\left(\frac{\mathrm A + \mathrm A^{\top}}{2}\right) \, \mathrm x \quad}$$

where $\dfrac{\mathrm A + \mathrm A^{\top}}{2}$ is the symmetric part of $\mathrm A$.
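With the gradient written as $(\mathrm A + \mathrm A^{\top}) \, \mathrm x$, it has the same shape as $\mathrm x$, so a gradient descent update is well defined. The following is a minimal sketch under stated assumptions: the step size `step`, the matrix `A` (chosen so that its symmetric part is positive definite), the starting point, and the helper names are all hypothetical and only serve the illustration.

```python
def f(A, x):
    """Quadratic form f(x) = x^T A x, with A stored as a list of rows."""
    n = len(x)
    return sum(x[i] * A[i][j] * x[j] for i in range(n) for j in range(n))


def gradient(A, x):
    """Gradient (A + A^T) x, returned as a list of the same length as x."""
    n = len(x)
    return [sum((A[i][j] + A[j][i]) * x[j] for j in range(n)) for i in range(n)]


def gradient_step(A, x, step):
    """One update x - step * grad f(x); both operands have length n."""
    g = gradient(A, x)
    return [xi - step * gi for xi, gi in zip(x, g)]


# Hypothetical data: A is non-symmetric, but its symmetric part
# [[2, 0.5], [0.5, 1]] is positive definite, so f has its minimum at 0.
A = [[2.0, 1.0], [0.0, 1.0]]
x0 = [1.0, -1.0]
step = 0.1  # assumed step size, small enough for this particular A

x = list(x0)
for _ in range(200):
    x = gradient_step(A, x, step)

f_start = f(A, x0)
f_end = f(A, x)
print(gradient(A, x0), x, f_start, f_end)
```

The row vector $\mathrm x^{\top} (\mathrm A + \mathrm A^{\top})$ from the question is the transpose of this gradient; it carries the same entries, but it is the column form that can be subtracted from $\mathrm x$.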