Gradient of transpose of a vector.


I generalize from this question that $\nabla_x(x^TA) = \nabla_x(A^Tx)=A^T$.

However, I'm having trouble with $\nabla_{x^T}(x^TA)$. What does it mean to take the gradient of a transpose of a vector?

There is 1 best solution below

There are some issues with the formula you wrote.

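  1. First, $x^{T}A \neq A^{T}x$: the left side is a row vector and the right side is a column vector. They are transposes of each other, $(x^{T}A)^{T} = A^{T}x$.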
  2. Second, you can only take the gradient of a scalar function. This is normally defined as the column vector $\nabla f = \frac{\partial f}{\partial x^{T}}$. To take "gradients" of vector fields, you'd need to introduce higher-order tensors and covariant derivatives, but that's another story.
  3. Maybe by $\nabla_{x}$ you meant $\frac{\partial}{\partial x}$. In that case, neither $\frac{\partial x^{T}A}{\partial x} = \frac{\partial A^{T}x}{\partial x}$ nor $\frac{\partial x^{T}A}{\partial x} = A^{T}$ holds, because of (1). Nevertheless, $\frac{\partial A^{T}x}{\partial x} = A^{T}$ does hold, since $A^{T}x$ is linear in $x$ with coefficient matrix $A^{T}$.
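
To make point (1) concrete, here is a small example. The $2\times 2$ matrix and the vector are hypothetical values chosen only for illustration; they do not come from the question.

$$x = \begin{pmatrix}1\\2\end{pmatrix},\quad A = \begin{pmatrix}1&2\\3&4\end{pmatrix},\quad x^{T}A = \begin{pmatrix}7&10\end{pmatrix},\quad A^{T}x = \begin{pmatrix}7\\10\end{pmatrix}$$

The entries agree, but one result is a $1\times 2$ row and the other a $2\times 1$ column, so the two expressions are not equal as matrices.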

Well, I don't want to be all negative. Here are a couple of properties of derivatives w.r.t. a vector.


Say you have two column vectors $x,y\in\mathbb{R}^{n}$ and a scalar function $f$. Then the derivative $\frac{\partial f}{\partial x}$ is a row vector, and the derivative $\frac{\partial f}{\partial x^{T}}$ is a column vector.

For the scalar $x^{T}y = y^{T}x$ you have $$\frac{\partial x^{T}}{\partial x}y = \frac{\partial x^{T}y}{\partial x} = \frac{\partial y^{T}x}{\partial x} = y^{T}\frac{\partial x}{\partial x} = y^{T}$$ $$y^{T}\frac{\partial x}{\partial x^{T}} = \frac{\partial y^{T}x}{\partial x^{T}} = \frac{\partial x^{T}y}{\partial x^{T}} = \frac{\partial x^{T}}{\partial x^{T}}y = y$$
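
Written out in components, both lines come from the same computation. This is a sketch in which $x_i$ and $y_i$ denote the entries of $x$ and $y$ (notation assumed here, not used above):

$$x^{T}y = \sum_{i=1}^{n} x_{i}y_{i}, \qquad \frac{\partial (x^{T}y)}{\partial x_{j}} = y_{j}, \quad j = 1,\dots,n$$

Collecting the partials $y_{1},\dots,y_{n}$ in a row gives $\frac{\partial x^{T}y}{\partial x} = y^{T}$; collecting them in a column gives $\frac{\partial x^{T}y}{\partial x^{T}} = y$.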

But for the derivative of a vector w.r.t. another vector there are no nice formulas except for the obvious ones. $$\frac{\partial Ax}{\partial x} = A$$ $$\frac{\partial x^{T}A}{\partial x^{T}} = A$$
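
If you want to convince yourself numerically, the two formulas can be checked with finite differences. The following is a minimal sketch in plain Python, not a definitive implementation. The helper names (`matvec`, `vecmat`, `partials`, `transpose`, `close`) and the test matrix are hypothetical, and the layout (which index runs along rows) is an assumption chosen to match the formulas above.

```python
def matvec(A, x):
    """Return the column vector A x as a list."""
    return [sum(a * v for a, v in zip(row, x)) for row in A]

def vecmat(x, A):
    """Return the row vector x^T A as a list."""
    return [sum(x[i] * A[i][j] for i in range(len(x))) for j in range(len(A[0]))]

def partials(f, x, h=1e-6):
    """Central differences: result[i][k] approximates d f(x)[k] / d x[i]."""
    out = []
    for i in range(len(x)):
        xp, xm = list(x), list(x)
        xp[i] += h
        xm[i] -= h
        fp, fm = f(xp), f(xm)
        out.append([(p - m) / (2 * h) for p, m in zip(fp, fm)])
    return out

def transpose(M):
    return [list(col) for col in zip(*M)]

def close(M, N, tol=1e-6):
    """True if M and N have the same shape and entries within tol."""
    if len(M) != len(N) or any(len(r) != len(s) for r, s in zip(M, N)):
        return False
    return all(abs(a - b) < tol for r, s in zip(M, N) for a, b in zip(r, s))

A = [[1.0, 2.0, 3.0],
     [4.0, 5.0, 6.0]]      # hypothetical 2 x 3 matrix
x3 = [0.5, -1.0, 2.0]      # point in R^3, used for A x
x2 = [0.5, -1.0]           # point in R^2, used for x^T A

# d(Ax)/dx: rows index the outputs of A x, columns index the entries of x.
jac_Ax = transpose(partials(lambda v: matvec(A, v), x3))

# d(x^T A)/d(x^T): rows index the entries of x, columns index the outputs.
der_xTA = partials(lambda v: vecmat(v, A), x2)

print(close(jac_Ax, A))
print(close(der_xTA, A))
```

Both maps are linear in $x$, so the finite-difference result does not depend on the point chosen, and both comparisons against $A$ should succeed up to rounding error.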