Why is the vector derivative $\frac{d(x^Ta)}{dx} = \frac{d(a^Tx)}{dx} = a^T$ and not $a$?

$\frac{d(x^Ta)}{dx} = \frac{d(a^Tx)}{dx} = a^T$

I have been confused by this simple formula for a few weeks.

I thought that since $x^Ta$ is a scalar, its derivative with respect to a column vector should be a column vector, i.e. $a$ instead of $a^T$.

Am I missing something?

Thank you!

Best answer

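Here's the definition of the derivative of a function $y: \Bbb R^n \to \Bbb R$ with respect to the column vector $\mathbf x$. $$\frac{dy(\mathbf x)}{d\mathbf x} := \pmatrix{\frac{\partial y(\mathbf x)}{\partial x_1}, \dots, \frac{\partial y(\mathbf x)}{\partial x_n}}$$ Notice that this is a row vector (by definition). This is the numerator-layout (Jacobian) convention; some texts use the transposed convention, under which the same derivative would be written as the column vector of partials.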
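As a rough numerical illustration of this definition, the sketch below approximates each partial derivative by a central difference and stores the results as a $1 \times n$ row. The helper name `derivative_row`, the step size `h`, and the sample values of `a` and `x0` are all assumptions made for this example, and the linear function $y(\mathbf x) = \mathbf a^T\mathbf x$ is used only as a test case.

```python
import numpy as np

def derivative_row(y, x, h=1e-6):
    """Approximate dy/dx for a scalar-valued y as a 1 x n row vector."""
    x = np.asarray(x, dtype=float)
    n = x.size
    row = np.zeros((1, n))
    for i in range(n):
        e = np.zeros(n)
        e[i] = h  # perturb only the i-th coordinate
        # central difference approximation of the i-th partial derivative
        row[0, i] = (y(x + e) - y(x - e)) / (2 * h)
    return row

# Hypothetical sample data, chosen only for illustration.
a = np.array([2.0, -1.0, 0.5])
x0 = np.array([1.0, 3.0, -2.0])

row = derivative_row(lambda x: a @ x, x0)
print(row.shape)                            # (1, 3): a row, matching the definition
print(np.allclose(row, a.reshape(1, -1)))   # compares the row of partials with a^T
```

Storing the result with shape `(1, n)` rather than `(n,)` is a deliberate choice here, so that the row orientation from the definition is visible in the array itself.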

$y(\mathbf x) = \mathbf a^T\mathbf x$ is a scalar function, so by the definition above the result of differentiating it with respect to $\mathbf x$ must be a row vector. Now I assume you can prove that $\frac{d(\mathbf a^T\mathbf x)}{d\mathbf x} = \mathbf a^T = \frac{d(\mathbf x^T\mathbf a)}{d\mathbf x}$ in this case?
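
For completeness, one way to sketch that proof is to work in components, assuming $\mathbf a$ is a constant vector that does not depend on $\mathbf x$ (the indices $i$ and $j$ are introduced here only for illustration):

$$y(\mathbf x) = \mathbf a^T\mathbf x = \sum_{j=1}^n a_j x_j = \mathbf x^T\mathbf a, \qquad \frac{\partial y(\mathbf x)}{\partial x_i} = a_i \quad (i = 1, \dots, n)$$

$$\frac{dy(\mathbf x)}{d\mathbf x} = \pmatrix{\frac{\partial y(\mathbf x)}{\partial x_1}, \dots, \frac{\partial y(\mathbf x)}{\partial x_n}} = \pmatrix{a_1, \dots, a_n} = \mathbf a^T$$

Since $\mathbf a^T\mathbf x$ and $\mathbf x^T\mathbf a$ are the same scalar, both have the same derivative. The entries are the components of $\mathbf a$ either way; the row shape comes purely from the definition being used.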