Derivative of a vector's transpose w.r.t. itself: $\frac{\partial(x^T)}{\partial x}$


I'm new to matrix calculus, and I'm confused about how to differentiate a vector $x$'s transpose w.r.t. itself (i.e., $\dfrac{\partial(x^T)}{\partial x}$).

How would one calculate this derivative? According to the matrix calculus tool at https://www.matrixcalculus.org/, the result is $I$ (the identity matrix), but I can't figure out why. Suppose $x$ is an $n \times 1$ column vector, so that $x^T$ is a $1 \times n$ row vector. Since $\frac{\partial}{\partial x} = \left[\frac{\partial}{\partial x_1}, \frac{\partial}{\partial x_2}, \ldots, \frac{\partial}{\partial x_n}\right]$, wouldn't $\dfrac{\partial (x^T)}{\partial x} = \dfrac{\partial}{\partial x} \otimes x^T$ be a $1 \times n^2$ row vector instead of the $n \times n$ identity matrix? I'm very confused, and I don't know which part of my understanding is incorrect.

Thanks!

(P.S. I saw a similar question about $\frac{\partial(x^T)}{\partial x}$ in "Derivative of vector and vector transpose product", but it doesn't seem to have been resolved.)

Result from matrix calculator: [image]


There is 1 answer below


$ \def\s{{\left(1\right)}} \def\t{\times} \def\o{{\tt1}} \def\n{n} $Treating this as the gradient of second-order tensors with dimensions $(\o\t\n)$ and $(\n\t\o)$ will work, and yields a fourth-order tensor with dimensions $(\o\t\n\t\n\t\o)$.
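Written out in components, this fourth-order gradient is less exotic than it sounds. The following is a sketch that assumes the index order (output row, output column, input row, input column); that ordering and the symbol $\mathcal{G}$ are conventions chosen here for illustration, not something fixed by the question:

$$\mathcal{G}_{1j\,i1} = \frac{\partial (x^T)_{1j}}{\partial x_{i1}} = \frac{\partial x_j}{\partial x_i} = \delta_{ij}, \qquad i, j = 1, \ldots, n.$$

The tensor therefore holds only $n^2$ entries, and they are exactly the entries of the identity matrix; the singleton indices carry no information. The same $n^2$ numbers, laid out in a single row, give the $1 \times n^2$ Kronecker-style result from the question, which is just the identity matrix flattened.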

But why stop there? Everybody knows that third-order tensors are the true elements of reality. So this problem should be treated as the gradient of $(\o\t\n\t\o)$ by $(\n\t\o\t\o)$ tensors.

Then someone else will claim that third-order tensors are for amateurs and the REAL calculation should be done using fourth-order tensors, i.e. as the gradient of $(\o\t\n\t\o\t\o)$ by $(\n\t\o\t\o\t\o)$ quantities.

Ad infinitum.

The opposite approach is to eliminate all of the singleton dimensions and treat this as a simple vector-by-vector gradient, yielding the $(\n\t\n)$ identity matrix.
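As a quick numerical sanity check of the vector-by-vector view, here is a minimal sketch in plain Python. The helper names `jacobian` and `transpose_flat` are hypothetical, the central-difference step `h` is an arbitrary choice, and the layout `J[i][j] = d f_j / d x_i` is assumed (for this particular map, either layout gives the same matrix):

```python
def jacobian(f, x, h=1e-6):
    """Central-difference Jacobian with J[i][j] = d f(x)[j] / d x[i].

    Hypothetical helper for illustration; f maps a list of n floats
    to a list of floats.
    """
    J = []
    for i in range(len(x)):
        x_plus = list(x)
        x_minus = list(x)
        x_plus[i] += h
        x_minus[i] -= h
        f_plus = f(x_plus)
        f_minus = f(x_minus)
        J.append([(f_plus[j] - f_minus[j]) / (2 * h)
                  for j in range(len(f_plus))])
    return J


def transpose_flat(x):
    # Once singleton dimensions are dropped, x^T has the same n
    # components as x, so the map x -> x^T is the identity map on R^n.
    return list(x)


x = [0.5, -1.25, 3.0]  # arbitrary test point
J = jacobian(transpose_flat, x)

# Round away floating-point noise from the finite differences.
J_rounded = [[round(v, 6) for v in row] for row in J]
for row in J_rounded:
    print(row)
```

Each row printed is a row of the $3 \times 3$ identity matrix: perturbing $x_i$ changes only the $i$-th component of $x^T$, and changes it at rate 1.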