Differentiate $x^Tx\cdot x$: scalar vs. 1-by-1 matrix


Let $x\in \mathbb{R}^n$, then $y=x^Tx\cdot x\in \mathbb{R}^n$.

I tried to find the differential of $y$:

\begin{align}
\mathrm{d}y &= \mathrm{d}(x^Tx)\cdot x + x^Tx\cdot \mathrm{d}x \\
&= (\mathrm{d}x^T)x\cdot x + x^T\mathrm{d}x\cdot x + x^Tx\cdot \mathrm{d}x \\
&= \bigl((\mathrm{d}x^T)x\bigr)^T \cdot x + x^T\mathrm{d}x\cdot x + x^Tx\cdot \mathrm{d}x \\
&= 2x^T\mathrm{d}x\cdot x + x^Tx\cdot \mathrm{d}x
\end{align}

…which looks strange to me, because in the first term $\mathrm{d}x$ and $x$ are both in $\mathbb{R}^n$, and yet they are multiplied together. Noticing that this happens in the original $y$ as well, I realized that the issue lies in the ambiguity of $x^Tx$ (a scalar or a 1-by-1 matrix?), so I tried to circumvent it with the trace (let $y=\operatorname{tr}(x^Tx)\cdot x$):

\begin{align}
\mathrm{d}y &= \mathrm{d}\operatorname{tr}(x^Tx)\cdot x + \operatorname{tr}(x^Tx)\,\mathrm{d}x \\
&= \operatorname{tr}\bigl(\mathrm{d}(x^Tx)\bigr)\,x + \operatorname{tr}(x^Tx)\,\mathrm{d}x \\
&= 2\operatorname{tr}(x^T\mathrm{d}x)\,x + \operatorname{tr}(x^Tx)\,\mathrm{d}x
\end{align}

This seems right, but I don't know how to proceed.

What am I missing?


There are 3 best solutions below

0
On BEST ANSWER

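To extend Jean-ClaudeArbaut's comment:
Write your function in a form that is valid whether $x$ is a vector or a matrix: $$y=xx^Tx$$ Then differentiate with the product rule, keeping the factors in order: $$\dot y = \dot x\,x^Tx + x\,\dot x^Tx + xx^T\dot x$$ Finally, take advantage of the vector properties that aid factorization. When $x$ is a vector, $x^Tx$ is a scalar, so $\dot x\,x^Tx = (x^Tx)\,\dot x$; likewise $\dot x^Tx = x^T\dot x$, so $x\,\dot x^Tx = xx^T\dot x$. Hence $$\dot y = \Big[(x^Tx)I + 2xx^T\Big]\dot x$$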
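As a quick sanity check of the factorization step, here is a minimal NumPy sketch. It is an illustration added here rather than part of the original answer, and the names `three_terms`, `factored`, and the random test vectors are assumptions. It treats $x$ and $\dot x$ as column vectors and compares the three product-rule terms with the factored form.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 4
x = rng.standard_normal((n, 1))   # column vector x
dx = rng.standard_normal((n, 1))  # stands in for x-dot (any direction works)

# The three product-rule terms of d(x x^T x), with the factors kept in order.
three_terms = dx @ x.T @ x + x @ dx.T @ x + x @ x.T @ dx

# The factored form [(x^T x) I + 2 x x^T] x-dot.
s = (x.T @ x).item()              # x^T x as a plain scalar
factored = (s * np.eye(n) + 2 * x @ x.T) @ dx

print(np.allclose(three_terms, factored))
```

Because every product is written as a conformable matrix product, the same first expression would also make sense for a matrix-valued `x`; only the factored form relies on `x` being a vector.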

3
On

Writing a few more parentheses might help. In the final line of your first derivation, $2x^T\mathrm{d}x\cdot x$ means $$2(x^T\mathrm{d}x)\cdot x.$$ In this expression, $\mathrm dx$ and the trailing $x$ are not multiplied together. Instead, the vector $x$ is multiplied by the scalar $x^T\mathrm{d}x$.
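The scalar versus 1-by-1 matrix distinction can be seen concretely in a small NumPy sketch. This is an illustration with made-up numbers, not something from the original post, and the variable names are assumptions. With column vectors, `x.T @ dx` has shape `(1, 1)`: using it as a scalar factor works, using it as a left matrix factor does not, and placing it on the right is conformable again.

```python
import numpy as np

x = np.array([[1.0], [2.0], [3.0]])     # column vector, shape (3, 1)
dx = np.array([[0.1], [-0.2], [0.3]])   # a sample perturbation

inner = x.T @ dx          # 1-by-1 matrix, shape (1, 1)
scalar = inner.item()     # the same number as a plain float

# Scalar times vector: this is what 2 (x^T dx) x means.
term = 2 * scalar * x

# Using the 1-by-1 matrix as a left matrix factor is not conformable:
# (1, 1) @ (3, 1) has mismatched inner dimensions.
try:
    inner @ x
    matmul_ok = True
except ValueError:
    matmul_ok = False

# Placing the 1-by-1 matrix on the right is conformable: (3, 1) @ (1, 1).
right = 2 * x @ inner

print(matmul_ok, np.allclose(term, right))
```

This is the same reordering that turns $x^Tx\cdot x$ into $x\,x^Tx$, which is unambiguous as a matrix product.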

0
On

You can write $\mathbf{y} = \| \mathbf{x} \|^2 \mathbf{x}$, a scalar times a vector. Using the product rule, you have $$d\mathbf{y}=(d\| \mathbf{x} \|^2)\mathbf{x}+\| \mathbf{x} \|^2 d\mathbf{x}$$ It is easy to see that $d\| \mathbf{x} \|^2 = d(\mathbf{x}^T\mathbf{x}) = 2 \mathbf{x}^T d\mathbf{x}$.

Finally, it holds that $$ d\mathbf{y} =2\mathbf{x}(\mathbf{x}^T d\mathbf{x})+\| \mathbf{x} \|^2 d\mathbf{x} =\left[2\mathbf{x}\mathbf{x}^T+\| \mathbf{x} \|^2 \mathbf{I} \right] d\mathbf{x} $$ The bracketed term is the Jacobian matrix you are looking for.
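If you want to convince yourself numerically, the bracketed matrix can be compared against a finite-difference Jacobian. The sketch below is an added illustration, not part of the original answer; the helper names `f` and `jacobian`, the step size `h`, and the random test point are all assumptions.

```python
import numpy as np

def f(x):
    """y = ||x||^2 x for a 1-D array x."""
    return (x @ x) * x

def jacobian(x):
    """Closed form 2 x x^T + ||x||^2 I from the derivation above."""
    return 2 * np.outer(x, x) + (x @ x) * np.eye(x.size)

rng = np.random.default_rng(1)
x = rng.standard_normal(5)
h = 1e-6  # finite-difference step (an arbitrary small value)

# Central differences, one column of the Jacobian per coordinate direction.
J_fd = np.empty((x.size, x.size))
for j in range(x.size):
    e = np.zeros(x.size)
    e[j] = h
    J_fd[:, j] = (f(x + e) - f(x - e)) / (2 * h)

max_err = np.max(np.abs(J_fd - jacobian(x)))
print(max_err)
```

The printed discrepancy should be tiny compared with the entries of the Jacobian, limited only by floating-point round-off and the choice of `h`.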