I am trying to calculate the derivative of $x^{T}x$, where $x$ is a column vector.
A correct way of doing this is shown in this formula
However, I am getting different results with the product rule:
$\frac{d(x^{T}x)}{dx}=x^{T}\cdot\frac{dx}{dx}+\frac{d(x^{T})}{dx}\cdot x = x^T + x \ \ (\neq 2x^{T})$
(I used this formula in Leibniz notation from Wikipedia)
The problem is probably that it is a dot product and not a regular product.
So my question is: how do I apply the product rule for dot products correctly?
It would be more helpful if they called it the gradient, since we are talking about a multivariable function
$$ f:\mathbb R^m\longrightarrow\mathbb R $$
given by $f(x)=x^T x$, where $x$ is an $m\times 1$ column vector.

The gradient consists of $m$ partial derivatives, each describing the rate of change of the output with respect to one coordinate of the input (there is only one output coordinate here, since the output is 1-dimensional). So the gradient is given by:
$$ \frac{df}{dx}=\nabla f=\left(\frac{\partial f}{\partial x_1},\frac{\partial f}{\partial x_2},\dots,\frac{\partial f}{\partial x_m}\right) $$
So you have to determine the derivative with respect to each of the $m$ input coordinates and stack them together in a $1\times m$ row vector.
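
As a sketch of how this recipe plays out for this particular $f$ (the index $k$ and the differential $dx$ are notation introduced here for illustration, not taken from the question): writing out $f(x)=x^Tx=\sum_{i=1}^m x_i^2$ gives
$$ \frac{\partial f}{\partial x_k}=2x_k,\qquad k=1,\dots,m, $$
so that
$$ \nabla f=\left(2x_1,2x_2,\dots,2x_m\right)=2x^T. $$
The product rule leads to the same result if it is applied to differentials, assuming $dx$ denotes a small $m\times 1$ change in $x$:
$$ d(x^Tx)=(dx)^T x+x^T\,dx=2x^T\,dx, $$
where the last step uses that $(dx)^Tx$ is a scalar and therefore equals its own transpose $x^T\,dx$. Reading off the coefficient of $dx$ gives $2x^T$ again. In the attempt in the question, the two terms came out as a row vector $x^T$ and a column vector $x$, which cannot be added as they stand; transposing the scalar term first is what makes the shapes agree.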