I have tried to get $$\frac{d}{d\vec{x}}\left[\vec{x}^T\vec{x}\right].$$
One approach is to work component-wise in 3D: $\begin{bmatrix}x_1 & x_2 & x_3\end{bmatrix}\cdot\begin{bmatrix}x_1 \\ x_2 \\ x_3\end{bmatrix} = x_1^2 + x_2^2 + x_3^2.$
Differentiating this with respect to the vector $\vec{x}=\begin{bmatrix}x_1 \\ x_2 \\ x_3\end{bmatrix}$ should give $$\begin{bmatrix}\frac{\partial }{\partial x_1}(x_1^2 + x_2^2 +x_3^2) \\ \frac{\partial}{\partial x_2}(x_1^2 + x_2^2 +x_3^2)\\ \frac{\partial}{\partial x_3}(x_1^2 + x_2^2 +x_3^2)\end{bmatrix}=\begin{bmatrix}2x_1\\2x_2\\2x_3\end{bmatrix}$$
On the other hand, naively applying the product rule gives $$\frac{d}{d\vec{x}}\left[\vec{x}^T\vec{x}\right] = \frac{d\vec{x}^T}{d\vec{x}}\vec{x} + \vec{x}^T\frac{d\vec{x}}{d\vec{x}} = \vec{x}+\vec{x}^T.$$ These terms cannot be added together because they have different dimensionalities. So what did I do wrong? And more importantly, what is the correct derivative of $\vec{x}^T\vec{x}$?
The easiest way is to use the implicit (external) definition of the gradient, which can be obtained from the chain rule:
$$d F=dx^T\,\nabla F.$$
EDIT: explanation of how to obtain the external definition of the gradient. Consider a function $F=F(x_1,\dots,x_n)$. Then the total derivative is given by
$$dF = \dfrac{\partial F}{\partial x_1}dx_1+...+\dfrac{\partial F}{\partial x_n}dx_n=dx_1\dfrac{\partial F}{\partial x_1}+...+dx_n\dfrac{\partial F}{\partial x_n}$$ $$=dx^T\begin{bmatrix}\dfrac{\partial F}{\partial x_1}\\\vdots\\\dfrac{\partial F}{\partial x_n} \end{bmatrix}=dx^T\,\nabla_\text{column} F=\nabla_\text{row}F\,dx $$
What we have to do now is determine the total derivative of your expression:
$$d(x^Tx)=dx^T x+x^Tdx.$$
Note that both terms are scalars, so we can transpose the second one to make it match the first:
$$d(x^Tx)=dx^T x+dx^Tx=dx^T\left[2x\right]$$
Comparing this with the implicit definition of the gradient, we obtain
$$\dfrac{d(x^Tx)}{dx}=\nabla \left[x^Tx \right]=2x.$$
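As a quick numerical sanity check (not part of the derivation), the closed-form gradient $2x$ can be compared against a central-difference approximation of $\nabla F$; the function and helper names below are just illustrative:

```python
# Numerical sanity check: approximate the gradient of F(x) = x^T x
# by central differences and compare it with the closed-form 2x.

def F(x):
    return sum(xi * xi for xi in x)  # x^T x = sum of squares

def numerical_gradient(F, x, h=1e-6):
    grad = []
    for i in range(len(x)):
        xp = list(x); xp[i] += h   # x + h*e_i
        xm = list(x); xm[i] -= h   # x - h*e_i
        grad.append((F(xp) - F(xm)) / (2 * h))
    return grad

x = [1.0, -2.0, 3.0]
print(numerical_gradient(F, x))  # close to 2x = [2.0, -4.0, 6.0]
```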
An alternative approach is to calculate the partial derivatives
$$\dfrac{\partial \sum_{j=1}^n x_j^2}{\partial x_i}=\sum_{j=1}^n\dfrac{\partial x_j^2}{\partial x_i}=2x_i$$
and then assemble the gradient as $2x$.
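The same component-wise computation can also be checked symbolically; this sketch assumes SymPy is available, and the symbol names are chosen for the example:

```python
import sympy as sp

# Symbolic check of the partial-derivative approach for n = 3:
# differentiate F = x1^2 + x2^2 + x3^2 with respect to each x_i.
xs = sp.symbols('x1 x2 x3')
F = sum(xj**2 for xj in xs)
grad = [sp.diff(F, xi) for xi in xs]
print(grad)  # [2*x1, 2*x2, 2*x3], i.e. the gradient 2x
```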
Or, using index notation (with summation over repeated indices):
$$\dfrac{\partial x_jx_j}{\partial x_i}=\dfrac{\partial x_j}{\partial x_i}x_j+x_j\dfrac{\partial x_j}{\partial x_i}=\delta_{ji}x_j+x_j\delta_{ji}=x_i+x_i=2x_i.$$
The symbol $\delta_{ij}=\delta_{ji}$ is the Kronecker delta, which equals $1$ when $i=j$ and $0$ otherwise.
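The index-notation computation can be traced concretely by building the Kronecker delta explicitly and evaluating the sum over $j$ for each component $i$ (a small illustration; all names are made up for the example):

```python
# Evaluate sum_j (delta_ji * x_j + x_j * delta_ji) = 2 * x_i
# with an explicit Kronecker delta, mirroring the index-notation step.

def delta(i, j):
    return 1.0 if i == j else 0.0  # Kronecker delta

x = [1.0, -2.0, 3.0]
n = len(x)

grad = [sum(delta(j, i) * x[j] + x[j] * delta(j, i) for j in range(n))
        for i in range(n)]
print(grad)  # [2.0, -4.0, 6.0], i.e. 2x component-wise
```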