If $\mathbf{A}$ is symmetric, my function $f:\mathbb{R}^{d}\mapsto \mathbb{R}^{d}$ is defined as $$ f(\mathbf{x}) = \mathbf{Ax}(\mathbf{x}^{T}\mathbf{Ax}). $$ What is the differentiation of $f$ with respect to $\mathbf{x}$, i.e., $\nabla_{\mathbf{x}}f(\mathbf{x})$?
This is not a homework problem but rather something related to my research and I converted the notations so that it is more legible. The function $f$ itself is already a gradient of some vector function that maps to a scalar and I initially assumed the Hessian of the original function would be positive definite but my code keeps firing errors at me. Without proper education in matrix calculus, the answer I came up with is $$ \nabla_{\mathbf{x}}f(\mathbf{x}) = \mathbf{A}(\mathbf{x}^{T}\mathbf{Ax}) + 2\mathbf{Axx}^{T}\mathbf{A} $$ Is this correct?
In the pen-and-paper sense, your Jacobian matrix is correct. I presume that the code gives you errors due to the fact that it interprets the arrowed $*$ in $$A\stackrel\downarrow*(x^t*A*x)$$ as a product of matrices, rather than as the multiplication matrix-by-scalar. Thus it checks the dimensions, it sees that you are trying to multiply a $(d\times d)$ vector by a $(1\times 1)$ vector, and it concludes that you are making a syntax error. We humans implicitly assume $$A(x^tAx)=A\cdot (x^t*A*x)$$
Where $*$ is the map which makes the product of an $(n\times k)$ and a $(k\times m)$ matrix to obtain a $(n\times m)$ matrix, and $\cdot$ is the map which assigns to a $(n\times m)$ matrix and a scalar number (id est, a $(1\times1)$ matrix) the appropriate thing. However, as justified and useful as it is, it remains an inconsistent use (actually, non-use) of the notation.
The machine may not do such thing.
To see it more clearly, notice what happens with the calculation \begin{align}f(x+h)-f(x)&=A(x+h)((x+h)^tA(x+h))-Ax(x^tAx)=\\&=Ax(h^tAx)+Ax(x^tAh)+Ah(x^tAx)+o(\lvert h\rvert)=\\&= 2Axx^tAh+Ah(x^tAx)+o(\lvert h\rvert)\end{align}
Notice that there is no problem in writing $A*x*\alpha$, because $A:(d\times d)$, $x:(d\times 1)$ and $\alpha:(1\times1)$.
The calculation above shows that the differential $D_xf$ is actually the map $D_xf(h)= 2Axx^tAh+Ah(x^tAx)$. You could write it like that if your purpose is evaluating it.
However, if you want a matrix $\nabla_xf$ such that $\nabla_xfh=D_xf(h)$, you can obtain it with the identity $h*\alpha=(\alpha\cdot I)*h$, where $I$ is the $d\times d$ identity matrix, so that $$\nabla_xf=2*A*x*x^t*A+A*((x^t*A*x)\cdot I)$$ How you produce the appropriate scalar multiple of the identity matrix might depend on the programming language, but there should be several options, once you know that the problem is there.
Another way might be writing $((x^t*A*x)*A)$, because some programming languages have only the scalar-by-matrix product overloaded into the symbol $*$, rather than the matrix-by-scalar product.