How would the following derivative be expressed:
$$\frac{d}{d{\mathbf{x}}}\left(\mathbf{x}^T A(\mathbf{x}) \mathbf{x}\right)$$
Here $\mathbf{x}\in \mathbb{R}^n$ is a vector and $A\in \mathbb{R}^{n\times n}$ is a symmetric square matrix. Each element of $A$ is a function of $\mathbf{x}$. I am having trouble visualizing the result in this general case.
EDIT
I am going about this very naively following a comment below.
$$\frac{d}{d{\mathbf{x}}}\left(\mathbf{x}^T A(\mathbf{x}) \mathbf{x}\right) = \left(\frac{d}{d\mathrm{x}} \mathrm{x}^T\right)A(\mathrm{x}) \mathrm{x} + \mathrm{x}^T \frac{d}{d\mathrm{x}}(A(\mathrm{x})\mathrm{x}),$$
but then I end up with an inconsistent relationship involving one term that is a scalar:
$$\frac{d}{d{\mathbf{x}}}\left(\mathbf{x}^T A(\mathbf{x}) \mathbf{x}\right) = A(\mathrm{x}) \mathrm{x} + \mathrm{x}^T \frac{d}{d\mathrm{x}}(A(\mathrm{x}))\, \mathrm{x} + \mathrm{x}^T A(\mathrm{x})$$
where the second term on the right-hand side is the scalar.
You are correct to get to
$$ \begin{aligned}y & =\boldsymbol{x}^{\intercal}{\rm A}\boldsymbol{x}\\ \tfrac{{\rm d}}{{\rm d}\boldsymbol{x}}y & =\tfrac{{\rm d}}{{\rm d}\boldsymbol{x}}\left(\boldsymbol{x}^{\intercal}{\rm A}\boldsymbol{x}\right)\\ & =\tfrac{{\rm d}}{{\rm d}\boldsymbol{x}}\left(\boldsymbol{x}\right)^{\intercal}{\rm A}\boldsymbol{x}+\boldsymbol{x}^{\intercal}\tfrac{{\rm d}}{{\rm d}\boldsymbol{x}}\left({\rm A}\right)\boldsymbol{x}+\boldsymbol{x}^{\intercal}{\rm A}\tfrac{{\rm d}}{{\rm d}\boldsymbol{x}}\left(\boldsymbol{x}\right)\\ & =\left(\boldsymbol{x}^{\intercal}{\rm A}^{\intercal}\tfrac{{\rm d}}{{\rm d}\boldsymbol{x}}\left(\boldsymbol{x}\right)\right)^{\intercal}+\boldsymbol{x}^{\intercal}\tfrac{{\rm d}}{{\rm d}\boldsymbol{x}}\left({\rm A}\right)\boldsymbol{x}+\boldsymbol{x}^{\intercal}{\rm A}\tfrac{{\rm d}}{{\rm d}\boldsymbol{x}}\left(\boldsymbol{x}\right)\\ & =\left(\boldsymbol{x}^{\intercal}{\rm A}^{\intercal}\right)^{\intercal}+\boldsymbol{x}^{\intercal}\tfrac{{\rm d}}{{\rm d}\boldsymbol{x}}\left({\rm A}\right)\boldsymbol{x}+\boldsymbol{x}^{\intercal}{\rm A}\\ & ={\rm A}\boldsymbol{x}+\boldsymbol{x}^{\intercal}\tfrac{{\rm d}}{{\rm d}\boldsymbol{x}}\left({\rm A}\right)\boldsymbol{x}+\boldsymbol{x}^{\intercal}{\rm A} \end{aligned}$$
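But you have to recognize that $\tfrac{{\rm d}}{{\rm d}\boldsymbol{x}}\left({\rm A}\right)$ is not a matrix, but a rank-3 tensor. So $\boldsymbol{x}^{\intercal}\tfrac{{\rm d}}{{\rm d}\boldsymbol{x}}\left({\rm A}\right)\boldsymbol{x}$ is not a scalar, but a vector. The remaining terms ${\rm A}\boldsymbol{x}$ and $\boldsymbol{x}^{\intercal}{\rm A}$ are the same vector written once as a column and once as a row (because ${\rm A}$ is symmetric), so together they contribute $2{\rm A}\boldsymbol{x}$ to the gradient.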
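Writing the product out in components may make this easier to see. The following is a sketch in index notation, where the indices $i, j, k$ and the component symbols $A_{jk}$, $x_i$ are introduced here for illustration and run from $1$ to $n$:

$$\begin{aligned} y & =\sum_{j,k}x_{j}A_{jk}x_{k}\\ \frac{\partial y}{\partial x_{i}} & =\sum_{k}A_{ik}x_{k}+\sum_{j}x_{j}A_{ji}+\sum_{j,k}x_{j}\frac{\partial A_{jk}}{\partial x_{i}}x_{k} \end{aligned}$$

The first two sums are the $i$-th components of ${\rm A}\boldsymbol{x}$ and ${\rm A}^{\intercal}\boldsymbol{x}$. The last sum carries one free index $i$, which is why that term is a vector with $n$ components rather than a scalar.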
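To understand $\tfrac{{\rm d}}{{\rm d}\boldsymbol{x}}\left({\rm A}\right)$, differentiate every element of ${\rm A}$ with respect to a single component $x_i$ of $\boldsymbol{x}$. This gives one $n\times n$ matrix for each $i$: $$\frac{\partial{\rm A}}{\partial x_{i}}=\left[\frac{\partial A_{jk}}{\partial x_{i}}\right]_{j,k}$$
As a result, the i-th element of the vector $\boldsymbol{x}^{\intercal}\tfrac{{\rm d}}{{\rm d}\boldsymbol{x}}\left({\rm A}\right)\boldsymbol{x}$ is defined by the scalar $\boldsymbol{x}^{\intercal}\left(\frac{\partial{\rm A}}{\partial x_{i}}\right)\boldsymbol{x}$. Equivalently, if ${\rm A}_{i}$ denotes the i-th column of ${\rm A}$ and ${\rm J}_i = \frac{\partial{\rm A}_{i}}{\partial\boldsymbol{x}}$ is its jacobian matrix, the whole vector is $\sum_i x_i\,{\rm J}_i^{\intercal}\boldsymbol{x}$.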
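A quick numerical check can help confirm that the extra term is a vector whose i-th element is $\boldsymbol{x}^{\intercal}\left(\partial{\rm A}/\partial x_{i}\right)\boldsymbol{x}$. The sketch below is only an illustration: the $2\times 2$ matrix `A(x)`, the helper names, and the test point are all assumptions made up for this example, not part of the question. It compares $2{\rm A}\boldsymbol{x}$ plus that vector against a central finite-difference gradient.

```python
# Illustrative check for n = 2 using only the standard library.
# The matrix A(x) below is a made-up symmetric example.

def A(x):
    x0, x1 = x
    return [[1 + x0**2, x0 * x1],
            [x0 * x1,   2 + x1]]

def dA(x):
    """Return [dA/dx0, dA/dx1]: elementwise partial derivatives of A."""
    x0, x1 = x
    return [
        [[2 * x0, x1], [x1, 0.0]],   # dA/dx0
        [[0.0, x0], [x0, 1.0]],      # dA/dx1
    ]

def matvec(M, v):
    return [sum(M[i][j] * v[j] for j in range(len(v))) for i in range(len(M))]

def quad(M, v):
    """Quadratic form v^T M v."""
    return sum(vi * wi for vi, wi in zip(v, matvec(M, v)))

def f(x):
    return quad(A(x), x)

def grad_formula(x):
    """Gradient as 2*A*x plus the vector with entries x^T (dA/dx_i) x."""
    Ax = matvec(A(x), x)
    slices = dA(x)
    return [2 * Ax[i] + quad(slices[i], x) for i in range(len(x))]

def grad_numeric(x, h=1e-6):
    """Central finite-difference approximation of the gradient of f."""
    g = []
    for i in range(len(x)):
        xp = list(x); xp[i] += h
        xm = list(x); xm[i] -= h
        g.append((f(xp) - f(xm)) / (2 * h))
    return g

x = [0.7, -1.3]
print("formula:", grad_formula(x))
print("numeric:", grad_numeric(x))
```

The two printed vectors should agree up to finite-difference error. Note that the factor of 2 on `Ax` relies on `A(x)` being symmetric; for a non-symmetric matrix the first part would be $({\rm A}+{\rm A}^{\intercal})\boldsymbol{x}$ instead.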