I would appreciate any hints on the proof of the derivative of $x^TAx$ using index notation:
Suppose $x$ is an $n \times 1$ vector and $A$ is an $n \times n$ matrix that does not depend on $x$, and let $\alpha = x^TAx$.
In index notation, $\alpha = \sum_{j=1}^{n}\sum_{i=1}^{n} x_i a_{ij} x_j$
Differentiating with respect to the $k^{th}$ element of $x$:
$\frac{\partial\alpha}{\partial x_k} = \sum_{i=1}^{n} x_ia_{ik} + \sum_{j=1}^{n} a_{kj}x_j$ for all $k = 1, \ldots, n$
$\frac{\partial\alpha}{\partial \boldsymbol{x}} = x^TA + x^TA^T = x^T(A+A^T)$
Now I understand that the $\sum_{i=1}^{n} x_ia_{ik}$ component gives us $x^TA$ when we take $\frac{\partial\alpha}{\partial \boldsymbol{x}}$. This is because for each $k$ (which indexes the columns of $A$) we calculate a sum-product of the vector $x$ and the $k^{th}$ column of $A$. But how does the $\sum_{j=1}^{n} a_{kj}x_j$ component result in $x^TA^T$ and not $Ax$ in $\frac{\partial\alpha}{\partial \boldsymbol{x}}$, since we are effectively calculating the sum-product of $x$ and the $k^{th}$ row of $A$?
Thank you for your time and help.
The explanation is the following: the numbers $$ \sum_{j=1}^{n} a_{kj}x_j $$ are the entries of $Ax$, which is a column. On the other hand, $$ \frac{d\alpha}{dx}=\left(\frac{\partial\alpha}{\partial x_1},\ldots,\frac{\partial\alpha}{\partial x_n}\right) $$ is a row. That is why you need to take the transpose $(Ax)^T=x^TA^T$.
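If it helps to see the row/column point numerically, here is a minimal NumPy sketch. The helper names `quad_form` and `numerical_gradient`, the random test data, and the step size `eps` are my own illustrative choices, not anything from the question. It approximates each $\partial\alpha/\partial x_k$ by central differences and compares the result with the row $x^TA + x^TA^T$, which has the same entries as the column $Ax + A^Tx$ but the transposed shape.

```python
import numpy as np

def quad_form(A, x):
    """Return the scalar alpha = x^T A x for a 1-D array x."""
    return float(x @ A @ x)

def numerical_gradient(A, x, eps=1e-6):
    """Approximate d(alpha)/d(x_k) for each k by central differences."""
    n = x.size
    grad = np.zeros(n)
    for k in range(n):
        e = np.zeros(n)
        e[k] = eps  # perturb only the k-th coordinate
        grad[k] = (quad_form(A, x + e) - quad_form(A, x - e)) / (2 * eps)
    return grad

# Arbitrary test data (assumption: any non-symmetric A works as an example).
rng = np.random.default_rng(0)
n = 4
A = rng.standard_normal((n, n))
x = rng.standard_normal(n)

x_col = x.reshape(n, 1)                  # x as an n x 1 column
row_form = x_col.T @ A + x_col.T @ A.T   # 1 x n row: x^T A + x^T A^T
col_form = A @ x_col + A.T @ x_col       # n x 1 column: A x + A^T x

num = numerical_gradient(A, x)

print("row form shape:", row_form.shape)
print("column form shape:", col_form.shape)
print("entries agree with finite differences:",
      np.allclose(num, row_form.ravel(), atol=1e-6))
print("row form is the transpose of the column form:",
      np.allclose(row_form, col_form.T))
```

The entries are the same either way; only the layout (row versus column) differs, which is exactly where the transpose comes from.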
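Added: alternatively, note that $h^TAx$ is a scalar, so $h^TAx=(h^TAx)^T=x^TA^Th$. Hence $$ \begin{split} (x+h)^TA(x+h) &=x^TAx+x^TAh+h^TAx+h^TAh\\ &=x^TAx+(x^TA+x^TA^T)h+h^TAh, \end{split} $$ and, since $h^TAh$ is of second order in $h$, $$ \frac{d}{dh}(x+h)^TA(x+h)\Big|_{h=0}=x^TA+x^TA^T. $$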
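The expansion can also be checked numerically. The following is only a sketch with made-up random data (the names `linear` and `remainder` are mine): it confirms that what is left after subtracting $x^TAx$ and the linear term $(x^TA+x^TA^T)h$ is $h^TAh$, and that the scalar $h^TAx$ equals $x^TA^Th$.

```python
import numpy as np

# Arbitrary test data (assumption: chosen only for illustration).
rng = np.random.default_rng(1)
n = 3
A = rng.standard_normal((n, n))
x = rng.standard_normal(n)
h = 1e-3 * rng.standard_normal(n)   # a small perturbation of x

lhs = (x + h) @ A @ (x + h)         # (x + h)^T A (x + h)
linear = (x @ A + x @ A.T) @ h      # (x^T A + x^T A^T) h
remainder = lhs - x @ A @ x - linear

print("remainder equals h^T A h:",
      np.isclose(remainder, h @ A @ h, atol=1e-12))
print("h^T A x equals x^T A^T h:",
      np.isclose(h @ A @ x, x @ A.T @ h))
```

Since the remainder is quadratic in $h$, it does not contribute to the derivative at $h=0$, leaving $x^TA+x^TA^T$.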