I was going through the derivation of linear regression. $$ \text{Error} = y_i^2 - 2w^T x_i y_i + w^T x_i x_i^T w $$
where $y_i$ is a scalar, and $x_i$ and $w$ are both $n \times 1$ vectors.
In the next step the partial derivatives with respect to $w$ are taken and shown to be: $$ \frac{d(\text{Error})}{dw} = -2y_i x_i + 2x_i x_i^T w $$
I don't have a very good understanding of how differentiation works in the case of vectors. I can see that the first term is constant with respect to $w$. The second term is a scalar but contains $w^T$; how do we differentiate that with respect to $w$? Finally, the third term contains both $w$ and $w^T$; how would we go about differentiating it?
What rules of differentiation are being used here if any?
It's a worthwhile exercise to write out the partial derivative with respect to a single entry $w_i$ of $w$. The same rules of differentiation apply as usual; they are just written using vectors and matrices. Doing so will make it clear why the gradient of $x^TAx$ is $2Ax$ for symmetric $A$.
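Concretely, for symmetric $A$ the partial with respect to the $k$-th entry works out as:

```latex
\frac{\partial}{\partial x_k}\, x^T A x
  = \frac{\partial}{\partial x_k} \sum_{j,l} x_j A_{jl} x_l
  = \sum_l A_{kl} x_l + \sum_j x_j A_{jk}
  = (Ax)_k + (A^T x)_k
  = 2(Ax)_k,
```

which, stacked over all $k$, gives the gradient $2Ax$. In your problem the role of $A$ is played by $x_i x_i^T$, which is automatically symmetric, so the third term contributes $2x_i x_i^T w$.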
Also, for the second term, note that you're differentiating an inner product, which is linear in $w$, so you should expect to get back a constant (vector) term. The last term is quadratic in $w$, so you'd expect its derivative to be linear in $w$.
You could also check out The Matrix Cookbook for standard matrix-derivative identities.
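If you want to convince yourself that the stated gradient is right, a quick numerical sanity check with finite differences works well. This is just a sketch using NumPy; the dimensions and values are made up for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 4
x = rng.standard_normal(n)   # x_i, an n-vector
y = 1.7                      # y_i, a scalar
w = rng.standard_normal(n)

def error(w):
    # E(w) = y^2 - 2 y w^T x + w^T x x^T w, i.e. (y - w^T x)^2
    return y**2 - 2 * y * (w @ x) + (w @ x)**2

# Analytic gradient from the derivation: -2 y x + 2 x x^T w
grad_analytic = -2 * y * x + 2 * x * (x @ w)

# Central finite differences, one coordinate at a time
eps = 1e-6
grad_fd = np.array([
    (error(w + eps * e) - error(w - eps * e)) / (2 * eps)
    for e in np.eye(n)
])

print(np.allclose(grad_analytic, grad_fd, atol=1e-5))  # True
```

Since the error is exactly quadratic in $w$, central differences agree with the analytic gradient up to floating-point noise.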