I am reading about Ridge Regression in Machine Learning (in particular, the calculation of the empirical risk w.r.t. the square loss function) and do not understand the following step:
$$\frac{1}{n}\sum_{i=1}^{n}(\langle\vec{x}_i,\vec{w}\rangle-y_i)^2 = \frac{1}{n}(X\vec{w}-\vec{y})^T (X\vec{w}-\vec{y})$$
I tried to expand the left side and got $$ \frac{1}{n}\sum_{i=1}^{n}\left(\langle\vec{x}_i,\vec{w}\rangle^2-2\langle\vec{x}_i,\vec{w}\rangle y_i+y_i^2\right)$$
But now I have no idea how to continue, and I am not sure whether I am on the right track!
There is no need to expand the LHS. Just note that $$X\vec{w} = (\langle \vec{x}_1, \vec{w}\rangle, \ldots, \langle \vec{x}_n, \vec{w}\rangle)^T\\ z^Tz = \langle z, z \rangle = \sum \limits_{i = 1}^n z_i z_i = \sum \limits_{i = 1}^n z_i^2$$ where $\vec{x}_i$ are the row vectors of $X$. Applying the second identity with $z = X\vec{w} - \vec{y}$ gives the claimed equality, since the $i$-th entry of $X\vec{w} - \vec{y}$ is exactly $\langle \vec{x}_i, \vec{w}\rangle - y_i$.
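If it helps, the identity is easy to verify numerically. Here is a minimal sketch using NumPy with randomly generated data (the dimensions and variable names are illustrative, not from the original problem):

```python
import numpy as np

# Illustrative sizes: n samples, d features
rng = np.random.default_rng(0)
n, d = 5, 3
X = rng.normal(size=(n, d))   # rows of X are the vectors x_i
w = rng.normal(size=d)
y = rng.normal(size=n)

# LHS: average of per-sample squared errors, computed one inner product at a time
lhs = np.mean([(X[i] @ w - y[i]) ** 2 for i in range(n)])

# RHS: (Xw - y)^T (Xw - y) / n, computed as a single residual vector
r = X @ w - y
rhs = (r @ r) / n

print(np.isclose(lhs, rhs))  # the two expressions agree up to floating-point error
```

The check works because `X @ w` stacks the inner products $\langle \vec{x}_i, \vec{w}\rangle$ into one vector, so subtracting `y` gives the residual vector whose squared norm is the sum on the left-hand side.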