In my machine learning course I'm asked to prove that
$MSE(w) = \frac{1}{N} \sum_{n=1}^{N}[y_n - f(x_n)]^2$
where $f$ is of the form $w_0 + x_{n1} w_1 + \cdots + x_{nD} w_D$, is a convex function of the parameters $w_0, w_1, \cdots, w_D$.
I'm a bit confused about how to do it, though. My approach goes as follows:
I first prove that the sum of two convex functions is convex. Then I try to expand $(y_n - f(x_n))^2$; however, I arrive at a point where I have to prove that a function of the form $w_i w_j$ is convex, which is not true, as shown in this post.
How can I proceed?
Hint: It may be easier if you write the MSE in matrix-vector form as $$ \operatorname*{MSE}(w) = \frac{1}{N}\|y - Xw\|^2. $$ From here you can show that $$ \operatorname*{MSE}(w) + \nabla\operatorname*{MSE}(w)\cdot(z - w) \leq \operatorname*{MSE}(z) $$ for all $w,z\in\mathbb{R}^{D+1}$.
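An alternative to verifying the first-order inequality directly is the second-order condition. Sketching the computation from the matrix-vector form (the factor of $2$ comes from differentiating the square): $$ \nabla\operatorname*{MSE}(w) = -\frac{2}{N}X^{\top}(y - Xw), \qquad \nabla^{2}\operatorname*{MSE}(w) = \frac{2}{N}X^{\top}X. $$ Since $v^{\top}X^{\top}Xv = \|Xv\|^2 \geq 0$ for every $v \in \mathbb{R}^{D+1}$, the Hessian is positive semidefinite everywhere, and a twice-differentiable function with a positive semidefinite Hessian is convex. This also explains why the termwise expansion runs into trouble: an individual cross term $w_i w_j$ is indeed not convex, but convexity only needs the *sum* of all terms, i.e. the full quadratic form, to be positive semidefinite.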
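As a sanity check (not a proof), you can test the first-order inequality from the hint numerically on random pairs $w, z$. This is a minimal sketch; the names `N`, `D`, `X`, `y` and the random data are illustrative assumptions, with a bias column of ones prepended to `X` to account for $w_0$:

```python
import numpy as np

rng = np.random.default_rng(0)
N, D = 50, 3
# Design matrix with a leading column of ones for the intercept w_0
X = np.hstack([np.ones((N, 1)), rng.normal(size=(N, D))])
y = rng.normal(size=N)

def mse(w):
    # MSE(w) = (1/N) * ||y - Xw||^2
    r = y - X @ w
    return r @ r / N

def grad_mse(w):
    # Gradient of the MSE in matrix-vector form
    return -2.0 / N * X.T @ (y - X @ w)

# First-order convexity condition:
# MSE(z) >= MSE(w) + grad MSE(w) . (z - w) for all w, z
for _ in range(1000):
    w = rng.normal(size=D + 1)
    z = rng.normal(size=D + 1)
    assert mse(z) >= mse(w) + grad_mse(w) @ (z - w) - 1e-9
```

The inequality holds exactly here because $\operatorname*{MSE}(z) - \operatorname*{MSE}(w) - \nabla\operatorname*{MSE}(w)\cdot(z-w) = \frac{1}{N}\|X(z-w)\|^2 \geq 0$; the small tolerance only guards against floating-point rounding.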