Mean square error using linear regression gives a convex function


In my machine learning course I'm asked to prove that

$MSE(w) = \frac{1}{N} \sum_{n=1}^{N}[y_n - f(x_n)]^2$

when $f$ is of the form $f(x_n) = w_0 + x_{n1} w_1 + \cdots + x_{nD} w_D$, is a convex function of the parameters $w_0, w_1, \cdots, w_D$.
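As a sanity check before attempting the proof, the definition-of-convexity inequality $\operatorname{MSE}(t w + (1-t) z) \leq t\operatorname{MSE}(w) + (1-t)\operatorname{MSE}(z)$ does hold numerically. This is a minimal sketch with synthetic data; the variable names and dimensions are my own choices, not from the assignment:

```python
import numpy as np

rng = np.random.default_rng(0)
N, D = 50, 3
# Design matrix with a leading column of ones for the intercept w_0.
X = np.hstack([np.ones((N, 1)), rng.normal(size=(N, D))])
y = rng.normal(size=N)

def mse(w):
    """MSE(w) = (1/N) * sum_n (y_n - f(x_n))^2 in matrix-vector form."""
    r = y - X @ w
    return r @ r / N

# Check the convexity inequality on random pairs of parameter vectors.
for _ in range(1000):
    w, z = rng.normal(size=D + 1), rng.normal(size=D + 1)
    t = rng.uniform()
    assert mse(t * w + (1 - t) * z) <= t * mse(w) + (1 - t) * mse(z) + 1e-9
print("convexity inequality held on 1000 random trials")
```

Of course this is only evidence, not a proof.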

I'm a bit confused about how to do this, though. My approach goes as follows:

I first prove that the sum of two convex functions is convex. Then I try to expand $(y_n - f(x_n))^2$; however, I arrive at a point where I would have to prove that a function of the form $w_iw_j$ is convex, which is not true, as shown in this post.
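That term-by-term route really does break down: $w_i w_j$ on its own is not convex, yet the full sum still is. A quick numeric illustration (synthetic data and names are my own): the Hessian of $g(w) = w_1 w_2$ is indefinite, while the Hessian of the MSE, which works out to $\frac{2}{N}X^\top X$, is positive semidefinite for any design matrix $X$:

```python
import numpy as np

# Hessian of g(w) = w1 * w2 is [[0, 1], [1, 0]].
# It has one positive and one negative eigenvalue, so g is neither
# convex nor concave.
H_cross = np.array([[0.0, 1.0], [1.0, 0.0]])
print(np.linalg.eigvalsh(H_cross))

# Hessian of MSE(w) = (1/N) * ||y - Xw||^2 is (2/N) X^T X,
# which is positive semidefinite for any X, hence MSE is convex.
rng = np.random.default_rng(1)
N, D = 40, 4
X = np.hstack([np.ones((N, 1)), rng.normal(size=(N, D))])
H_mse = 2.0 / N * X.T @ X
assert np.all(np.linalg.eigvalsh(H_mse) >= -1e-10)
print("MSE Hessian eigenvalues are all nonnegative")
```

The moral: convexity of each cross-term is the wrong invariant to chase; convexity of the whole quadratic is what holds.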

How can I proceed?

Best answer:

Hint: It may be easier if you write the MSE in matrix-vector form as $$ \operatorname{MSE}(w) = \frac{1}{N}\|y - Xw\|^2, $$ where $X$ is the $N\times(D+1)$ matrix whose $n$-th row is $(1, x_{n1}, \ldots, x_{nD})$. From here you can show the first-order characterization of convexity, $$ \operatorname{MSE}(w) + \nabla\operatorname{MSE}(w)\cdot(z - w) \leq \operatorname{MSE}(z) $$ for all $w,z\in\mathbb{R}^{D+1}$.
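Filling in the algebra behind this hint (my own working, so do check the signs): write $d = z - w$ and expand the squared norm.

```latex
% Gradient of the matrix-vector form of the MSE:
\nabla \operatorname{MSE}(w) = -\frac{2}{N} X^\top (y - Xw).

% Expand MSE(z) around w using y - Xz = (y - Xw) - Xd with d = z - w:
\operatorname{MSE}(z)
  = \frac{1}{N}\|(y - Xw) - Xd\|^2
  = \operatorname{MSE}(w) - \frac{2}{N}(y - Xw)^\top X d + \frac{1}{N}\|Xd\|^2.

% The middle term is exactly the gradient inner product, so
\operatorname{MSE}(z) - \operatorname{MSE}(w) - \nabla\operatorname{MSE}(w)\cdot d
  = \frac{1}{N}\|Xd\|^2 \;\geq\; 0,
```

which is the desired inequality, with equality governed by the squared-norm remainder.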