Does a polynomial function always fit the data better than a linear function?


Imagine we have two functions, one linear and one polynomial:

$g_1(x) = a_1x + a_0$

$g_2(x) = b_2x^{20} + b_1x^{3} + b_0$

Suppose we perform regression based on a training set $Z = \{(x_t, r_t) \mid t = 1, \dots, N\}$ by minimizing the empirical losses:

$E(a_1, a_0 | Z) = \frac{1}{N} \sum_{t=1}^{N} (r_t - g_1(x_t))^2$

$E(b_2, b_1, b_0 | Z) = \frac{1}{N} \sum_{t=1}^{N} (r_t - g_2(x_t))^2$.

Assume that we can obtain the optimal values of $a_1, a_0$ and $b_2, b_1, b_0$ by minimizing the corresponding empirical loss. Is it guaranteed that $E(b_2, b_1, b_0 | Z) \leq E(a_1, a_0 | Z)$ in all cases?

Starting point: I know that if $g_2(x) = b_2x^{2} + b_1x + b_0$, then it would be true (because setting $b_2 = 0$ reduces $g_2$ to a linear function, so the optimal quadratic loss can never exceed the optimal linear loss). However, here $g_2$ only contains terms of degrees 20, 3, and 0, so it has no $x$ term and the linear model is not a special case of it. What about this case?
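One way to probe this is a quick numerical check. The sketch below (my own illustrative setup, not from the question) puts the data exactly on the line $r = x$, so the linear fit can achieve zero loss, and then fits both models by least squares. Since $x$ is generically not in the span of $\{1, x^3, x^{20}\}$ at the sample points, the second model should be left with strictly positive loss:

```python
import numpy as np

# Hypothetical counterexample: training data lying exactly on the line r = x.
x = np.linspace(1.0, 2.0, 8)
r = x.copy()

# Design matrix for g1(x) = a1*x + a0
A1 = np.column_stack([x, np.ones_like(x)])
# Design matrix for g2(x) = b2*x^20 + b1*x^3 + b0
A2 = np.column_stack([x**20, x**3, np.ones_like(x)])

# Least-squares fits (SVD-based, so the large x^20 column is handled safely).
coef1, *_ = np.linalg.lstsq(A1, r, rcond=None)
coef2, *_ = np.linalg.lstsq(A2, r, rcond=None)

# Empirical losses E(a1, a0 | Z) and E(b2, b1, b0 | Z).
E1 = np.mean((r - A1 @ coef1) ** 2)
E2 = np.mean((r - A2 @ coef2) ** 2)

print(E1, E2)
```

Here `E1` comes out numerically zero while `E2` stays strictly positive, which suggests the inequality $E(b_2, b_1, b_0 \mid Z) \leq E(a_1, a_0 \mid Z)$ is not guaranteed for this pair of models.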

Edit: added the squares to the losses.