Imagine we have two functions, one linear and one polynomial:
$g_1(x) = a_1x + a_0$
$g_2(x) = b_2x^{20} + b_1x^{3} + b_0$
Suppose we perform regression based on a training set $Z = \{(x_t, r_t) \mid t = 1, \dots, N\}$ by minimizing the empirical losses:
$E(a_1, a_0 | Z) = \frac{1}{N} \sum_{t=1}^{N} (r_t - g_1(x_t))^2$
$E(b_2, b_1, b_0 | Z) = \frac{1}{N} \sum_{t=1}^{N} (r_t - g_2(x_t))^2$.
Assume that we can obtain the optimal values of $a_1, a_0$ and $b_2, b_1, b_0$ by minimizing the corresponding empirical loss. Is it guaranteed that $E(b_2, b_1, b_0 | Z) \leq E(a_1, a_0 | Z)$ in all cases?
Starting point: I know that if $g_2(x) = b_2x^{2} + b_1x + b_0$, then the claim would be true, because we could set $b_2 = 0$ and $g_2$ would reduce to a linear function. However, here the polynomial has terms of degree 20 and 3 instead. What about this case?
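One can probe this numerically. The sketch below (using made-up synthetic data where the targets are exactly linear, $r_t = x_t$) fits both models by ordinary least squares, which minimizes the corresponding empirical loss since each model is linear in its coefficients:

```python
import numpy as np

# Hypothetical data: targets are exactly linear in x, r_t = x_t.
x = np.array([-1.0, -0.5, 0.0, 0.5, 1.0])
r = x.copy()
N = len(x)

# Design matrices: columns are the basis functions of each model.
X_lin = np.column_stack([x, np.ones(N)])             # g_1: a_1*x + a_0
X_poly = np.column_stack([x**20, x**3, np.ones(N)])  # g_2: b_2*x^20 + b_1*x^3 + b_0

# Ordinary least squares gives the loss-minimizing coefficients,
# because both models are linear in their parameters.
a, *_ = np.linalg.lstsq(X_lin, r, rcond=None)
b, *_ = np.linalg.lstsq(X_poly, r, rcond=None)

E_lin = np.mean((r - X_lin @ a) ** 2)
E_poly = np.mean((r - X_poly @ b) ** 2)

print(E_lin, E_poly)  # E_lin is ~0; E_poly is strictly positive here
```

On this particular data the linear model fits perfectly while the degree-20/3 polynomial cannot, so the inequality fails; of course, a single numerical example only suggests the answer, it does not replace the argument about whether $g_2$ can represent every linear function.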
Edit: added the squares to the losses.