Consider two curves $\hat{g}_1$ and $\hat{g}_2$ defined by
$$ \hat{g}_1 = \operatorname*{argmin}_g \left(\sum_{i=1}^n (y_i - g(x_i))^2 + \lambda \int [g'(x)]^2 \, dx \right)$$
$$ \hat{g}_2 = \operatorname*{argmin}_g \left(\sum_{i=1}^n (y_i - g(x_i))^2 + \lambda \int [g''(x)]^2 \, dx \right)$$
As $\lambda$ gets larger, which curve $\hat{g}_i$ has the smaller training SSE? How about for testing SSE?
(I also asked this on Stats StackExchange, but got no reply.)
Both estimators are forms of penalized (spline) regression, and $\lambda$ tunes the bias-variance tradeoff. The training SSE can be compared directly in the limit. As $\lambda \to \infty$, the penalty term dominates: $\hat{g}_1$ is forced to satisfy $g' = 0$, i.e., it becomes the constant that minimizes SSE, which is the sample mean $\bar{y}$; $\hat{g}_2$ is forced to satisfy $g'' = 0$, i.e., it becomes the least-squares straight line. Since a constant is a special case of a line, $\hat{g}_2$ has training SSE less than or equal to that of $\hat{g}_1$. For testing SSE, it's impossible to say in general; it depends on how the data are generated. If the true relationship is close to constant, the extra flexibility of $\hat{g}_1$'s limit being exactly flat may make it test better; if the truth has a clear trend, $\hat{g}_2$ will typically win.
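The training-SSE claim in the large-$\lambda$ limit is easy to check numerically. A small sketch (data-generating process and seed are arbitrary choices for illustration): the constant fit is the sample mean, the linear fit is ordinary least squares, and the linear fit's training SSE can never exceed the constant fit's.

```python
import numpy as np

rng = np.random.default_rng(0)
x = np.linspace(0, 1, 50)
y = np.sin(2 * np.pi * x) + rng.normal(scale=0.3, size=x.size)

# As lambda -> infinity, g_hat_1 must satisfy g' = 0 (a constant);
# the SSE-minimizing constant is the sample mean of y.
sse_constant = np.sum((y - y.mean()) ** 2)

# g_hat_2 must satisfy g'' = 0 (a straight line);
# the SSE-minimizing line is the ordinary least-squares fit.
slope, intercept = np.polyfit(x, y, 1)
sse_linear = np.sum((y - (slope * x + intercept)) ** 2)

# A constant is a special case of a line, so the linear fit
# can never do worse on the training data.
print(sse_linear <= sse_constant)  # True
```

Swapping in a data-generating process where the truth is flat (e.g. `y = rng.normal(size=x.size)`) shows why the *test* comparison is ambiguous: the two limits then have nearly identical fits, and either can win on held-out data.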