The Hessian describes the curvature of the loss landscape that we're optimizing over. In particular, it gives the second-order term of the local quadratic (second-order Taylor) approximation of the loss around the current parameters.
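To make concrete what I mean by "local quadratic", here is a minimal sketch in plain Python. The toy `loss`, the evaluation point, and the helper names `gradient`, `hessian`, and `quadratic_model` are all made up for illustration, and the derivatives are estimated with finite differences rather than autodiff.

```python
def loss(w):
    # Hypothetical two-parameter toy loss, chosen only for illustration.
    x, y = w
    return x**4 + x * y + (1 + y) ** 2

def gradient(f, w, h=1e-5):
    # Central-difference estimate of the gradient of f at w.
    g = []
    for i in range(len(w)):
        wp, wm = list(w), list(w)
        wp[i] += h
        wm[i] -= h
        g.append((f(wp) - f(wm)) / (2 * h))
    return g

def hessian(f, w, h=1e-4):
    # Central-difference estimate of the Hessian of f at w.
    # Needs O(n^2) groups of loss evaluations for n parameters.
    n = len(w)
    H = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            wpp, wpm, wmp, wmm = list(w), list(w), list(w), list(w)
            wpp[i] += h; wpp[j] += h
            wpm[i] += h; wpm[j] -= h
            wmp[i] -= h; wmp[j] += h
            wmm[i] -= h; wmm[j] -= h
            H[i][j] = (f(wpp) - f(wpm) - f(wmp) + f(wmm)) / (4 * h * h)
    return H

def quadratic_model(f, w0, d):
    # Second-order Taylor model: f(w0) + g.d + 0.5 * d^T H d
    g = gradient(f, w0)
    H = hessian(f, w0)
    n = len(w0)
    linear = sum(g[i] * d[i] for i in range(n))
    quad = 0.5 * sum(d[i] * H[i][j] * d[j] for i in range(n) for j in range(n))
    return f(w0) + linear + quad

w0 = [1.0, 0.5]      # point around which we expand (arbitrary choice)
d = [0.01, -0.02]    # small step away from w0
true_value = loss([w0[0] + d[0], w0[1] + d[1]])
model_value = quadratic_model(loss, w0, d)
print(true_value, model_value)
```

For a small step `d` the two printed values should agree closely, and the gap grows as `d` gets larger, which is the sense in which the Hessian only describes the landscape locally.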
My question is: it seems like we could compute the Hessian much more efficiently if we just fit a local spline model to the loss. Why not do that?