From what I understand, preconditioners are a way to encode prior knowledge of the shape of the objective landscape to make it appear more circular (and thereby easier for a first-order method to solve).
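To make this concrete, here is a minimal NumPy sketch of the idea (the quadratic, step sizes, and diagonal preconditioner are all toy assumptions of mine, not from any particular library): on an ill-conditioned quadratic, plain gradient descent must use a tiny step size, while a preconditioned step makes the landscape look circular.

```python
import numpy as np

# Toy ill-conditioned quadratic: f(x) = 0.5 * x^T H x, with H = diag(1, 100).
H = np.diag([1.0, 100.0])
grad = lambda x: H @ x

x_plain = np.array([1.0, 1.0])
x_prec = np.array([1.0, 1.0])

# Diagonal preconditioner standing in for "domain knowledge" of the scaling.
P_inv = np.diag(1.0 / np.diag(H))

lr = 0.009  # plain step size must stay below 2/100 or the iteration diverges
for _ in range(100):
    x_plain = x_plain - lr * grad(x_plain)
    # Preconditioned step: P^{-1} * grad makes every direction look the same.
    x_prec = x_prec - 0.9 * (P_inv @ grad(x_prec))

print(np.linalg.norm(x_plain))  # still far from the minimizer at 0
print(np.linalg.norm(x_prec))   # essentially converged
```

The same mechanism appears in practice as, e.g., the optional preconditioner argument of iterative linear solvers.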
Do second-order optimization methods benefit from preconditioners? I suspect they don't, since they already infer a local quadratic model (and thereby don't need this additional information).
My follow-up question: are there any trade-offs between using a preconditioner and using a Hessian? Both seem to address the same issue -- capturing local curvature (one uses domain knowledge, the other computes this information).
Scaling an unconstrained problem does not affect Hessian-based optimizers. Any nonsingular linear transformation leaves the Newton direction unchanged ($\nabla^2 f\,\Delta x=-\nabla f$ and $A\nabla^2 f\,\Delta x=-A\nabla f$ have the same solution when $A$ is invertible). In finite-precision arithmetic, however, scaling can still help with poorly conditioned problems.
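A quick NumPy check of this invariance (the Hessian, gradient, and transformation below are arbitrary toy values I chose for illustration): left-multiplying the Newton system by any nonsingular $A$ yields the same step.

```python
import numpy as np

rng = np.random.default_rng(0)

H = np.array([[2.0, 0.5], [0.5, 1.0]])  # toy Hessian at the current point
g = np.array([1.0, -3.0])               # toy gradient at the current point

# Newton direction: solve  H @ dx = -g
dx = np.linalg.solve(H, -g)

# Left-multiply the whole system by a (random, hence almost surely
# nonsingular) matrix A: the solution is unchanged.
A = rng.standard_normal((2, 2))
dx_scaled = np.linalg.solve(A @ H, -(A @ g))

print(np.allclose(dx, dx_scaled))  # True: the Newton direction is the same
```

This is why, up to floating-point effects, rescaling variables or the objective does not change the path a pure Newton method takes.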
In constrained optimization, interior point methods also rely on the Hessian, but for them the relative scaling of variables and the scaling of constraints do affect the course of the algorithm.