I (sort of) understand that the Fisher information is $I(\theta) = -E_{\theta}\left[\frac{d^2}{d\theta^2} \log f(x|\theta)\right]$. What confuses me is why we bother taking the expectation with respect to $\theta$, instead of just evaluating $\frac{d^2}{d\theta^2} \log f(x|\theta)$ at $\theta = \hat{\theta}_{mle}$.
Wouldn't this point evaluation give a much better sense of how sharply peaked the log-likelihood is at its maximum than the whole expectation does? I worry that there could be many regions $\theta \neq \hat{\theta}_{mle}$ where the second derivative is very large, which could obscure the curvature being very low at $\theta = \hat{\theta}_{mle}$.
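
To make the two quantities I'm comparing concrete, here is a minimal sketch. It assumes a Cauchy location model, $f(x|\theta) = \frac{1}{\pi(1 + (x-\theta)^2)}$, which is purely my choice for illustration. The function names, the sample size, and the crude grid search for the MLE are all hypothetical helpers, not anything standard. The first quantity is the negative second derivative of the log-likelihood of one observed sample, evaluated at $\hat{\theta}_{mle}$. The second is a Monte Carlo estimate of $-E_{\theta}[\cdot]$, as I read the formula, obtained by averaging over fresh draws of $x$ from $f(x|\theta)$.

```python
import math
import random
import statistics


def d2_loglik(x, theta):
    """Second derivative in theta of log f(x|theta) for the Cauchy location model."""
    u = x - theta
    return -2.0 * (1.0 - u * u) / (1.0 + u * u) ** 2


def log_lik(xs, theta):
    """Log-likelihood of the whole sample xs at theta."""
    return -sum(math.log(math.pi * (1.0 + (x - theta) ** 2)) for x in xs)


def crude_mle(xs, half_width=2.0, steps=4000):
    """Illustrative grid search around the sample median (not an efficient optimizer)."""
    center = statistics.median(xs)
    grid = [center - half_width + 2.0 * half_width * i / steps for i in range(steps + 1)]
    return max(grid, key=lambda t: log_lik(xs, t))


def observed_information(xs, theta):
    """Negative second derivative of the sample log-likelihood, evaluated at theta."""
    return -sum(d2_loglik(x, theta) for x in xs)


def expected_information_mc(theta, n_draws=200_000, seed=0):
    """Monte Carlo estimate of -E_theta[d^2/dtheta^2 log f(x|theta)] for ONE observation.

    The average runs over x drawn from f(x|theta); theta itself stays fixed.
    """
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_draws):
        # Inverse-CDF draw from a Cauchy distribution centred at theta.
        x = theta + math.tan(math.pi * (rng.random() - 0.5))
        total += d2_loglik(x, theta)
    return -total / n_draws


if __name__ == "__main__":
    rng = random.Random(1)
    true_theta = 1.0  # assumed value, used only to simulate data
    n = 50
    xs = [true_theta + math.tan(math.pi * (rng.random() - 0.5)) for _ in range(n)]

    theta_hat = crude_mle(xs)
    print("theta_hat:", theta_hat)
    print("point evaluation at theta_hat:", observed_information(xs, theta_hat))
    print("n * expectation at theta_hat:", n * expected_information_mc(theta_hat))
```

In this sketch the point evaluation depends on the particular sample `xs`, while the expectation depends only on the value of $\theta$ plugged in, so the two numbers generally differ for this model.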