I'm trying to understand equation 5.11 from Lehmann and Casella's Theory of Point Estimation (2nd edition) which is presented without proof.
It states that if $I(\theta)$ represents the Fisher information that $X$ contains about the parameter $\theta$ and $\theta = h(\xi)$ where $h$ is a differentiable function, then $I^*(\xi)=I(h(\xi))\,(h'(\xi))^2$, where $I^*(\xi)$ represents the Fisher information that $X$ contains about the parameter $\xi$. In this case, $\theta$ and $\xi$ are both scalars (I know this can be generalized to the case where both parameters are vectors, but I'm not worried about that part yet).
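For concreteness, here is the simplest instance I could come up with (my own example, not from the book): take $X \sim N(\theta, 1)$, so $I(\theta) = 1$, and reparameterize by $\theta = h(\xi) = \xi^2$. The claim then says

$$I^*(\xi) = I(\xi^2)\,(2\xi)^2 = 4\xi^2,$$

which matches what I get by computing directly in the $\xi$-parameterization: $\frac{\partial}{\partial\xi}\log p(x;\xi^2) = (x-\xi^2)\cdot 2\xi$, so $I^*(\xi) = E\!\left[(X-\xi^2)^2\right]\cdot 4\xi^2 = 4\xi^2$.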
I don't see exactly why this is true. I'm guessing it has something to do with the chain rule, but I tend to get lost in the notation. I suppose we can write $I(\theta) = I(h(\xi)) = \int \left( \frac{p'[f(x,h(\xi))]\,\frac{df}{dh(\xi)}}{p[f(x,h(\xi))]} \right)^2 p[f(x,h(\xi))]\, dx$
Note that $\frac{p'[f(x,h(\xi))]\,\frac{df}{dh(\xi)}}{p[f(x,h(\xi))]}$ is the score function for the density $p_{\theta}(x)$, i.e. the derivative of $\log p_{\theta}(x)$ with respect to $\theta$.
Then $I(\theta)\,(h'(\xi))^2 = \int \left( \frac{p'[f(x,h(\xi))]\,\frac{df}{dh(\xi)}\,\frac{dh(\xi)}{d\xi}}{p[f(x,h(\xi))]} \right)^2 p[f(x,h(\xi))]\, dx = \int \left( \frac{p'[f(x,h(\xi))]\,\frac{df}{d\xi}}{p[f(x,h(\xi))]} \right)^2 p[f(x,h(\xi))]\, dx$
But this isn't $I^*(\xi)$. Or is it? Or is my understanding/notation just totally backward? I'd really appreciate any insight you all may have regarding this problem.
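To at least convince myself that the identity holds (separately from whether my derivation is right), I wrote a quick Monte Carlo check. The choice of family ($X \sim \mathrm{Exp}(\theta)$, so $I(\theta) = 1/\theta^2$) and of the reparameterization $h(\xi) = e^{\xi}$ are my own, picked so that the identity predicts $I^*(\xi) = e^{-2\xi}\cdot e^{2\xi} = 1$ for every $\xi$:

```python
# Numerical check of I*(xi) = I(h(xi)) * h'(xi)^2 for an exponential
# model p(x; theta) = theta * exp(-theta * x), where I(theta) = 1/theta^2.
# With theta = h(xi) = exp(xi), the identity predicts I*(xi) = 1 for all xi.
import math
import random

random.seed(0)

def log_p(x, xi):
    """log density of Exp(rate = exp(xi)) at x, written in terms of xi."""
    theta = math.exp(xi)
    return math.log(theta) - theta * x

def score_xi(x, xi, eps=1e-5):
    """d/dxi log p(x; h(xi)), by central finite differences."""
    return (log_p(x, xi + eps) - log_p(x, xi - eps)) / (2 * eps)

def fisher_info_xi(xi, n=200_000):
    """Monte Carlo estimate of I*(xi) = E[(d/dxi log p)^2]."""
    theta = math.exp(xi)
    total = 0.0
    for _ in range(n):
        x = random.expovariate(theta)  # sample X ~ Exp(theta)
        total += score_xi(x, xi) ** 2
    return total / n

for xi in (-1.0, 0.0, 2.0):
    est = fisher_info_xi(xi)
    predicted = (1 / math.exp(xi) ** 2) * math.exp(xi) ** 2  # I(h(xi)) * h'(xi)^2
    print(f"xi = {xi:+.1f}: MC estimate {est:.3f}, predicted {predicted:.3f}")
```

The estimates come out close to 1 for every $\xi$ I tried, so the identity itself certainly seems right; it's only the algebra above that I'm unsure about.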