I have a problem with the invariance property of the MLE, which states (cf. Casella-Berger, Statistical Inference):
"If $\hat\theta$ is the MLE of the parameter $\theta$ and $g(\cdot)$ is a $1$-to-$1$ transformation of $\theta$, then $\widehat{g(\theta)}=g(\widehat\theta)$".
My problem is that, in the proof, the book defines a new likelihood function for $g(\theta)$:
"If $\eta=g(\theta)$, then $L^\star(\eta\mid\underline x):=L(g^{-1}(\eta) \mid \underline x)$, and so we can verify that $\sup_\eta\{L^\star(\eta\mid\underline x)\}=\sup_\eta\{L(g^{-1}(\eta)\mid\underline x)\}=\sup_\theta\{L(\theta\mid\underline x)\}$;"
but this does not prove the invariance, because the invariance then follows directly from the new definition of $L^\star$. I think the book has to show that $L^\star(\eta\mid\underline x)=g(L(\theta\mid\underline x))$ to prove the assertion; otherwise it cannot talk about the MLE of $g(\theta)$.
In fact, this page has something like my idea: https://turing.une.edu.au/~stat354/notes/node69.html
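
To make the book's construction concrete, here is a minimal numerical sketch. The Bernoulli model, the data (7 successes in 10 trials), and the log-odds map $g(\theta)=\log\frac{\theta}{1-\theta}$ are my own assumptions for illustration and are not taken from the book. It only evaluates $L^\star(\eta\mid\underline x):=L(g^{-1}(\eta)\mid\underline x)$ on a grid and compares the two maximisers; it does not address whether that definition of $L^\star$ is justified.

```python
import math

# Hypothetical data, chosen only for illustration: s successes in n Bernoulli trials.
n, s = 10, 7

def L(theta):
    """Bernoulli likelihood L(theta | x), up to a constant factor."""
    return theta**s * (1 - theta)**(n - s)

def g(theta):
    """An assumed one-to-one reparametrisation: the log-odds of theta."""
    return math.log(theta / (1 - theta))

def g_inv(eta):
    """Inverse of g (the logistic function)."""
    return 1 / (1 + math.exp(-eta))

def L_star(eta):
    """Induced likelihood as defined in the quoted proof: L*(eta | x) = L(g^{-1}(eta) | x)."""
    return L(g_inv(eta))

# Crude grid search over theta in (0, 1), and over the image of that grid under g.
thetas = [i / 1000 for i in range(1, 1000)]
etas = [g(t) for t in thetas]

theta_hat = max(thetas, key=L)     # grid maximiser of L
eta_hat = max(etas, key=L_star)    # grid maximiser of L*

print("theta_hat    =", theta_hat)
print("g(theta_hat) =", g(theta_hat))
print("eta_hat      =", eta_hat)
print("sup L, sup L* =", L(theta_hat), L_star(eta_hat))
```

On this grid the maximiser of $L^\star$ coincides with $g(\hat\theta)$ and the two suprema agree, which is exactly what the quoted chain of equalities says. My doubt is about why $L^\star$ defined this way deserves to be called the likelihood of $\eta$, not about this computation.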