Prompt
Show that for the maximum likelihood estimator (MLE) $\theta_{ML}$ of a parameter $\theta \in \mathbb{R}^d$ and any differentiable function $g: \mathbb{R}^d \rightarrow \mathbb{R}^k$, the MLE of $g(\theta)$ is $g(\theta_{ML})$.
Attempt
I've been able to prove this for a one-to-one (invertible) $g$ as follows, but I am having trouble extending it to a $g$ that is many-to-one.
Let $\beta = g(\theta)$ and assume $g$ is one-to-one.
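\begin{align} \theta &= g^{-1}(\beta) \\ \mathcal{L}(\theta) &= \mathcal{L}(g^{-1}(\beta)) \\ \theta_{ML} &= g^{-1}(\beta_{ML}) \\ \beta_{ML} &= g(\theta_{ML}) \end{align}
Here $\mathcal{L}$ is the likelihood and $\beta_{ML}$ is the maximizer of $\mathcal{L}(g^{-1}(\beta))$ over $\beta$. Since $\mathcal{L}$ is maximized at $\theta_{ML}$, the reparameterized likelihood is maximized exactly when $g^{-1}(\beta) = \theta_{ML}$, which gives the third line.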
The above shows that the MLE of $g(\theta)$ is $g(\theta_{ML})$ for a $g$ that is one-to-one.
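As a sanity check on the one-to-one case, here is a minimal numerical sketch. Everything in it is an assumption of mine for illustration, not part of the prompt: a Bernoulli model with 7 successes in 20 trials, the log-odds map as $g$, and a hand-written golden-section search (`argmax_unimodal`) as the maximizer. It maximizes the likelihood once over $\theta$ and once over $\beta = g(\theta)$, then compares $\beta_{ML}$ with $g(\theta_{ML})$.

```python
import math

# Hypothetical data for illustration: 7 successes in 20 Bernoulli trials.
SUCCESSES, TRIALS = 7, 20

def log_lik(theta):
    """Bernoulli log-likelihood as a function of the success probability."""
    return SUCCESSES * math.log(theta) + (TRIALS - SUCCESSES) * math.log(1.0 - theta)

def g(theta):
    """A one-to-one map on (0, 1): the log-odds."""
    return math.log(theta / (1.0 - theta))

def g_inv(beta):
    """Inverse of g: the logistic function."""
    return 1.0 / (1.0 + math.exp(-beta))

def argmax_unimodal(f, lo, hi, tol=1e-8):
    """Golden-section search for the maximizer of a unimodal f on [lo, hi]."""
    inv_phi = (math.sqrt(5.0) - 1.0) / 2.0
    a, b = lo, hi
    while b - a > tol:
        c = b - inv_phi * (b - a)
        d = a + inv_phi * (b - a)
        if f(c) > f(d):
            b = d  # the maximizer lies in [a, d]
        else:
            a = c  # the maximizer lies in [c, b]
    return (a + b) / 2.0

# Maximize directly over theta.
theta_ml = argmax_unimodal(log_lik, 1e-6, 1.0 - 1e-6)

# Maximize the reparameterized likelihood L(g^{-1}(beta)) over beta.
beta_ml = argmax_unimodal(lambda beta: log_lik(g_inv(beta)), -10.0, 10.0)

print(theta_ml, beta_ml, g(theta_ml))
```

Up to the tolerance of the search, the last two printed values should agree, which is the statement $\beta_{ML} = g(\theta_{ML})$ for this particular invertible $g$. It says nothing about the many-to-one case.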
Notes
Please give hints or examples on how to extend this proof to a many-to-one function $g$. Thank you!