Consider the following constrained maximum likelihood problem: \begin{align*} \min\limits_{\theta \in \mathbb R^d}~ & -\log p(x_{1:n};\theta) \\ {\rm s.t.} ~~& f(\theta)=0. \end{align*}
Let $F(\theta)=\nabla^\top f(\theta)$ denote the Jacobian of the constraint, and let $U(\theta)$ be a matrix satisfying $F(\theta)U(\theta)=0$ and $U^\top(\theta)U(\theta)=I$. In addition, define $B(\theta)=U(\theta)\left(U^\top(\theta)\,\mathcal I(\theta)\,U(\theta)\right)^{-1}U^\top(\theta)$, where $\mathcal I(\theta)$ is the Fisher information matrix (I write $\mathcal I$ rather than $F$ to avoid a clash with the constraint Jacobian). Then, paper [1] states that under some regularity conditions, the constrained maximum likelihood estimate $\hat \theta_n$ is asymptotically normal, i.e., \begin{equation*} \sqrt{n}\,(\hat \theta_n - \theta^o) \xrightarrow{d} \mathcal N\bigl(0,\, B(\theta^o)\bigr), \end{equation*} where $\theta^o$ is the true parameter.
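A side observation of my own (not stated this way in [1]): since $U(\theta)$ is the leftmost factor of $B(\theta)$ and $F(\theta)U(\theta)=0$, it follows immediately that \begin{equation*} F(\theta)\,B(\theta)=0, \end{equation*} so $B(\theta^o)$ is singular, with the rows of $F(\theta^o)$ spanning (part of) its null space.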
Here is my confusion: the limiting Gaussian distribution appears to have support on all of $\mathbb R^d$, while the constrained maximum likelihood estimate $\hat \theta_n$ must satisfy $f(\hat \theta_n)=0$, and the feasible set $\{\theta \in \mathbb R^d : f(\theta)=0\}$ is in general a lower-dimensional subset of $\mathbb R^d$. How, then, should the asymptotic normality of the constrained maximum likelihood estimate be understood?
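To make the question concrete, here is a small self-contained simulation of a toy example I made up (not from [1]): estimating the mean of a bivariate standard Gaussian subject to the unit-circle constraint $f(\theta)=\theta^\top\theta-1=0$. Empirically, the scaled estimation error concentrates on the tangent direction of the constraint set, matching a singular $B(\theta^o)$:

```python
import numpy as np

rng = np.random.default_rng(0)
theta0 = np.array([1.0, 0.0])   # true parameter, on the unit circle
n, reps = 2000, 5000            # sample size, Monte Carlo repetitions

# Model: x_i ~ N(theta, I_2) with constraint ||theta|| = 1.
# The sample mean is sufficient, so draw it directly: xbar ~ N(theta0, I_2 / n).
xbar = theta0 + rng.standard_normal((reps, 2)) / np.sqrt(n)

# Constrained MLE: project the sample mean onto the unit circle.
theta_hat = xbar / np.linalg.norm(xbar, axis=1, keepdims=True)

# Empirical covariance of sqrt(n) * (theta_hat - theta0).
err = np.sqrt(n) * (theta_hat - theta0)
emp_cov = np.cov(err.T)

# Theoretical B(theta0): here F(theta) = 2 theta^T, U(theta) = (-theta_2, theta_1)^T
# spans the tangent space, and the Fisher information is I_2 per observation, so
# B = U (U^T U)^{-1} U^T = U U^T, a rank-1 (singular) matrix.
U = np.array([[-theta0[1]], [theta0[0]]])
B = U @ U.T

print(np.round(emp_cov, 2))      # close to [[0, 0], [0, 1]]
print(np.linalg.eigvalsh(B))     # one zero eigenvalue: a degenerate Gaussian limit
```

The estimates all sit exactly on the circle, yet the scaled errors look Gaussian along the tangent direction, which seems to be exactly the tension I am asking about.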
Thanks!
[1] Moore, Terrence J., Brian M. Sadler, and Richard J. Kozick. "Maximum-likelihood estimation, the Cramér–Rao bound, and the method of scoring with parameter constraints." IEEE Transactions on Signal Processing 56.3 (2008): 895-908.