Suppose that we have a parameter space $\Theta$ that is NOT compact.
The M-estimator $\widehat{\theta}_{n}$ is defined as a maximizer of $M_{n}\left(\theta\right)=\sum_{i=1}^{n}m_{\theta}\left(X_{i}\right)$, and $\theta^{*}$ is a maximizer of $M\left(\theta\right)=\mathbb{E}\left[m_{\theta}\left(X\right)\right]$, for some functions $m_{\theta}$, where $X_{1},\ldots,X_{n}$ are i.i.d. random variables with pdf $f$.
The expectation $\mathbb{E}$ is with respect to $f$. Assume that there exists a compact set $S\subset \Theta$ such that $\theta^{*}\in S$ and
\begin{equation} \mathbb{E}\left[\sup_{\theta\in\Theta\cap S^{c}}m_{\theta}\left(X\right)\right]<M\left(\theta^{*}\right).\label{eq:lessinexpect} \end{equation} How can we show that, almost surely, $\widehat{\theta}_{n}$ is in the compact set $S$?
I am maybe 6 years late, but since I encountered the same problem recently, I will post my solution for future reference.
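Notation: below, $\mathcal{H}$ denotes the parameter space ($\Theta$ in the question), $K$ the compact set ($S$ in the question), and $Y_i$ the observations ($X_i$ in the question). Let $A_n = \{\omega\in \Omega : \hat{\theta}_n(\omega)\in \mathcal{H}\cap K^c\}$ and define \begin{align*} A = \limsup_{n\to\infty} A_n, \end{align*} the event that $\hat{\theta}_n$ lies outside $K$ for infinitely many $n$. By assumption, the $Y_i$ are i.i.d. If we permute finitely many indices of the $Y_i$, the value of $\hat{\theta}_n$ is unchanged for all sufficiently large $n$; that is, the occurrence of $A$ is unchanged by any finite permutation of the indices of the $Y_i$. By the Hewitt-Savage zero-one law, \begin{align*} P(A) \in \{0,1\}. \end{align*} To prove $P(A)=0$, it is therefore sufficient to prove $P(A)\neq 1$.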
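To spell out the permutation-invariance step, here is a short sketch. It assumes that $\hat{\theta}_n$ is selected as a symmetric function of $(Y_1,\ldots,Y_n)$, for instance by a fixed rule applied to the set of maximizers of the empirical criterion; this assumption is not stated in the question. Let $\pi$ be a permutation of the indices that only moves indices $\leq N$. Then for every $n\geq N$, \begin{align*} \sum_{i=1}^{n} m_{\theta}\left(Y_{\pi(i)}\right) = \sum_{i=1}^{n} m_{\theta}\left(Y_{i}\right) \quad \text{for all } \theta\in\mathcal{H}, \end{align*} because $\pi$ merely reorders the first $N$ terms of the sum. The permuted sample therefore yields the same criterion function, hence the same $\hat{\theta}_n$, for all $n\geq N$. Since $\limsup A_n$ depends only on the behaviour of $\hat{\theta}_n$ for large $n$, the event $A$ is unchanged.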
For the sake of contradiction, suppose $P(A)=1$. Whenever $\hat{\theta}_n\in \mathcal{H}\cap K^c$, we have \begin{align*} \frac{1}{n}\sum_{i=1}^n m_{\hat{\theta}_n}(Y_i)\leq \frac{1}{n}\sum_{i=1}^n \sup_{\theta\in \mathcal{H}\cap K^c}m_{\theta}(Y_i). \end{align*} On $A$ this inequality holds for infinitely many $n$, so \begin{align*} \liminf_{n\to\infty}\left(\frac{1}{n}\sum_{i=1}^n m_{\hat{\theta}_n}(Y_i)1_{A}\right)\leq\limsup_{n\to\infty}\left( \frac{1}{n}\sum_{i=1}^n \sup_{\theta\in \mathcal{H}\cap K^c}m_{\theta}(Y_i)1_{A}\right). \end{align*} By the strong law of large numbers, together with $P(A)=1$ and the integrability condition $\mathbb{E}\left|\sup_{\theta\in \mathcal{H}\cap K^c}m_{\theta}(Y)\right|<\infty$, we have almost surely \begin{align*} \limsup_{n\to\infty}\left( \frac{1}{n}\sum_{i=1}^n \sup_{\theta\in \mathcal{H}\cap K^c}m_{\theta}(Y_i)1_{A}\right) = \mathbb{E}\sup_{\theta\in \mathcal{H}\cap K^c}m_{\theta}(Y). \end{align*}
On the other hand, since $\hat{\theta}_n$ maximizes the empirical criterion, \begin{align*} \frac{1}{n}\sum_{i=1}^n m_{\theta^*}(Y_i)1_{A}\leq \frac{1}{n}\sum_{i=1}^n m_{\hat{\theta}_n}(Y_i)1_{A}. \end{align*} By the law of large numbers and $P(A)=1$, the left-hand side converges almost surely to $\mathbb{E}\,m_{\theta^*}(Y)$. Hence \begin{align*} \mathbb{E}\,m_{\theta^*}(Y)\leq \liminf_{n\to\infty} \left(\frac{1}{n}\sum_{i=1}^n m_{\hat{\theta}_n}(Y_i)1_{A}\right)\leq \mathbb{E}\sup_{\theta\in \mathcal{H}\cap K^c}m_{\theta}(Y). \end{align*} This contradicts the assumption \begin{align*} \mathbb{E}\sup_{\theta\in \mathcal{H}\cap K^c}m_{\theta}(Y)< \mathbb{E}\,m_{\theta^*}(Y). \end{align*} We conclude $P(A)=0$.
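Since $P(A)=0$, almost surely $\hat{\theta}_n$ lies in the compact set for all sufficiently large $n$, which is the almost-sure statement asked for. Next, apply the inequality for the $\limsup$ of sets (the reverse Fatou lemma) \begin{align*} \limsup_{n\to\infty} P(A_n) \leq P(\limsup_{n\to\infty} A_n) = P(A) = 0. \end{align*} This also gives $\lim_{n\rightarrow\infty}P(A_n)=0$.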
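As a sanity check of the result, here is a minimal simulation sketch on a toy example that is not part of the question: it assumes $m_\theta(x) = -(x-\theta)^2$ on the non-compact space $\mathbb{R}$, standard normal data, and the compact set $[-3, 3]$ (the function names `sup_outside` and `simulate` are hypothetical). In this case $\theta^* = 0$, $M(\theta^*) = -1$, and the M-estimator is the sample mean, so both sides of the assumed inequality can be estimated directly.

```python
import random


def sup_outside(x, k):
    """sup over |theta| > k of m_theta(x) = -(x - theta)**2."""
    # Distance from x to the region {|theta| >= k}; zero if x is already outside [-k, k].
    gap = max(k - abs(x), 0.0)
    return -gap * gap


def simulate(n=50, reps=2000, k=3.0, seed=0):
    rng = random.Random(seed)
    # Monte Carlo estimate of E[sup_{|theta| > k} m_theta(X)] for X ~ N(0, 1).
    draws = [rng.gauss(0.0, 1.0) for _ in range(100_000)]
    e_sup = sum(sup_outside(x, k) for x in draws) / len(draws)
    # For this choice of m_theta, the maximizer of the empirical criterion
    # is the sample mean of the n observations.
    outside = 0
    for _ in range(reps):
        theta_hat = sum(rng.gauss(0.0, 1.0) for _ in range(n)) / n
        outside += abs(theta_hat) > k
    return e_sup, outside / reps


e_sup, frac_outside = simulate()
# Compare e_sup with M(theta*) = -Var(X) = -1, and inspect how often
# the estimator fell outside [-k, k].
print(e_sup, frac_outside)
```

In this toy setting the estimated expectation of the supremum should fall well below $-1$, so the assumption holds, and the fraction of replications with the estimator outside $[-3,3]$ should be negligible. This only illustrates the statement; it does not replace the proof.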