Let $X_1, \dots, X_n$ be i.i.d. random variables with continuous density $g(x|\theta_0)$, where $\theta_0 \in \Theta \subset \mathbb{R}$ and $\Theta = \{ \theta_0, \theta_1, \dots, \theta_m \}$ (the parameter space is finite, so we are in a very simple case).
We define the log-likelihood as
$$ \ell_n(\theta) : = \sum_{i= 1}^n \log g(X_i | \theta)$$
Suppose we have proven that $\ell_n(\theta_0) - \ell_n(\theta_j) \rightarrow \infty$ almost surely for every $j \ne 0$. Fix an $\epsilon > 0$; then there exists an $n_0 \in \mathbb{N}$ s.t., defining
$$A_j = \{ \ell_n(\theta_0) - \ell_n(\theta_j) > \epsilon , \ \forall{n} > n_0 \},$$ we have $P(A_j) > 1- \delta$, where $\delta > 0$ is arbitrary (this can be done since we know that $\ell_n(\theta_0) - \ell_n(\theta_j) \rightarrow \infty$ almost surely).
Then we have that
$$ P \left( \bigcap_{j = 1}^m A_j \right) \ge 1- \sum_{j=1}^mP \left( A_j^c \right) \ge 1- m \delta $$
We have thus shown that
$ P\{ \ell_n(\theta_0) - \ell_n(\theta_j) > \epsilon , \ \forall{j} \ne 0 , \ \forall{n} > n_0 \} \ge 1- m \delta \tag{1}$
It is then apparently immediate (and here is my problem) that the maximum likelihood estimator $ \hat{\theta} : = \arg \max_\theta \ell_n(\theta)$ converges almost surely to $\theta_0$.
Why is this last step "obvious"?
My attempt:
In particular, $(1)$ implies that $ P\{ \ell_n(\theta_0) - \max_{\theta \ne \theta_0} \ell_n(\theta) > \epsilon , \ \forall{n} > n_0 \} \ge 1- m \delta$ (the maximum must exclude $\theta_0$, otherwise the difference is $\le 0$),
and somehow we should get that $\ell_n ( \hat{\theta})$ and $\ell_n(\theta_0)$ coincide eventually, almost surely, which should give us $\hat{\theta} \rightarrow \theta_0$.
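As a numerical sanity check of the claim itself, here is a small simulation. The Gaussian family $N(\theta, 1)$ and the particular finite grid $\Theta$ are my own choices, not part of the argument above; the point is that over a finite parameter space the MLE locks onto $\theta_0$ and the log-likelihood gap $\ell_n(\theta_0) - \max_{\theta \ne \theta_0}\ell_n(\theta)$ grows with $n$:

```python
import numpy as np

rng = np.random.default_rng(0)

# Finite parameter space Theta = {theta_0, ..., theta_m}, true value theta_0 = 0.
# g(x | theta) is chosen to be the N(theta, 1) density (my own choice).
Theta = np.array([0.0, 0.5, 1.0, 2.0])
theta_0 = Theta[0]

def log_lik(x, thetas):
    """ell_n(theta) = sum_i log g(x_i | theta) for g = N(theta, 1).

    The additive constant -n/2 * log(2*pi) is dropped: it cancels in
    differences and does not affect the argmax.
    """
    return -0.5 * np.sum((x[:, None] - thetas[None, :]) ** 2, axis=0)

for n in [10, 100, 10_000]:
    x = rng.normal(theta_0, 1.0, size=n)
    ell = log_lik(x, Theta)
    theta_hat = Theta[np.argmax(ell)]          # MLE over the finite grid
    gap = ell[0] - np.delete(ell, 0).max()     # ell_n(theta_0) - max_{j != 0} ell_n(theta_j)
    print(f"n={n:>6}  theta_hat={theta_hat}  gap={gap:.1f}")
```

For large $n$ the gap is large and positive, which is exactly the event in $(1)$.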
Observe that $\bigcap_{j=1}^mA_j\subset\bigcap_{n\gt n_0}\{\operatorname{argmax}_\theta\ell_n(\theta)= \theta_0\}$: on every $A_j$ we have $\ell_n(\theta_0) > \ell_n(\theta_j) + \epsilon$ for all $n > n_0$, so the maximizer can only be $\theta_0$.
Hence, taking probabilities and passing to complements, we have shown that $$ \Pr\left(\bigcup_{n\gt n_0}\{\operatorname{argmax}_\theta\ell_n(\theta)\neq \theta_0\}\right)\leqslant m\delta. $$
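To make the last step fully explicit (a sketch, spelling out why the bound upgrades to almost sure convergence): the event in $(1)$ pins down the maximizer, and letting $\delta \downarrow 0$ does the rest. On $\bigcap_{j=1}^m A_j$ we have $\ell_n(\theta_0) > \ell_n(\theta_j) + \epsilon$ for every $j \neq 0$ and every $n > n_0$, hence $\hat{\theta}_n = \theta_0$ for all $n > n_0$. Therefore
$$ P\left( \exists\, n_0 : \hat{\theta}_n = \theta_0 \ \forall n > n_0 \right) \ge 1 - m\delta \quad \text{for every } \delta > 0, $$
so this probability equals $1$. Since $\Theta$ is finite, "$\hat{\theta}_n = \theta_0$ eventually" is exactly the statement "$\hat{\theta}_n \to \theta_0$", so the MLE converges to $\theta_0$ almost surely.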