Convergence of arguments in optimization

This question is motivated by non-parametric maximum likelihood estimation in statistics, but I suspect it applies more generally to any optimization problem.

Let $\{x_1,x_2,\dots,x_n\}$ be a data sample and $F$ be a non-parametric class of densities. In non-parametric MLE, one wants to solve

$$\max_{f \in F}\ \ell(f), \quad \text{where } \ell(f):=\frac{1}{n}\sum_{i=1}^n \log f(x_i).$$
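As a concrete, purely illustrative sketch of the objective above: the code below evaluates the empirical log-likelihood $\ell(f)$ for a few candidate densities. The Gaussian candidate family and the sample values are my own assumptions, not part of the question.

```python
import math

def gaussian_pdf(x, mu, sigma):
    """Density of N(mu, sigma^2) at x."""
    return math.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * math.sqrt(2 * math.pi))

def log_likelihood(f, sample):
    """Empirical log-likelihood l(f) = (1/n) * sum_i log f(x_i)."""
    return sum(math.log(f(x)) for x in sample) / len(sample)

# Hypothetical small sample, centred near 0 with small spread.
sample = [0.1, -0.3, 0.25, 0.05, -0.1]

# Three Gaussian candidates standing in for elements of F.
candidates = {
    "N(0, 1)":   lambda x: gaussian_pdf(x, 0.0, 1.0),
    "N(0, 0.2)": lambda x: gaussian_pdf(x, 0.0, 0.2),
    "N(2, 1)":   lambda x: gaussian_pdf(x, 2.0, 1.0),
}
scores = {name: log_likelihood(f, sample) for name, f in candidates.items()}
# The candidate whose shape matches the sample scores highest.
```

Here the candidate with the right mean and spread, `N(0, 0.2)`, attains the largest $\ell(f)$ of the three.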

Now, in general, for two topological spaces $X$ and $Y$ and a mapping $g:X \rightarrow Y$, the point $a\in Y$ is a limit of $g$ at $x_0 \in X$ if for any neighbourhood $V$ of $a$ there exists a neighbourhood $U$ of $x_0$ such that for any $x \in U-\{x_0\}$, $g(x) \in V$.

I am trying to translate this into the MLE problem. It seems my $g$ should be $\ell: F \rightarrow \mathbb{R}$, since that is the function I am maximising. What confuses me is that in MLE problems we care less about the exact value $\ell(\hat{f})$ than about the argmax $\hat{f}$ itself. So the convergence I am interested in is: if a sequence of functions $(f_k)$ satisfies $\ell(f_k)\rightarrow \ell(\hat{f})$, under what conditions does $f_k \rightarrow \hat{f}$? It seems I would need some kind of inverse $\ell^{-1}$, but this is not well-defined: even if I think of the map as $f \mapsto (f(x_1),\dots,f(x_n)) \in \mathbb{R}^n$, an inverse would have to take these finitely many values and somehow recover an entire function.
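The non-injectivity worry can be made concrete: $\ell$ only sees $f$ at the $n$ sample points, so two densities that agree there but differ elsewhere get exactly the same $\ell(f)$. The sketch below (sample points and both densities are my own construction, not from the question) exhibits two genuinely different densities on $[-0.5, 2.5]$ with identical empirical log-likelihood.

```python
import math

def log_likelihood(f, sample):
    """Empirical log-likelihood l(f) = (1/n) * sum_i log f(x_i)."""
    return sum(math.log(f(x)) for x in sample) / len(sample)

sample = [0.0, 1.0, 2.0]

def f1(x):
    # Uniform density on [-0.5, 2.5]; equals 1/3 at every sample point.
    return 1/3 if -0.5 <= x <= 2.5 else 1e-300

def f2(x):
    # A different density on the same interval: the sine perturbation
    # vanishes at x = 0, 1, 2 and integrates to zero over [-0.5, 2.5],
    # so f2 is still a density, agrees with f1 on the sample, yet
    # differs from f1 between the sample points.
    if -0.5 <= x <= 2.5:
        return 1/3 + 0.2 * math.sin(2 * math.pi * x)
    return 1e-300

# l(f1) == l(f2) even though f1 != f2 as functions.
gap_at_sample = abs(log_likelihood(f1, sample) - log_likelihood(f2, sample))
gap_between = abs(f1(0.25) - f2(0.25))
```

So on its own, closeness of $\ell(f_k)$ to $\ell(\hat{f})$ cannot force closeness of $f_k$ to $\hat{f}$; some extra structure on $F$ (compactness, identifiability, a metric in which $\ell$ is suitably continuous) is needed.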

I feel like I am facing a chicken-and-egg problem. What am I missing?