Problem with distribution, maximum likelihood estimation


I refer to https://brilliant.org/wiki/maximum-likelihood-estimation-mle/. It gives the definition: $$L=f(x_1\mid\theta)f(x_2\mid\theta)\ldots f(x_n\mid\theta)$$ Shouldn't there be $f_0$ instead of $f$? It doesn't make sense as written, since $f$ is defined as a family of distributions that depends on some parameters.


This is an example of parametric estimation. Since entire functions are difficult to estimate, we often parameterize a family $f(x_i\mid\theta)$, where $\theta \in \Theta$. We typically assume the true data-generating process corresponds to some $\theta_0 \in \Theta$. We would like our estimator $\hat{\theta}(X)$, computed from the data $X$, to be unbiased, so that $\mathbb{E}_X[\hat{\theta}(X)] = \theta_0$, or at least consistent, so that $\lim_{n \rightarrow \infty} \hat{\theta}(X_n) = \theta_0$ in some meaningful sense (usually almost surely or in probability). The major frameworks for deriving estimators in this family are maximum likelihood, the method of moments or generalized method of moments, and $M$-estimation.
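As a concrete sketch of this (the normal-location model, NumPy, and the specific seed and sample sizes are my own illustrative choices, not from the wiki page): for $X_i \sim N(\theta, 1)$, the likelihood $L = \prod_i f(x_i \mid \theta)$ is maximized in closed form by the sample mean, and you can watch the estimate approach $\theta_0$ as $n$ grows, which is the consistency property described above.

```python
import numpy as np

rng = np.random.default_rng(0)
theta0 = 2.0  # the "true" parameter of the assumed data-generating process

# For X_i ~ N(theta, 1), maximizing L = prod_i f(x_i | theta)
# gives the MLE in closed form: theta_hat is the sample mean.
estimates = {}
for n in (10, 1000, 100000):
    x = rng.normal(theta0, 1.0, size=n)
    estimates[n] = x.mean()  # the MLE of theta for this sample
    print(n, estimates[n])
```

With a fixed seed this is deterministic, but the point is qualitative: the gap between $\hat{\theta}$ and $\theta_0$ shrinks as $n$ grows.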

In contrast, if you want to do non-parametric estimation, you use the data to estimate $f(x)$ directly. That includes, for example, histograms and kernel density estimation. The goal is then a consistent estimate $\hat{f}(x)$ of an entire function $f(x)$ as your data get large, rather than limiting yourself to parameters in $\mathbb{R}^N$.
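For instance, here is a bare-bones Gaussian kernel density estimator (my own sketch, assuming NumPy; the sample, bandwidth, and evaluation grid are illustrative choices): $\hat{f}(t)$ is an average of kernel bumps centered at the data points, and with enough data it tracks the true density pointwise.

```python
import numpy as np

rng = np.random.default_rng(1)
data = rng.normal(0.0, 1.0, size=5000)  # sample from an "unknown" f

def kde(t, data, h):
    """Gaussian kernel density estimate f_hat(t) with bandwidth h."""
    u = (t - data[:, None]) / h          # shape: (n_data, n_grid)
    return np.exp(-0.5 * u**2).mean(axis=0) / (h * np.sqrt(2 * np.pi))

grid = np.linspace(-3.0, 3.0, 13)
f_hat = kde(grid, data, h=0.3)
f_true = np.exp(-0.5 * grid**2) / np.sqrt(2 * np.pi)  # true N(0,1) density
print(np.abs(f_hat - f_true).max())
```

Note there is no finite-dimensional $\theta$ anywhere: the estimate is a function, and the bandwidth $h$ plays the role of a smoothing choice rather than a model parameter.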

Then there is semi-parametric estimation, where you typically model the mean $m(x_i,\theta)$ parametrically but also estimate the density of the error terms $\varepsilon_i$, so the model is $m(x_i,\theta) + \varepsilon_i$, where $f(\varepsilon)$ is to be estimated as well. And there are many more variants.

But when you're wondering about the limits of the exercise you're doing right now, it's sometimes nice to look ahead and contrast it with what else is out there.