I have a problem with MLE's definition:
Casella and Berger in *Statistical Inference* and Nitis Mukhopadhyay in *Probability and Statistics* define the MLE of a parameter $\theta\in\Theta$ as, respectively, $\arg\sup_{\theta\in\Theta}\{L(\theta\mid x)\}$ and $\arg\max_{\theta\in\Theta}\{L(\theta\mid x)\}$.
So is this estimator defined via a supremum or a maximum? If it is a supremum, why don't we call it the supremum likelihood estimator? Conversely, if it is a maximum, we need conditions guaranteeing that a maximum exists, for instance that the likelihood is continuous in $\theta$ and the parameter space is compact (a sufficient condition, by the extreme value theorem).
Which definition is correct, and what is the minimal condition for the MLE to exist?
Sufficient conditions for the MLE to exist are that $L$ is continuous in $\theta$ and that the parameter space $\Theta$ is compact (extreme value theorem).
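As a quick numerical sanity check of the sufficient condition, here is a sketch (with hypothetical data: 7 successes in 10 Bernoulli trials) where the likelihood is continuous on the compact parameter space $\Theta=[0,1]$, so a maximizer is guaranteed to exist; a grid search recovers the known analytic MLE $k/n$:

```python
import numpy as np

# Bernoulli likelihood L(p) = p^k (1-p)^(n-k) is continuous on the
# compact parameter space Theta = [0, 1], so by the extreme value
# theorem a maximizer exists. Analytically, the MLE is k/n.
n, k = 10, 7  # hypothetical sample: 7 successes in 10 trials
p_grid = np.linspace(0.0, 1.0, 100001)
L = p_grid**k * (1.0 - p_grid)**(n - k)
p_hat = p_grid[np.argmax(L)]
print(p_hat)  # close to the analytic MLE k/n = 0.7
```

Of course, compactness plus continuity is only sufficient, not necessary: here $\Theta=(0,1)$ open would still yield an interior maximizer.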
We can build an example where the maximum over $\theta$ does not exist but the supremum does. The remaining difficulty is whether such a "supremum likelihood" estimator is meaningful...
If $U$ is uniformly distributed on the open interval $(a,b)$, where $\theta=(a,b)$ collects the parameters, then the likelihood of a sample $u_1,\dots,u_n$ is $$ L(\theta)=\frac{1}{(b-a)^n} \quad \text{ if } a<u_i<b \text{ for all } i, $$ and $0$ otherwise. The supremum $1/(\max_i u_i-\min_i u_i)^n$ is approached as $a\uparrow\min_i u_i$ and $b\downarrow\max_i u_i$, but no admissible $(a,b)$ attains it, so the MLE does not exist while the supremum of the likelihood does. See for instance the lecture notes of Songfeng Zheng (pp. 5–6) for a discussion: http://people.missouristate.edu/songfengzheng/Teaching/MTH541/Lecture%20notes/MLE.pdf