I'm trying to understand intuitively why the MLE of $\theta$ for $$f_X(x) =\frac{1}{\theta}, \ \ 0\leq x\leq\theta$$ is what it is. For reference, I am using Example 5 from a paper I found online. The answer given there is $\hat{\theta}_{MLE}= \max(X_1,\dots,X_n)= x_{(n)}$.
We are trying to find the most likely value of the parameter $\theta$; in essence, we want the location at which the likelihood function $L$ attains its maximum. Here is $L$:
$$L(\theta) = \frac{1}{\theta^n}$$
Now, if we look at the crude plot of $L(\theta)$ below, we see that $L$ is actually at its minimum value at $x_{(n)}$. So why do we choose $x_{(n)}$ and not $x_{(1)}$? Shouldn't we want the point where the plot is at its maximum? To put it another way: if we are trying to find the value of $\theta$ that is most likely, shouldn't it be the value at which the likelihood function is highest, i.e. $x_{(1)}$?
[plot of $L(\theta)$, a curve decreasing in $\theta$]
The probability distribution here is the uniform distribution on $[0,\theta]$. That means that if we observe some collection of values $X_i$ from it, $\theta$ must be $\ge$ all of them. The maximum of the $X_i$ is the smallest $\theta$ that can possibly result in our observed values; since this smallest possible $\theta$ gives the highest probabilities, it's the MLE.
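Spelled out, the constraint comes from the indicator hidden in the density, which the plot in the question omits:

$$L(\theta) = \prod_{i=1}^{n} \frac{1}{\theta}\,\mathbf{1}\{0 \le X_i \le \theta\} = \frac{1}{\theta^{n}}\,\mathbf{1}\{\theta \ge x_{(n)}\}.$$

So $L(\theta) = 0$ for every $\theta < x_{(n)}$ (including $x_{(1)}$), and for $\theta \ge x_{(n)}$ it decreases like $\theta^{-n}$; the maximum is therefore exactly at $\theta = x_{(n)}$.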
Short version: we don't take the minimum because the minimum is impossible.
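As a sanity check, here is a minimal numerical sketch (using NumPy; the true parameter, sample size, and grid are made up for illustration): it evaluates the likelihood over a grid of candidate $\theta$ values and confirms the peak sits at $\max(X_i)$.

```python
import numpy as np

rng = np.random.default_rng(0)
theta_true = 5.0  # hypothetical "true" parameter, only used to draw the sample
data = rng.uniform(0.0, theta_true, size=20)

def likelihood(theta, data):
    """Uniform(0, theta) likelihood: zero unless theta covers every observation,
    otherwise 1/theta^n, which is decreasing in theta."""
    theta = np.asarray(theta, dtype=float)
    return np.where(theta >= data.max(), theta ** -len(data), 0.0)

# Scan candidate thetas; the likelihood peaks at the first grid point >= max(data).
thetas = np.linspace(0.1, 10.0, 1000)
L = likelihood(thetas, data)
best = thetas[np.argmax(L)]
print("max(data) =", data.max(), " argmax of L =", best)
```

Every $\theta$ below $\max(X_i)$ gets likelihood exactly zero, so the grid maximizer lands just above $\max(X_i)$, never at $\min(X_i)$.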