I am trying to think of a good motivation for maximum likelihood estimation.
Given a set of random variables $X_1, \ldots, X_n \sim f_X(x_1, \ldots, x_n |\theta)$, the maximum likelihood estimation problem finds the $\theta$ that maximizes $f_X(x_1, \ldots, x_n |\theta)$ given the observations $x_1, \ldots, x_n$.
But what is a good physical analogy for why we want to do this?
I was thinking of something like this: you have a classroom of students, you pick 10 of them, compute their average height, and assume that this is the average height for the entire class. However, there is no maximizing in my analogy, so it doesn't work.
Suppose you have a large box containing black and white balls, and the proportion $p$ of white balls is unknown. You randomly draw $5$ balls and $4$ of them turn out to be white. Under a binomial model, the probability of this event is $P_5(4)=\binom{5}{4}p^4(1-p)=5p^4(1-p)$, which we can evaluate for various values of $p$. Say,
if $p=0.1$ then $P_{5}(4)=5\cdot 0.1^4 \cdot 0.9 = 0.00045$,
if $p=0.3$ then $P_{5}(4)=0.02835$,
if $p=0.6$ then $P_{5}(4)=0.2592$,
if $p=0.8$ then $P_{5}(4)=0.4096$,
if $p=0.9$ then $P_{5}(4)=0.32805$,
if $p=0.99$ then $P_{5}(4)=0.0480298$ and so on.
Each possible proportion of white balls assigns a different probability to the sample we actually drew. What is a reasonable estimate of $p$? It is the value of $p$ that gives us the greatest probability of obtaining the sample already obtained. In this example it is $\hat p=4/5=0.8$. This is maximum likelihood estimation.
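The calculation above can be reproduced with a short sketch (the function name `likelihood` and the grid-search resolution are my own choices, not part of the original answer):

```python
import math

# Likelihood of observing k = 4 white balls in n = 5 draws under a
# binomial model: L(p) = C(5, 4) * p^4 * (1 - p)
def likelihood(p, n=5, k=4):
    return math.comb(n, k) * p**k * (1 - p)**(n - k)

# Reproduce the table of probabilities from the answer
for p in [0.1, 0.3, 0.6, 0.8, 0.9, 0.99]:
    print(f"p = {p}: P_5(4) = {likelihood(p):.8f}")

# Maximize over a fine grid: the maximum lands at p-hat = k/n = 0.8
grid = [i / 1000 for i in range(1001)]
p_hat = max(grid, key=likelihood)
print("MLE:", p_hat)  # 0.8
```

The grid search is only for illustration; here the maximizer can be found exactly by setting the derivative of $5p^4(1-p)$ to zero, which gives $\hat p = 4/5$.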