Maximum Likelihood and Sampling/Independence


If I viewed the data of an unknown weighted coin that was flipped N times and I saw H heads and T tails, I know that the maximum likelihood estimate for the probability of heads is just H/N, and likewise T/N for tails. This is based on the assumption that each flip/trial is independent of the others and that the same coin was used for every trial.

However, suppose I selected only the trials that resulted in heads, combined them into a different dataset, and used that for maximum likelihood. Each trial is still technically independent of the others, and so I would get that the maximum likelihood probability of heads is 100%. I know intuitively that this procedure is wrong, but I'm not sure exactly why.

In addition, what if I selected only the first half of all the trials and used that to compute the maximum likelihood estimate for the coin? Would that be an "ok" thing to do? I think it would just be computing the maximum likelihood estimate from data with half the sample size. Can someone explain intuitively what is going on here and clarify the difference between the two procedures?
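A small simulation makes the contrast concrete. The true heads probability of 0.7 and the sample size below are arbitrary choices for illustration; the point is how the three procedures behave on the same simulated flips:

```python
import random

random.seed(0)
p_true = 0.7  # assumed true heads probability, chosen for illustration
flips = [random.random() < p_true for _ in range(10_000)]  # True = heads

# Procedure 1: MLE on the full dataset -> H/N, close to p_true.
mle_full = sum(flips) / len(flips)

# Procedure 2: keep only the trials that came up heads, then estimate.
# The selection rule depends on the outcome, so this is always exactly 1.0.
heads_only = [f for f in flips if f]
mle_heads_only = sum(heads_only) / len(heads_only)

# Procedure 3: keep only the first half of the trials.
# The selection rule ignores the outcomes, so this is still a fair
# estimate of p_true, just noisier (smaller sample).
first_half = flips[: len(flips) // 2]
mle_half = sum(first_half) / len(first_half)

print(mle_full, mle_heads_only, mle_half)
```

Procedures 1 and 3 both hover near 0.7, while procedure 2 is pinned at 1.0 no matter what the coin is.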


1 Answer


MLE is a data-driven framework. The question it asks is: which probability distribution would "most likely" have produced the data observed? Independence just makes the calculations easier; things get a lot more complicated otherwise.

So all the MLE knows about the world is the data you feed into it. That is exactly why your first procedure fails: the heads-only dataset was selected based on the outcomes, so it is no longer a sample from the original coin-flipping process, and the MLE faithfully reports the only distribution consistent with what it was shown, namely "always heads." Your second procedure selects trials without looking at their outcomes, so the first half is still a valid sample from the same process, just a smaller one with a noisier estimate.

If you have reasons to believe that the answer you get is not the correct one, then that knowledge ("prior knowledge") should be used to "adjust" the likelihood. This adjusting is the heart of Bayesian calculations.
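As a sketch of that Bayesian "adjusting" for this coin problem: with a Beta(a, b) prior on the heads probability and a binomial likelihood, the posterior mean has a simple closed form. The function name and default prior below are illustrative choices, not from the answer:

```python
def posterior_mean(heads: int, tails: int, a: float = 1.0, b: float = 1.0) -> float:
    """Posterior mean of the heads probability under a Beta(a, b) prior.

    The Beta prior is conjugate to the binomial likelihood, so the
    posterior is Beta(heads + a, tails + b) and its mean is
    (heads + a) / (heads + tails + a + b).  With a = b = 1 (uniform
    prior) this is Laplace's rule of succession; as the number of
    flips grows, it converges to the MLE heads / (heads + tails).
    """
    return (heads + a) / (heads + tails + a + b)
```

For example, `posterior_mean(7, 3)` gives 8/12 ≈ 0.667 rather than the MLE of 0.7: the uniform prior pulls the estimate toward 1/2, and with no data at all the estimate is exactly the prior mean, 0.5.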