The statement of the maximum likelihood estimation (MLE) principle does not say whether to choose a local maximum or a global maximum. From the examples given in various textbooks and lecture notes (in my very limited reading so far), it seems that we should choose the global maximum of the likelihood function for inference. Is this correct?
The reason I am asking is that I am dealing with some data whose likelihood seems to have several maxima. The parameter space is three-dimensional, so I have no intuition about the situation. In this case, how do I estimate the parameters properly? Do I just look for the maximum in a small part of the parameter space? (The bounds could be established through guesses based on the data, for example.)
Many, but not all, of the likelihood functions we usually encounter have a strictly concave logarithm (i.e., they are log-concave). Consequently, they have a unique stationary point, and that is the global maximum. This doesn't mean there cannot be cases where the likelihood has multiple local maxima. In MLE you always look for the global maximum. Keep in mind, however, that MLE is not necessarily a good estimator for every problem, and there are common and interesting cases where it produces an estimate with large error.
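One common practical approach when the likelihood is multimodal is multi-start local optimization: run a local optimizer from many starting points and keep the best result. Here is a minimal sketch in Python using SciPy, with a hypothetical one-parameter Cauchy location model (a classic example of a multimodal likelihood when the observations form well-separated clusters); the data values are made up for illustration.

```python
import numpy as np
from scipy.optimize import minimize
from scipy.stats import cauchy

# Hypothetical data: two well-separated clusters make the Cauchy
# location likelihood multimodal (a textbook example of this issue).
data = np.array([-4.0, -3.5, 4.0, 4.3, 4.6])

def neg_log_lik(theta):
    # Negative log-likelihood of a Cauchy(location=theta, scale=1) model;
    # minimizing this is equivalent to maximizing the likelihood.
    return -cauchy.logpdf(data, loc=theta[0]).sum()

# Multi-start local optimization: launch a local optimizer from a grid
# of starting points spread over a plausible parameter range.
starts = np.linspace(-10.0, 10.0, 21)
results = [minimize(neg_log_lik, x0=[s], method="Nelder-Mead") for s in starts]

# Keep the run with the smallest negative log-likelihood as the
# candidate global MLE.
best = min(results, key=lambda r: r.fun)

# Distinct converged values (rounded) reveal how many local maxima
# the different starts found.
optima = sorted({round(r.x[0], 1) for r in results if r.success})
print("local optima found:", optima)
print("global MLE candidate:", best.x[0])
```

This only gives a candidate for the global maximum: if every start lands in the same basin, a better optimum elsewhere can still be missed, so the grid of starting points should cover any region the data makes plausible (as you suggest, bounds guessed from the data are a reasonable way to choose that region).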