In maximum likelihood estimation, why is it hard to directly optimize the likelihood function?

In Chapter 7 of Boyd, the text reads:

[image: excerpt from Boyd, Chapter 7]

I am wondering why we do not maximize the likelihood function directly and instead construct the log-likelihood function.

What is the fundamental reason that makes the product of densities harder to maximize? Is it because it is difficult to test convexity, to compute the gradient, or something else?

There is 1 best solution below

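In small-$n$ problems, optimizing the likelihood directly may be tractable, and in practice it is sometimes done. However, optimizing a likelihood that is a product of many terms (for instance $n \sim 10^8$) is computationally difficult: by the product rule, its derivative is a sum of $n$ terms, each of which is itself a product of $n-1$ densities and one derivative. Because the logarithm is monotonically increasing, the log-likelihood has the same maximizer, and it is much simpler to optimize: it is a sum of these terms, so its derivative is just the sum of the individual derivatives.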
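
As a rough illustration, here is a minimal sketch that compares the two objectives. It assumes a unit-variance Gaussian model with unknown mean; the sample size, the seed, and the function names (`likelihood`, `log_likelihood`, `log_likelihood_grad`) are all made up for the example. It also shows a second practical issue that goes beyond the derivative argument above: in floating point, the raw product underflows to zero well before $n$ gets anywhere near $10^8$.

```python
import numpy as np

# Assumed setup for illustration only: synthetic data from N(2, 1).
rng = np.random.default_rng(0)
n = 5000
x = rng.normal(loc=2.0, scale=1.0, size=n)


def density(x, mu):
    """Unit-variance Gaussian density evaluated at each sample."""
    return np.exp(-0.5 * (x - mu) ** 2) / np.sqrt(2.0 * np.pi)


def likelihood(mu):
    """Raw likelihood: a product of n densities, each below 0.4."""
    return np.prod(density(x, mu))


def log_likelihood(mu):
    """Log-likelihood: a sum of n log-densities."""
    return np.sum(-0.5 * (x - mu) ** 2 - 0.5 * np.log(2.0 * np.pi))


def log_likelihood_grad(mu):
    """Derivative of the log-likelihood in mu: a plain sum of n terms."""
    return np.sum(x - mu)


# For this model the maximizer is known in closed form (the sample mean).
mu_hat = x.mean()

print("raw likelihood at mu_hat:    ", likelihood(mu_hat))
print("log-likelihood at mu_hat:    ", log_likelihood(mu_hat))
print("log-likelihood grad at mu_hat:", log_likelihood_grad(mu_hat))
```

The raw likelihood is numerically zero everywhere near the optimum, so an optimizer sees a flat function with no usable gradient, while the log-likelihood stays finite and its gradient vanishes at the sample mean.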