Does partial maximization/minimization always give the same result as global optimization?


Suppose I want to optimize a function $f(x)$, where $x$ is a vector. $x$ can be separated into two sub-vectors, $x=(x_{1},x_{2})$.

I can first partially optimize over $x_{1}$, treating $x_{2}$ as constant, so that I get the optimal $x^{*}_{1}$ as a function of $x_{2}$. Then I can optimize the objective function as a function of $x_{2}$ alone, i.e. $f(x^{*}_{1}(x_{2}),x_{2})$.

My question is: does this always give the same result as directly optimizing $f(x)$ over $x$? Can this be proved?
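As a concrete sanity check on what I mean, here is a minimal sketch (the toy quadratic objective and the grid-search comparison are just illustrative choices I made up, not part of any particular application):

```python
# Toy objective: f(x1, x2) = (x1 - x2)^2 + (x2 - 3)^2,
# whose joint minimizer is (3, 3).
def f(x1, x2):
    return (x1 - x2) ** 2 + (x2 - 3) ** 2

# Step 1: partial minimization over x1 with x2 held fixed.
# For this quadratic the inner minimizer is x1*(x2) = x2 in closed form.
def x1_star(x2):
    return x2

grid = [i / 10 for i in range(-50, 51)]  # crude grid on [-5, 5]

# Step 2: minimize the profiled objective g(x2) = f(x1*(x2), x2) over x2.
x2_hat = min(grid, key=lambda x2: f(x1_star(x2), x2))
profiled_opt = (x1_star(x2_hat), x2_hat)

# Direct joint minimization over the same grid, for comparison.
joint_opt = min(((a, b) for a in grid for b in grid), key=lambda p: f(*p))

print(profiled_opt, joint_opt)  # both should be (3.0, 3.0)
```

In this example the two procedures agree, but of course one example proves nothing, which is why I am asking about the general case.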

The question arose from maximum likelihood estimation for the normal distribution, where $\mu$ and $\sigma$ are both unknown. We always calculate $\hat{\mu}$ first, then express $\hat{\sigma}$ as a function of $\hat{\mu}$. I want to know whether this procedure can be applied to any kind of problem.
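For the normal-distribution case specifically, here is a small numerical check I tried (the sample size, seed, and the grid placed around the profiled solution are arbitrary choices of mine):

```python
import math
import random

random.seed(0)
data = [random.gauss(2.0, 1.5) for _ in range(200)]  # synthetic sample
n = len(data)

def log_lik(mu, sigma):
    """Normal log-likelihood of the sample at (mu, sigma)."""
    return sum(
        -math.log(sigma) - 0.5 * math.log(2 * math.pi)
        - (x - mu) ** 2 / (2 * sigma ** 2)
        for x in data
    )

# Profiling: for any fixed sigma, the likelihood is maximized in mu at the
# sample mean, so mu_hat can be computed first; sigma_hat then follows.
mu_hat = sum(data) / n
sigma_hat = math.sqrt(sum((x - mu_hat) ** 2 for x in data) / n)

# Brute-force joint maximization on a grid around the profiled solution.
mus = [mu_hat - 0.5 + i / 100 for i in range(101)]
sigmas = [sigma_hat - 0.5 + i / 100 for i in range(101)]
mu_grid, sigma_grid = max(
    ((m, s) for m in mus for s in sigmas), key=lambda p: log_lik(*p)
)
```

The grid maximizer lands on $(\hat{\mu},\hat{\sigma})$, matching the usual two-step calculation, but again this only confirms the one case I already know works.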