Are both objective functions really equivalent?

Suppose that $X_1, X_2, \dots, X_n$ form an iid sample drawn from a probability distribution with unknown density $\exp(g)$, so that $g$ is the log-density. Theorem 3.1 of [1] shows that minimizing $$A_0(g) = -n^{-1}\sum_{i=1}^ng(X_i)$$ subject to $$\int\exp(g(u))\,\mathrm du = 1$$ is equivalent to minimizing $$A(g) = -n^{-1}\sum_{i=1}^ng(X_i) + \int\exp(g(u))\,\mathrm du.$$ Both objective functions are minimized with respect to $g$ over a suitable class of functions $\mathcal S$.

To prove the claim, an auxiliary function $g^* = g - \log\left(\int\exp(g(u))\,\mathrm du\right)$ is constructed, which by construction satisfies $\int\exp(g^*(u))\,\mathrm du = 1$. Then, since $$A(g^*) = A(g) + 1 - \int\exp(g(u))\,\mathrm du+ \log\left(\int\exp(g(u))\,\mathrm du\right),$$ it follows that $A(g^*)\leq A(g)$ by the inequality $1 + x\leq\exp(x)$ applied to $x = \log\left(\int\exp(g(u))\,\mathrm du\right)$. The proof then concludes that if $\hat g$ minimizes $A(g)$, it necessarily has to satisfy $$\int\exp(\hat g(u))\,\mathrm du = 1,$$ which makes both objective functions equivalent.
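
For reference, the displayed identity can be checked by direct substitution. The shorthand $I = \int\exp(g(u))\,\mathrm du$ is introduced here only for brevity and is not notation taken from [1]; it is assumed that $0 < I < \infty$.

$$\begin{aligned}
A(g^*) &= -n^{-1}\sum_{i=1}^n\bigl(g(X_i) - \log I\bigr) + \int\exp\bigl(g(u) - \log I\bigr)\,\mathrm du \\
&= -n^{-1}\sum_{i=1}^n g(X_i) + \log I + \frac{1}{I}\int\exp(g(u))\,\mathrm du \\
&= A(g) - I + \log I + 1.
\end{aligned}$$

So $A(g) - A(g^*) = I - 1 - \log I$, which is nonnegative for every $I > 0$ and vanishes only at $I = 1$.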

I don't understand the last step. Why does it suffice to consider this auxiliary function in order to prove the claim? I tried to use a Lagrangian approach for the first objective and took $\mathcal S$ to be the set of all polynomials. Both objectives led to different results. So is this theorem even true?

[1] https://projecteuclid.org/journals/annals-of-statistics/volume-10/issue-3/On-the-Estimation-of-a-Probability-Density-Function-by-the/10.1214/aos/1176345872.full

Best answer

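$\hat{g}$ must satisfy $A((\hat{g})^*)=A(\hat{g})$ because we know $A((\hat{g})^*) \le A(\hat{g})$ and because we know $\hat{g}$ minimizes $A$. (This uses $(\hat{g})^* \in \mathcal S$, which holds whenever $\mathcal S$ is closed under adding constants, as the set of all polynomials is.)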

If you know $A$ has a unique minimizer, then you can conclude $(\hat{g})^*=\hat{g}$, in which case $\int e^{\hat{g}(u)} \, du = 1$.

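Otherwise, the equality $A((\hat{g})^*)=A(\hat{g})$ by itself only tells you that "there exists a minimizer of $A$, call it $\tilde{g}$, that satisfies $\int e^{\tilde{g}(u)} \, du = 1$," which you can obtain by taking any minimizer $\hat{g}$ of $A$ and considering $\tilde{g} := (\hat{g})^*$.

You can say more by using the identity from the question. With $t = \int e^{\hat{g}(u)} \, du$, it gives $A(\hat{g}) - A((\hat{g})^*) = t - 1 - \log t$, which is strictly positive unless $t = 1$. So $A((\hat{g})^*)=A(\hat{g})$ forces $\int e^{\hat{g}(u)} \, du = 1$ for every minimizer, with no uniqueness assumption needed.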
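
If it helps, here is a small numerical sanity check of the argument. It is only a sketch: the sample values, the quadratic `g`, the integration range $[-10, 10]$ and the midpoint rule are all arbitrary choices of mine, not anything taken from [1]. It compares $A(g)$ with $A(g^*)$ and checks that the gap equals $t - 1 - \log t$ with $t = \int e^{g(u)} \, du$.

```python
import math

def integral_exp(g, lo=-10.0, hi=10.0, steps=4000):
    """Midpoint-rule approximation of the integral of exp(g(u)) over [lo, hi]."""
    width = (hi - lo) / steps
    return sum(math.exp(g(lo + (k + 0.5) * width)) for k in range(steps)) * width

def A(g, sample):
    """Unconstrained objective: -mean(g(X_i)) + integral of exp(g)."""
    return -sum(g(x) for x in sample) / len(sample) + integral_exp(g)

def normalize(g):
    """Return g* = g - log(integral of exp(g)), so that exp(g*) integrates to 1."""
    log_mass = math.log(integral_exp(g))
    return lambda u: g(u) - log_mass

# Made-up data and an arbitrary, unnormalized quadratic log-density (assumptions).
sample = [-1.3, -0.4, 0.1, 0.7, 1.9]
g = lambda u: 0.5 - 0.3 * u - 0.8 * u * u
g_star = normalize(g)

mass = integral_exp(g)                    # t = integral of exp(g), not equal to 1 here
gap = A(g, sample) - A(g_star, sample)    # should equal t - 1 - log(t)
print("mass:", mass)
print("A(g) - A(g*):", gap, " t - 1 - log t:", mass - 1 - math.log(mass))

# Shifting g* by a constant c changes the objective by exp(c) - 1 - c >= 0.
for c in (-0.5, -0.1, 0.0, 0.1, 0.5):
    shifted = lambda u, c=c: g_star(u) + c
    excess = A(shifted, sample) - A(g_star, sample)
    print(f"c = {c:+.1f}   A(g* + c) - A(g*) = {excess:.6f}")
```

The loop at the end illustrates why the normalization comes for free: along the line $g^* + c$ the objective $A$ is minimized only at $c = 0$, so a minimizer of $A$ cannot have total mass different from 1.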