solving a constrained 2nd order Euler Lagrange equation for a pair of probability distributions

Question

solving a constrained 2nd order Euler Lagrange equation for a pair of probability distributions

754 Views Asked by Bumbble Comm At 27 Mar 2026 - 9:09

I am interested in the following constrained optimization problem:

$$\text{Minimize }J_p(F, G) = \int_0^1 \left(\frac{F}{f} + \frac{1-G}{g}\right)^{-1} dx.$$

over a pair of cumulative distributions functions $F$ and $G$, with $f := F'$ and $g := G'$, subject to the following constraints:

Their Jensen Shannon divergence is given by a fixed constant, that is, $$\text{JSD}(F, G) := -\frac{1}{2}\int_0^1 f \log \frac{2f}{f+g} + g \log \frac{2g}{f+g} dx = C.$$
The Radon-Nikodym derivative is non-decreasing $$\left(\frac{g}{f}\right)' \ge 0.$$
$F, G$ are indeed cumulative distribution functions $$f, g \geq 0; \qquad 0 \le F, G \le 1.$$

This seems like a pretty standard calculus of variation problem, and I indeed made some progress deriving its associated Euler Lagrange equation, which is second order due to the inequality constraint 2.

First I let $H = 1-G$, then the objective function becomes $J_p(F, H) = \int_0^1 \left(\frac{F}{f} - \frac{H}{h}\right)^{-1} dx$. The constraints also change accordingly:

$\frac{1}{2} \int_0^1 \left(f \log \frac{2f}{f-h} - h \log\frac{2h}{h-f} \right) dx =C$.
$\left(\frac{h}{f}\right)' \le 0$.
$f, -h \ge 0; \qquad 0 \le F, -H \le 1$.

One can then take the functional derivative (with respect to $F$ and $H$ respectively) of the Lagrangian $$ \mathcal{L} := \int_0^1 [\left(\frac{F(x)}{f(x)} - \frac{H(x)}{h(x)}\right)^{-1} + \frac{\lambda_1}{2} \left(f(x) \log \frac{2f(x)}{f(x)-h(x)} - h(x) \log\frac{2h(x)}{h(x)-f(x)} \right) - C] dx \\ - \frac{\lambda_2(x) (h'(x) f(x) - h(x)f'(x))}{f(x)^2} + \lambda_3(x)f(x) - \lambda_4(x) h(x) + \lambda_5(x) F(x) + \lambda_6(x) (1 - F(x)) + \lambda_7(x) H(x) + \lambda_8(x) (1 - H(x)).$$

Let $\mathcal{D}_F := \frac{\partial}{\partial F} - \frac{d}{dx} \frac{\partial}{\partial f} + \frac{d^2}{dx^2} \frac{\partial}{\partial f'}$ be the Euler Lagrange operator for $F$ and similarly define $\mathcal{D}_H$.

[Update: below computation is flawed; see my self-answer below, which relies on mathematica more exclusively, without any hand derivation. The result makes a lot of sense when plotted]

I was able to simplify the resulting system of 2 ODEs significantly, thanks to mathematica, as follows:

$$ \mathcal{D}_F \mathcal{L} = [(f+h) B - A] U + [(h' \lambda_2)' + (h \lambda_2)''] + \lambda_3' + \lambda_5' - \lambda_6' = 0 \\ \mathcal{D}_H \mathcal{L} = [(f-h) B + A] V - [(f' \lambda_2)' + (f \lambda_2)''] - \lambda_4' + \lambda_7' - \lambda_8' = 0,$$

where $A := \lambda_1 (fh' - hf')(Hf - Fh)^3$, $B := 4fh[(Hf - Fh)fh + FH(fh' - hf')]$, $U, V$ are some rational functions of $F, H, f, h$.

Here I took the liberty of rewriting constraint 2 as $h' f - h f' \le 0$, since the denominator is assumed to be nonnegative; I am not completely sure if this is correct when $f(x) = 0$, but including the denominator doesn't seem to make much difference.

Now with the complementary slackness condition on constraint 2, we know that whenever $h'f - hf' \neq 0$, $\lambda_2$ has to be $0$, and vice versa. Similarly, $f \lambda_3 \equiv 0 \equiv h \lambda_4$.

One trivial critical solution of the original problem is $F = G$, which corresponds to $h'f - hf' \equiv 0$. This gives the global maximum, and satisfies only the JSD constraint with $C = 0$. I am interested in the nontrivial minimum solutions.

Now here is my trouble. As soon as I look at a point $x \in [0, 1]$ where $f, h, F, H, (1-F), (1-H) \neq 0$ (which implies $\lambda_i' = 0$, for $3 \le i \le 8$, if we assume they are $C^1$ at $x$), and $h'f - h f' \neq 0$, I am forced to have $Hf - Fh = 0$, which in turn forces me to have $h'f - hf' = 0$! So somehow I cannot escape from the condition that leads to the trivial solution. Where did I mess up? Does this problem require digging deeper into solutions with isolated points of non-differentiability?

Original Q&A

There are 2 best solutions below

Bumbble Comm On 11 May 2017 - 6:33

Looks like I made a huge computational mistake. By letting mathematica compute the functional derivatives (not just the ordinary ones), the Euler Lagrange equations simplify to a single 2nd order ODE (so the functional derivative with respect to F and H have identical numerators!):

$$a^3 b \lambda_1 = 4(f-h)h(abF - bF^2 h + af^2 h),$$ where $a = Fh - fH$ and $b = fh' - f'h$. See this post, where I specialized to $F(x) = x$. I am ignoring the other inequality constraints, since I am looking for interior solutions, where the corresponding Lagrange multipliers would be $0$. The equation above seems to admit no closed form solutions, but mathematica numerical solutions look very sensible in terms of the original constrained minimization task when plugged in.

Update (some mathematica computations, where I set all the inequality constraint multipliers to zero since I am interested in interior solutions):

f[x_] := F'[x]; h[x_] := H'[x];

Lp[x_] := 1/(F[x]/f[x] - H[x]/h[x]);

Lj[x_] := -1/ 2 (f[x] Log[2 f[x]/(f[x] - h[x])] - h[x] Log[2 h[x]/(h[x] - f[x])]);

dF[x_] = (D[#, F[x]] - D[D[#, f[x]], x] + D[D[#, f'[x]], {x, 2}]) &

dH[x_] = (D[#, H[x]] - D[D[#, h[x]], x] + D[D[#, h'[x]], {x, 2}]) &

L[x_] := Lp[x] + r Lj[x] - r2[x] D[h[x]/f[x], x] + r3[x] f[x] -
r4[x] h[x] + r5[x] F[x] + r6[x] (1 - F[x]) + r7[x] H[x] + r8[x] (1 - H[x]);

qF[x_] := Simplify[dF[x][L[x]], {a == F[x] h[x] - H[x] f[x], b == f[x] D[h[x], x] - h[x] D[f[x], x]}];

qH[x_] := Simplify[dH[x][L[x]], {a == F[x] h[x] - H[x] f[x], b == f[x] D[h[x], x] - h[x] D[f[x], x]}];

numH[x_] := FullSimplify[ Numerator[ qH[x] /. {r2[x] -> 0, r3[x] -> 0, r4[x] -> 0, r5[x] -> 0, r6[x] -> 0, r7[x] -> 0, r8[x] -> 0, r2'[x] -> 0, r3'[x] -> 0, r4'[x] -> 0, r2''[x] -> 0}]];

numF[x_] := FullSimplify[ Numerator[ qF[x] /. {r2[x] -> 0, r3[x] -> 0, r4[x] -> 0, r5[x] -> 0, r6[x] -> 0, r7[x] -> 0, r8[x] -> 0, r2'[x] -> 0, r3'[x] -> 0, r4'[x] -> 0, r2''[x] -> 0}]];

numF[x]=$F'(x)^2 \left(a^3 b r-4 H'(x) \left(F'(x)-H'(x)\right) \left(a b F(x)+a F'(x)^2 H'(x)-b F(x)^2 H'(x)\right)\right).$

As mentioned before, numH[x] is actually a multiple of numF[x]. This means curiously that whenever $H$ is a critical point for a fixed $F$, that $F$ also happens be a critical point for the said $H$! I can't intuit at the moment why this must be the case, but I think there is an easy way to see this.

Now setting $F(x) = x$, I can solve for $H(x)$:

sol[r_] := NDSolve[{H''[ x] == (4 (1 - H'[x]) H'[ x] (x H'[x]^2 - H[x] H'[x]))/(r (x H'[x] - H[x])^3 + 4 (1 - H'[x]) H'[x] H[x] x), H[0] == 1, H1 == 0}, H, {x, 0, 1}];

This is the solution curves ($H$, $H'$, and $H''$) for $r:=\lambda = -1.43$:

$Solution for $\lambda = -1.43$$

The solution will have a singularity in its second derivative as $\lambda$ gets larger.

Here are the values of the original functionals we are trying to minimize, plotted against the Jensen Shannon divergence constraint. I have no idea how to plot the curve to the left of 0.4 because of the second derivative singularity mentioned above.

Finally I would like to prove that the curve above computed by mathematica indeed gives the global minimum solutions. I think I need to show my original functional is convex, not only among differentiable/smooth solutions. It can only be convex under the constraint $(h/f)' \leq 0$, and not more generally.

**Bumbble Comm** · Accepted Answer

Some notes

The problem requires the correct using of constrains. If there were no integrals both the goal function and conditions in Lagrangian, this would be an ordinary problem for the conditional extremum and the task function obtained would correspond to the requirement of simultaneous observance of contradictory conditions of the form $$F(x) = 0,\quad F(x) = 1$$ and similar with $G(x).$

It is known that the greatest value of a function is achieved either at one of its stationary points or at the boundary of the region. The same approach should be used in this case.

Thus, the constraints of the problem should not be taken into account in the search for a global extremum, but each of them must have a separate "boundary" problem for the extremum. And, probably, we should start with the task of a global extremum.

Simplified statement of the problem

Simplified statement of the problem is nesessary for mathematical tools checking.

Let us maximize $J_p(F, G)$ on condition $JSD(F, G) = C,\,$ or $$\int\limits_0^1\left((f+g)\log (f+g) - f\log 2f - g\log 2g\right)\,dx = 2C.$$

That gives the Lagrangian in the form of

$\mathcal L(F, G) = \int_0^1 {fg\over Fg + (1-G)f}\,dx + \lambda\left(\int\limits_0^1\left((f+g)\log (f+g) - f\log 2f - g\log 2g\right)\,dx - 2C\right).$

What does mathematica give in this case?
And is it possible to set the constraints of the task as limitations, and not as mandatory conditions?

Getting of equations

To get the equations, let us equal the variations ${\delta \mathcal L(F, G) \over \delta F}, {\delta \mathcal L(F, G) \over \delta G}$ to zero: $$\begin{cases} {-fg^2\over \left(Fg + (1-G)f\right)^2} + {d\over dx}{g\left(Fg + (1-G)f\right) - fg(1-G)\over\left(Fg + (1-G)f\right)^2} + \lambda{d\over dx}\left(\log(f+g)+1-\log2f - 1\right) = 0\\ {f^2g\over \left(Fg + (1-G)f\right)^2} + {d\over dx}{f\left(Fg + (1-G)f\right) - fgF\over\left(Fg + (1-G)f\right)^2} + \lambda{d\over dx}\left(\log(f+g)+1-\log2g - 1\right) = 0, \end{cases}$$ $$\begin{cases} {-fg^2\over \left(Fg + (1-G)f\right)^2} + {d\over dx}{Fg^2\over\left(Fg + (1-G)f\right)^2} + \lambda{d\over dx}\left(\log(f+g)-\log2f\right) = 0\\ {f^2g\over \left(Fg + (1-G)f\right)^2} + {d\over dx}{(1-G)f^2\over\left(Fg + (1-G)f\right)^2} + \lambda{d\over dx}\left(\log(f+g)-\log2g\right) = 0, \end{cases}$$ $$\begin{cases} {-fg^2\over \left(Fg + (1-G)f\right)^2} + {fg^2+2Fgg'\over\left(Fg + (1-G)f\right)^2} - {2Fg^2\left(fg+Fg'-gf+(1-G)f'\right)\over\left(Fg + (1-G)f\right)^3} + \lambda\left({f'+ g'\over f+g}-{f'\over f}\right) = 0\\ {f^2g\over \left(Fg + (1-G)f\right)^2} + {-gf^2+2(1-G)ff'\over\left(Fg + (1-G)f\right)^2} - {2(1-G)f^2\left(fg + Fg' - gf + (1-G)f'\right)\over\left(Fg + (1-G)f\right)^3} + \lambda\left({f'+ g'\over f+g}-{g'\over g}\right) = 0, \end{cases}$$ $$\begin{cases} {2Fgg'\left(Fg + (1-G)f\right)\over\left(Fg + (1-G)f\right)^3} - {2Fg^2\left(Fg'+(1-G)f'\right)\over\left(Fg + (1-G)f\right)^3} + \lambda\left({f'+ g'\over f+g}-{f'\over f}\right) = 0\\ {2(1-G)ff'\left(Fg + (1-G)f\right)\over\left(Fg + (1-G)f\right)^3} - {2(1-G)f^2\left(Fg' + (1-G)f'\right)\over\left(Fg + (1-G)f\right)^3} + \lambda\left({f'+ g'\over f+g}-{g'\over g}\right) = 0, \end{cases}$$ $$\begin{cases} {2Fgg'(1-G)f-2Fg^2(1-G)f'\over\left(Fg + (1-G)f\right)^3} + \lambda{(f'+g')f-f'(f+g)\over (f+g)f} = 0\\ {2(1-G)ff'Fg - 2(1-G)f^2Fg'\over\left(Fg + (1-G)f\right)^3} + \lambda{(f'+g')g - g'(f+g)\over(f+g)g} = 0, \end{cases}$$ $$\begin{cases} {2F(1-G)g(fg'-f'g)\over\left(Fg + (1-G)f\right)^3} + \lambda{fg'-f'g\over (f+g)f} = 0\\ {2F(1-G)f(f'g-fg')\over\left(Fg + (1-G)f\right)^3} + \lambda{f'g-fg'\over (f+g)g} = 0. \end{cases}$$ Easy to see that the equations are proportional. This means that the system is overdefined and that the parameter $\lambda$ is arbitrary for the simplified task.

On the other hand, the task has, at least, the next solutions $$ F(x) = 0,\quad G(x) - \text{arbitrary};\\ G(x) = 1,\quad F(x) - \text{arbitrary};\\ F(x) = C_1,\quad G(x) = C_2;\\ F(x) = kG(x)+b. $$

solving a constrained 2nd order Euler Lagrange equation for a pair of probability distributions

There are 2 best solutions below

Related Questions in PROBABILITY-DISTRIBUTIONS

Related Questions in CALCULUS-OF-VARIATIONS

Related Questions in OPTIMAL-CONTROL

Related Questions in EULER-LAGRANGE-EQUATION

Related Questions in KARUSH-KUHN-TUCKER

Trending Questions

Popular # Hahtags

Popular Questions