How many real solutions does $\lambda_1 e^{y - \lambda_1 e^y} (1 - \lambda_1 e^y ) + \lambda_2 e^{y - \lambda_2 e^y} (1 - \lambda_2 e^y ) = 0$ have?


I am trying to work out for which $\lambda_1, \lambda_2 > 0$ the function $f(y) = \lambda_1 e^{y-\lambda_1 e^y} + \lambda_2 e^{y-\lambda_2 e^y}$ is unimodal.

Experimentally it seems to be unimodal when $\lambda_1 < \lambda_2$ and $\frac{\lambda_2}{\lambda_1} < 7.5$.

To work this out I started with:

$$\frac{d}{dy} \left(\lambda_1 e^{y-\lambda_1 e^y} + \lambda_2 e^{y-\lambda_2 e^y} \right) = \lambda_1 e^{y - \lambda_1 e^y} (1 - \lambda_1 e^y ) + \lambda_2 e^{y - \lambda_2 e^y} (1 - \lambda_2 e^y )$$

It seems we then need to check when

$$\lambda_1 e^{y - \lambda_1 e^y} (1 - \lambda_1 e^y ) + \lambda_2 e^{y - \lambda_2 e^y} (1 - \lambda_2 e^y ) = 0$$

has more than one solution when solved for $y \in \mathbb{R}$. How can we determine the conditions under which it has different numbers of solutions?

Added:

Substituting $z = e^y$ and dividing by $e^{y-1}$, we want to determine how many solutions

$$ \lambda_1 e^{1-\lambda_1 z}(1-\lambda_1 z) +\lambda_2e^{1-\lambda_2 z}(1-\lambda_2 z) = 0 $$

has with $z > 0$.

Examples:

Example $\lambda_1 = 1, \lambda_2 = 7$ with only one mode (code in python):

import matplotlib.pyplot as plt
import numpy as np

def pdf_func(y, params):
    # Sum of lambda * exp(y - lambda * e^y) over all rate parameters
    return sum(lambd * np.exp(y - lambd * np.exp(y)) for lambd in params)

params = [1, 7]
xs = np.linspace(-10, 10, 1000)
plt.plot(xs, pdf_func(xs, params))  # pdf_func works elementwise on arrays
plt.show()

[Figure: plot of $f(y)$ for $\lambda_1 = 1, \lambda_2 = 7$, showing a single mode]

Example $\lambda_1 = 1, \lambda_2 = 50$ with two modes:

[Figure: plot of $f(y)$ for $\lambda_1 = 1, \lambda_2 = 50$, showing two modes]

Questions

  • How can one prove (assuming it is true) that the number of local maxima of $f(y)$ is either 1 or 2, with no other possibilities?
  • Is it true that for $\lambda_2 > \lambda_1 > 0$, there exists a threshold $c$ so that if $\frac{\lambda_2}{\lambda_1} < c$ then $f(y)$ is unimodal and if not it has two local maxima? (My guess is that the answer is yes and this threshold is around $7.5$.)
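The experiment behind the guessed threshold can be sketched as follows. This is only a grid-based estimate, not a proof: the helper names `f_prime` and `count_local_maxima`, the grid range and the grid size are my own choices, and a very fine feature could in principle be missed.

```python
import math

def f_prime(y, lambdas):
    """Derivative of f(y) = sum_i lam_i * exp(y - lam_i * e^y)."""
    z = math.exp(y)
    return sum(lam * math.exp(y - lam * z) * (1.0 - lam * z) for lam in lambdas)

def count_local_maxima(lambdas, y_min=-10.0, y_max=3.0, n=20001):
    """Count the + to - sign changes of f' on a uniform grid (an estimate)."""
    count = 0
    prev = f_prime(y_min, lambdas)
    for i in range(1, n):
        y = y_min + (y_max - y_min) * i / (n - 1)
        cur = f_prime(y, lambdas)
        if prev > 0.0 and cur <= 0.0:
            count += 1
        prev = cur
    return count

# Only the ratio lambda_2 / lambda_1 matters, so lambda_1 = 1 is fixed here.
for ratio in (2.0, 7.0, 7.5, 7.6, 8.0, 50.0):
    print(ratio, count_local_maxima([1.0, ratio]))
```

Fixing $\lambda_1 = 1$ loses nothing, because rescaling both rates by a common factor only shifts $y$.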

There are 3 best solutions below


Consider

$$ f(z) = \lambda _1 z e^{-\lambda _1 z}+\lambda _2 z e^{-\lambda _2 z} $$

with $z = e^y$

and now

$$ f'(z) = -\lambda _1 e^{-\lambda _1 z} \left(\lambda _1 z-1\right)-\lambda _2 e^{-\lambda _2 z} \left(\lambda _2 z-1\right)=0 $$

or

$$ \frac{\lambda_2^2}{\lambda_1^2}e^{(\lambda_1-\lambda_2)z}+\frac{z-\frac{1}{\lambda_2}}{z-\frac{1}{\lambda_1}} = 0 $$

here

$$ \frac{\lambda_2^2}{\lambda_1^2}e^{(\lambda_1-\lambda_2)z}\gt 0 $$

so the two terms can only cancel where the following ratio is negative:

$$ \frac{z-\frac{1}{\lambda_2}}{z-\frac{1}{\lambda_1}} $$

so every zero lies strictly between the two values $\frac{1}{\lambda_i}$:

$$ \min_i\frac{1}{\lambda_i}\lt z \lt \max_i\frac{1}{\lambda_i} $$

NOTE

As an example, for $\lambda_1 = 1, \lambda_2 = 50$ we have the sign change plot

[Figure: sign-change plot for $\lambda_1 = 1, \lambda_2 = 50$]

The solutions should therefore be sought in the interval where the ratio is negative, which is $0.02\lt z \lt 1$. The roots here are approximately $z =\{0.0211011,0.113856,1\}$; the last one lies just below $1$.
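As a quick numerical check of these three roots, here is a minimal bisection sketch. The bracketing intervals are my own choice, read off from the sign pattern of $f'$, and `df` and `bisect` are illustrative helper names.

```python
import math

def df(z, lam1=1.0, lam2=50.0):
    """f'(z) for f(z) = lam1*z*exp(-lam1*z) + lam2*z*exp(-lam2*z)."""
    return (lam1 * math.exp(-lam1 * z) * (1.0 - lam1 * z)
            + lam2 * math.exp(-lam2 * z) * (1.0 - lam2 * z))

def bisect(func, lo, hi, iters=200):
    """Plain bisection; assumes func(lo) and func(hi) have opposite signs."""
    lo_positive = func(lo) > 0.0
    assert lo_positive != (func(hi) > 0.0), "interval does not bracket a root"
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if (func(mid) > 0.0) == lo_positive:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

# Brackets chosen by hand so that f' changes sign exactly once in each.
brackets = [(0.02, 0.05), (0.05, 0.5), (0.5, 1.5)]
roots = [bisect(df, lo, hi) for lo, hi in brackets]
print(roots)
```

The third root differs from $1$ only by an amount far below double precision, which is why it is reported as $1$.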

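Now set $\lambda_2 = \zeta\lambda_1$. Up to an overall sign, the condition $f'(z) = 0$ becomes

$$\lambda _1 \zeta e^{-\lambda _1 z \zeta } \left(\lambda _1 z \zeta -1\right)+\lambda _1 e^{-\lambda _1 z} \left(\lambda _1 z-1\right) = 0,$$

or, dividing by $\lambda_1 e^{-\lambda_1 z}$ and calling $y = \lambda_1 z, x = \zeta$ (this $y$ is a rescaled $z$, not the $y$ of the question), the condition $g(x,y) = 0$ with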

$$ g(x,y) = x(x y -1)e^{-y(x-1)}+y - 1 $$

or also

$$ x y +W\left(\frac{(y-1)e^{1-y}}{x}\right) - 1 = 0 $$

Here $W(\cdot)$ is the Lambert function.

The curve $g(x,y) = 0$ is plotted below.

[Figure: plot of the curve $g(x,y) = 0$]

As can be observed, there is roughly one root for $0\le x \le 7.5$ and there are three roots for $x \gt 7.5$.

The transition from one root to three is located as follows. Solving $g(x,y) = 0$ and $\frac{dg}{dy} = 0$ numerically, we obtain the point $(7.56628, 0.802977)$, shown as the black point at the intersection of $g(x,y) = 0$ (blue) and $\frac{dg}{dy} = 0$ (green).

[Figure: the curves $g(x,y) = 0$ (blue) and $\frac{dg}{dy} = 0$ (green), with their intersection marked by a black point]
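One way to sanity-check the reported point is to evaluate both conditions there. The derivative `dg_dy` below is my own product-rule computation from the formula for $g$ (it is not written out in the answer), so it is also compared against a central finite difference.

```python
import math

def g(x, y):
    """g(x, y) = x (x y - 1) exp(-y (x - 1)) + y - 1."""
    return x * (x * y - 1.0) * math.exp(-y * (x - 1.0)) + y - 1.0

def dg_dy(x, y):
    """Partial derivative of g with respect to y (derived by hand here)."""
    return x * math.exp(-y * (x - 1.0)) * (x - (x - 1.0) * (x * y - 1.0)) + 1.0

x0, y0 = 7.56628, 0.802977  # point reported in the answer

# Central finite difference as an independent check of dg_dy.
step = 1e-6
fd = (g(x0, y0 + step) - g(x0, y0 - step)) / (2.0 * step)

print(g(x0, y0), dg_dy(x0, y0), fd)
```

Both values come out close to zero, limited by the six digits to which the point is quoted.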


Define $f(x)=ae^{-ax}(1-ax)+be^{-bx}(1-bx)$ for $x>0$, where $b>a>0$.

The OP is interested in the roots of $f$.

Let $r=\frac ba$.

Lemma 1 $f$ has exactly one root in $(0,\frac2b)$.

Sketch of proof:

  • $f(0)=a+b >0$
  • $\displaystyle{f\left(\frac2b\right)<0}$
  • The derivative of $g_c(x):= ce^{-cx}(1-cx)$ is $g_c'(x)=-c^2e^{-cx}(2-cx)$. Thus, $g_c$ is decreasing on $(0,\frac2c)$, and $f=g_a+g_b$ is decreasing on $(0,\min(\frac2a,\frac2b))=(0,\frac2b)$.

By the intermediate value theorem, there exists at least one zero in $(0,\frac2b)$. Monotonicity implies uniqueness of the zero.
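The three ingredients of the sketch can be spot-checked numerically for sample values of $a$ and $b$. This is only a sanity check on a grid, not a substitute for the argument, and `check_lemma1` and the sample pairs are illustrative choices of mine.

```python
import math

def f(x, a, b):
    """f(x) = a e^{-ax}(1 - ax) + b e^{-bx}(1 - bx)."""
    return (a * math.exp(-a * x) * (1.0 - a * x)
            + b * math.exp(-b * x) * (1.0 - b * x))

def check_lemma1(a, b, n=2000):
    """Check f(0) > 0, f(2/b) < 0 and strict decrease on a grid over [0, 2/b]."""
    xs = [(2.0 / b) * i / n for i in range(n + 1)]
    vals = [f(x, a, b) for x in xs]
    decreasing = all(v1 > v2 for v1, v2 in zip(vals, vals[1:]))
    return vals[0] > 0.0 and vals[-1] < 0.0 and decreasing

for a, b in [(1.0, 2.0), (1.0, 7.0), (1.0, 50.0), (0.3, 5.0)]:
    print(a, b, check_lemma1(a, b))
```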

Lemma 2 $$\text{number of zeroes in $\left[\frac2b,\infty\right)$}= \begin{cases} 0, & r<\rho \\ 1, & r=\rho \\ 2, & r>\rho \\ \end{cases} $$ where $\rho\approx 7.566$ and satisfies $$\exp\left[h(\rho)\left(1-\frac1\rho\right)\right]+\rho^2\cdot\frac{1-h(\rho)}{\rho-h(\rho)}=0,\quad h(t)=\frac12\left(1+t+\sqrt{t^2-6t+1}\right)$$

Ideas:

I have not been able to come up with a formal proof. But intuition and numerical experiments have led me to believe that the critical case occurs when the zero of $f$ is also a stationary point.

Therefore, we want to solve for $\rho=\frac ba$, given that

$$\quad f(x)=ae^{-ax}(1-ax)+be^{-bx}(1-bx)=0$$ $$\implies ae^{-ax}(1-ax)=-be^{-bx}(1-bx)\qquad{(*)}$$

$$\quad f'(x)=-a^2e^{-ax}(2-ax)-b^2e^{-bx}(2-bx)=0$$ $$\implies a^2e^{-ax}(2-ax)=-b^2e^{-bx}(2-bx)$$

Dividing the two equations gives $$a\cdot\frac{2-ax}{1-ax}=b\cdot\frac{2-bx}{1-bx}$$

With a little algebra we get $$ab\cdot x^2-(a+b)x+2=0$$

By quadratic formula, $$x=\frac12\left(\frac1a+\frac1b+\sqrt{\frac1{a^2}+\frac1{b^2}-\frac6{ab}}\right) =\frac1{2b}\left(1+\rho+\sqrt{\rho^2-6\rho+1}\right)=\frac{h(\rho)}b$$

(Again, experiments tell that only the positive root should be considered.)

Hence, $ax=h(\rho)/\rho$ and $bx=h(\rho)$. Substituting this back into $(*)$, $$ae^{-h(\rho)/\rho}\left(1-\frac{h(\rho)}{\rho}\right)=-be^{-h(\rho)}(1-h(\rho))$$

or

$$\exp\left[h(\rho)\left(1-\frac1\rho\right)\right]+\rho^2\cdot\frac{1-h(\rho)}{\rho-h(\rho)}=0$$

This equation has two roots, $\rho\approx7.566$ or $\rho'=\frac1\rho\approx 0.132$. The latter is rejected as $b>a\implies \rho>1$.
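The root $\rho\approx7.566$ can be reproduced with a few lines of bisection. The bracket $[6, 10]$ is my own choice (the left-hand side is negative at $6$ and positive at $10$), and `F` is just a name for the left-hand side of the equation above.

```python
import math

def h(t):
    """h(t) = (1 + t + sqrt(t^2 - 6t + 1)) / 2, real for t >= 3 + 2*sqrt(2)."""
    return 0.5 * (1.0 + t + math.sqrt(t * t - 6.0 * t + 1.0))

def F(rho):
    """Left-hand side of the critical equation for rho."""
    hr = h(rho)
    return math.exp(hr * (1.0 - 1.0 / rho)) + rho * rho * (1.0 - hr) / (rho - hr)

lo, hi = 6.0, 10.0  # F(lo) < 0 < F(hi)
for _ in range(100):
    mid = 0.5 * (lo + hi)
    if F(mid) < 0.0:
        lo = mid
    else:
        hi = mid
rho = 0.5 * (lo + hi)

print(rho, h(rho) / rho)  # critical ratio and the corresponding value of a*x
```

The second printed number is $ax = h(\rho)/\rho$, which can be compared with the $y$-coordinate $0.802977$ found in the other answer.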

Any ideas on how to turn these observations into a formal proof?


23rd October 2019 edit

I'd like to view the whole problem from a slightly different perspective. Alternatively, consider the system $$ \begin{cases} ye^{-yx}(1-yx)+ae^{-ax}(1-ax)=0 \\ y=b \\ y\ge a \end{cases} \qquad{(S)} $$

One advantage of this approach is that the increase in the number of solutions ($1\to 2\to 3$) is illustrated more clearly by the implicit function:

The proof may be thus simplified.

Intuition: What does this graph actually mean? Imagine that you are plotting $y=be^{-bx}(1-bx)+ae^{-ax}(1-ax)$ for a fixed $a$ and a varying $b$. Then:

  1. Take a snapshot of the x-axis of the graph.
  2. Make a small increment to $b$.
  3. Take another snapshot of the x-axis of the graph.
  4. Stack/append this snapshot on top of the previous one.
  5. Repeat.

Then you will get the graph above.

Since we are interested only in the distribution of x-intercepts (i.e. roots) of $y=be^{-bx}(1-bx)+ae^{-ax}(1-ax)$ for different values of $b$, the graph above essentially captures all the information that is of our interest.
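The snapshot procedure can be imitated in text form: fix $a$, step through a few values of $b$, and record where the curve crosses the x-axis. This is a rough grid-based sketch; `roots_on_grid`, the grid, and the sample values of $b$ are my own illustrative choices.

```python
import math

def curve(x, a, b):
    """y = b e^{-bx}(1 - bx) + a e^{-ax}(1 - ax)."""
    return (b * math.exp(-b * x) * (1.0 - b * x)
            + a * math.exp(-a * x) * (1.0 - a * x))

def roots_on_grid(a, b, x_max=3.0, n=30000):
    """Approximate x-intercepts: midpoints of grid cells where the sign flips."""
    roots = []
    x_prev = x_max / n
    v_prev = curve(x_prev, a, b)
    for i in range(2, n + 1):
        x = x_max * i / n
        v = curve(x, a, b)
        if (v_prev > 0.0) != (v > 0.0):
            roots.append(0.5 * (x_prev + x))
        x_prev, v_prev = x, v
    return roots

a = 1.0
for b in (2.0, 5.0, 7.0, 8.0, 10.0, 20.0):
    # One "snapshot of the x-axis" per value of b.
    print(b, [round(r, 4) for r in roots_on_grid(a, b)])
```

Each printed row is one snapshot; stacking the rows by increasing $b$ traces out the branches of the implicit curve.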

A few more words: The implicit function has two branches, and they are separated by $x=\frac2y$. (This observation is indeed a direct consequence of Lemma 1.)

In other words, $$ \begin{cases} ye^{-yx}(1-yx)+ae^{-ax}(1-ax)=0 \\ y>\frac2x \\ y\ge a \end{cases} $$ uniquely defines a 'function', which is the parabola-like branch in the graph.

We will focus on this branch only. Call this branch $Y_a(x)$.


To prove Lemma 2, it suffices to prove the equivalent version

$$\text{The number of solutions of $Y_a(x)=b$ is } \begin{cases} 0, & r<\rho \\ 1, & r=\rho \\ 2, & r>\rho \\ \end{cases} $$

The basic steps of the proof will be:

  1. Prove $\lim_{x\to 0^+}Y_a(x)=+\infty$.
  2. Prove $\lim_{x\to (1/a)^-}Y_a(x)=+\infty$.
  3. $Y_a(x)$ has only one stationary point in $(0,\frac1a)$, and it is a minimum with value $Y_a=a\rho$.

When these three statements are proved, it can be shown that

  1. $Y_a(x)=b$ has no solutions when $r<\rho$, as $b$ is smaller than the minimum of $Y_a$.
  2. $Y_a(x)=b$ has exactly one solution when $r=\rho$, and the solution is at the minimum.
  3. $Y_a(x)=b$ has two solutions when $r>\rho$. Let the coordinates of the minimum be $(k,a\rho)$. Consider $Z(x)=Y_a(x)-b$

    • Since $Z(0^+)=+\infty>0$ and $Z(k)=a\rho-b<0$, by the intermediate value theorem (IVT) $Z(x)$ has a root in $(0,k)$. Moreover, the absence of a local minimum in $(0,k)$ implies that $Z$ is monotonically decreasing there. Therefore, $Z(x)$ has one unique root in $(0,k)$.
    • Similarly, $Z(k)<0$ and $Z\left(\left(\frac1a\right)^-\right)=+\infty>0$, together with the absence of a local maximum in $(k,\frac1a)$, give another unique root in $(k,\frac1a)$.

This is the full outline of the proof. What remains is the proof of the first three statements.


Here is the proof of the first statement:

From the definition of $Y$, $$Y>\frac2x\implies\lim_{x\to 0^+}Y>\lim_{x\to 0^+}\frac2x=+\infty$$

Therefore, $Y$ diverges to $+\infty$ as $x\to0^+$.

Proof of statement 2:

Let $\phi(z)=ze^{-z}(1-z)$.

Then $\phi(xY)+\phi(ax)=0$.

Assume $Y$ is bounded in a left neighbourhood of $x=\frac1a$.

Taking limits on both sides, $$\lim_{x\to(1/a)^-}\phi(xY)=0$$ $$\phi\left(\frac1a\lim_{x\to(1/a)^-}Y\right)=0$$ $$\implies Y\to0\text{ or }Y\to a$$

Neither value satisfies the definition of $Y$: $Y>\frac2x$.

Hence $Y$ is not bounded near $x=\frac1a^-$. If $Y$ diverges to $-\infty$, then $\lim_{x\to(1/a)^-}\phi(xY)=-\infty\ne 0$. Hence, $Y$ diverges to $+\infty$.

The proof of statement 3 will be added soon.


The proof is not hard but long.

Let $w = \lambda_1 \mathrm{e}^y$ and $c = \frac{\lambda_2}{\lambda_1}$. Let $$g(w) = w\mathrm{e}^{-w} + cw \mathrm{e}^{-cw}.$$

We first give the following auxiliary result. The proof is given later.

Fact 1: Let $w_1 = \frac{c+1 - \sqrt{c^2-6c+1}}{2c}$ and $w_2 = \frac{c+1 + \sqrt{c^2-6c+1}}{2c}$. Then we have:

i) $g'(w_1) < 0$ for $c > 3 + 2\sqrt{2}$.

ii) $g'(w_2) = 0$ has exactly one solution on $(3+2\sqrt{2}, \infty)$, denoted by $c_0 \approx 7.566278$.

iii) $g'(w_2) \le 0$ for $3 + 2\sqrt{2} < c \le c_0$.

iv) $g'(w_2) > 0$ for $c > c_0$.

Now let us proceed. We give the results for $c \ge 1$ as follows. The proof is given later.

Lemma 1: If $1\le c \le c_0$, then $g(w)$ is unimodal on $(0, \infty)$. If $c > c_0$, then $g(w)$ has exactly two local maxima on $(0, \infty)$.

For $0 < c \le 1$, simply let $c_1 = \frac{1}{c}$ and $w_1 = cw$ to get $w\mathrm{e}^{-w} + cw\mathrm{e}^{-cw} = c_1w_1\mathrm{e}^{-c_1w_1} + w_1\mathrm{e}^{-w_1}$. Thus, we immediately have the following results:

Lemma 2: If $\frac{1}{c_0} \le c \le 1$, then $g(w)$ is unimodal on $(0, \infty)$. If $0 < c < \frac{1}{c_0}$, then $g(w)$ has exactly two local maxima on $(0, \infty)$.
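The substitution behind Lemma 2 is easy to confirm numerically. In this small sketch, `symmetry_gap` is a hypothetical helper of mine that measures the difference between the two sides of the identity $w\mathrm{e}^{-w} + cw\mathrm{e}^{-cw} = c_1w_1\mathrm{e}^{-c_1w_1} + w_1\mathrm{e}^{-w_1}$ with $c_1 = 1/c$ and $w_1 = cw$.

```python
import math

def g(w, c):
    """g(w) = w e^{-w} + c w e^{-c w}."""
    return w * math.exp(-w) + c * w * math.exp(-c * w)

def symmetry_gap(w, c):
    """|g_c(w) - g_{1/c}(c w)|, which should vanish up to rounding."""
    c1 = 1.0 / c
    w1 = c * w
    return abs(g(w, c) - g(w1, c1))

for c in (0.05, 0.2, 0.9):
    for w in (0.1, 0.5, 1.0, 3.0, 10.0):
        print(c, w, symmetry_gap(w, c))
```

So the case $0 < c \le 1$ is the case $c_1 \ge 1$ seen through a rescaled variable, which is why the thresholds in Lemma 2 are the reciprocals of those in Lemma 1.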


Proof of Fact 1: Note that $\frac{1}{c} < w_1 < w_2 < 1$. We have \begin{align} g'(w_1) &= \mathrm{e}^{-w_1}c(cw_1-1)\big(\frac{1-w_1}{c(cw_1-1)} - \mathrm{e}^{(1-c)w_1}\big), \\ g'(w_2) &= \mathrm{e}^{-w_2}c(cw_2-1)\big(\frac{1-w_2}{c(cw_2-1)} - \mathrm{e}^{(1-c)w_2}\big). \end{align} Let \begin{align} h(w_1) &= \ln \frac{1-w_1}{c(cw_1-1)} - (1-c)w_1, \\ h(w_2) &= \ln \frac{1-w_2}{c(cw_2-1)} - (1-c)w_2. \end{align} Since $c(cw_i-1) > 0$, $g'(w_i)$ has the same sign as $h(w_i)$ for $i = 1, 2$.

For $c\in (3+2\sqrt{2}, \infty)$, let $t = c + \sqrt{c^2-6c+1}$, then we have $t \in ( 3 + 2\sqrt{2}, \infty)$ and $c = \frac{t^2-1}{2t-6}$. With this substitution (actually the so-called Euler substitution), we have \begin{align} h(w_1) &= \ln \frac{2(t-3)^3}{(t+1)^3(t-1)} + \frac{2(t^2-2t+5)}{(t+1)(t-3)} \triangleq h_1(t), \\ h(w_2) &= \ln\frac{8(t-3)}{(t-1)^3(t+1)} + \frac{t^2-2t+5}{2t-2} \triangleq h_2(t). \end{align} We have \begin{align} h_1'(t) &= -\frac{(t^2-6t+1)(t^2-10t+5)}{(t-3)^2(t-1)(t+1)^2}, \\ h_2'(t) &= \frac{(t^2-6t+1)(t^2-4t-1)}{2(t-1)^2(t+1)(t-3)}. \end{align} It is easy to prove that $h_1'(t) > 0$ for $t\in (3+2\sqrt{2}, 5+2\sqrt{5})$ and $h_1'(t) < 0$ for $t\in (5+2\sqrt{5}, \infty)$. Thus, $h_1(t)$ is strictly increasing on $(3+2\sqrt{2}, 5+2\sqrt{5})$ and strictly decreasing on $(5+2\sqrt{5}, \infty)$. Note that $h_1(5+2\sqrt{5}) < 0$. Thus, we have $h_1(t) < 0$ for $t\in (3+2\sqrt{2}, \infty)$. Thus, $g'(w_1) < 0$ for $c\in (3 + 2\sqrt{2}, \infty)$.

Also, it is easy to prove that $h_2'(t) > 0$ for $t\in (3 + 2\sqrt{2}, \infty)$. Thus, $h_2(t)$ is strictly increasing on $(3 + 2\sqrt{2}, \infty)$. Note also that $h_2(3+2\sqrt{2}) < 0$ and $h_2(\infty) = \infty$. Thus, $h_2(t)=0$ has exactly one solution on $(3 + 2\sqrt{2}, \infty)$, denoted by $t_0 \approx 11.15109339$. Also, $h_2(t) < 0$ for $t\in (3+2\sqrt{2}, t_0)$ and $h_2(t) > 0$ for $t\in (t_0, \infty)$. Let $c_0 = \frac{t_0^2-1}{2t_0-6}\approx 7.566278$. We have $g'(w_2) < 0$ for $3 + 2\sqrt{2} < c < c_0$, and $g'(w_2) > 0$ for $c > c_0$. The desired result follows. This completes the proof of Fact 1.
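The constants $t_0$ and $c_0$ can be recomputed directly from $h_2$. The sketch below uses plain bisection on the bracket $[6, 20]$, which is my own choice (any bracket inside $(3+2\sqrt{2}, \infty)$ with a sign change would do, since $h_2$ is increasing there).

```python
import math

def h2(t):
    """h_2(t) = ln(8(t-3) / ((t-1)^3 (t+1))) + (t^2 - 2t + 5) / (2t - 2)."""
    return (math.log(8.0 * (t - 3.0) / ((t - 1.0) ** 3 * (t + 1.0)))
            + (t * t - 2.0 * t + 5.0) / (2.0 * t - 2.0))

lo, hi = 6.0, 20.0  # h2(lo) < 0 < h2(hi)
for _ in range(100):
    mid = 0.5 * (lo + hi)
    if h2(mid) < 0.0:
        lo = mid
    else:
        hi = mid
t0 = 0.5 * (lo + hi)

# Undo the Euler substitution t = c + sqrt(c^2 - 6c + 1).
c0 = (t0 * t0 - 1.0) / (2.0 * t0 - 6.0)
t_back = c0 + math.sqrt(c0 * c0 - 6.0 * c0 + 1.0)

print(t0, c0, t_back)
```

The last value recovers $t_0$ from $c_0$, which confirms that the two forms of the substitution are inverse to each other at this point.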


Proof of Lemma 1: If $c = 1$, we have $g(w) = 2w\mathrm{e}^{-w}$ which is unimodal on $(0, \infty)$.

In the following, assume that $c > 1$. We have $$g'(w) = \mathrm{e}^{-w}\big(1 - w - c(cw-1)\mathrm{e}^{(1-c)w}\big).$$ Clearly, $g'(w) = 0, \ w\in (0, \infty)$ is equivalent to $$\ln \frac{1-w}{c(cw-1)} = (1-c)w, \quad \frac{1}{c} < w < 1.$$ Let $$h(w) = \ln \frac{1-w}{c(cw-1)} - (1-c)w.$$ We have $$h'(w) = \frac{-(c-1)(cw^2 - (c+1)w + 2)}{(cw-1)(1-w)}.$$ There are two possible cases:

1) If $1 < c \le 3+2\sqrt{2}$, we have $cw^2 - (1+c)w + 2 \ge 0$ for $w\in (-\infty, \infty)$, with equality only if $c = 3 + 2\sqrt{2}$ and $w = 2-\sqrt{2}$. Thus, $h'(w)\le 0$ for $\frac{1}{c} < w < 1$, with equality only if $c = 3 + 2\sqrt{2}$ and $w = 2-\sqrt{2}$. Thus, $h(w)$ is strictly decreasing on $\frac{1}{c} < w < 1$. Note also that $h(\frac{1}{c}^{+}) = \infty$ and $h(1^{-}) = -\infty$. Thus, $h(w) = 0$ has exactly one solution on $\frac{1}{c} < w < 1$. Note also that $g'(w) > 0$ for $w\in (0, \frac{1}{c}]$ and $g'(w) < 0$ for $w\in [1, \infty)$. Thus, $g'(w)=0$ has exactly one solution on $(0, \infty)$, where $g'$ changes sign from positive to negative. Thus, $g(w)$ is unimodal on $(0, \infty)$.

2) If $c > 3 + 2\sqrt{2}$, then $h'(w) < 0$ for $w\in (\frac{1}{c}, w_1)$, $h'(w) > 0$ for $w\in (w_1, w_2)$ and $h'(w) < 0$ for $w\in (w_2, 1)$ where $w_1, w_2$ are the two distinct real solutions of $h'(w)=0$ on $(\frac{1}{c}, 1)$ given by $$w_1 = \frac{c+1 - \sqrt{c^2-6c+1}}{2c}, \quad w_2 = \frac{c+1 + \sqrt{c^2-6c+1}}{2c}.$$ Thus, $h(w)$ is strictly decreasing on $(\frac{1}{c}, w_1)$, strictly increasing on $(w_1,w_2)$, and strictly decreasing on $(w_2, 1)$. There are two possible cases:

Case I $3+2\sqrt{2} < c \le c_0$: From Fact 1, we have $h(w_2)\le 0$. Since $h(w_1) < h(w_2) \le 0$ and $h(\frac{1}{c}^{+}) = \infty$, there exists $d\in (\frac{1}{c}, w_1)$ such that $h(d) = 0$, $h(w) > 0$ for $w\in (\frac{1}{c}, d)$, $h(w) < 0$ for $w\in (d, w_2)$ and $h(w) < 0$ for $w\in (w_2, 1)$. Note that for $w\in (\frac{1}{c}, 1)$, $$g'(w) = \mathrm{e}^{-w}c(cw-1)\Big(\frac{1-w}{c(cw-1)} - \mathrm{e}^{(1-c)w}\Big).$$ Thus, $g'(w) > 0$ for $w \in (\frac{1}{c}, d)$, $g'(w)<0$ for $w\in (d, w_2)$ and $g'(w) < 0$ for $w\in (w_2, 1)$. Also, clearly $g'(w) > 0$ for $w\in (0, \frac{1}{c}]$ and $g'(w) < 0$ for $[1, \infty)$. Thus, $g'(w)>0$ for $w\in (0, d)$, $g'(w) < 0$ for $w\in (d, w_2)$ and $g'(w) < 0$ for $w \in [w_2, \infty)$. Thus, $g(w)$ is strictly increasing on $(0, d)$, and strictly decreasing on $(d, \infty)$. Thus, $g(w)$ is unimodal on $(0, \infty)$.

Case II $c > c_0$: From Fact 1, we have $h(w_1) < 0$ and $h(w_2)> 0$. Note also that $h(\frac{1}{c}^{+}) = \infty$. Thus, there exists $d_1\in (\frac{1}{c}, w_1)$, $d_2\in (w_1, w_2)$ and $d_3\in (w_2, 1)$ such that $h(w) > 0$ for $w \in (\frac{1}{c}, d_1)$, $h(w) < 0$ for $w\in (d_1, d_2)$, $h(w) > 0$ for $w\in (d_2, d_3)$ and $h(w) < 0$ for $w\in (d_3, 1)$. Note that for $w\in (\frac{1}{c}, 1)$, $$g'(w) = \mathrm{e}^{-w}c(cw-1)\Big(\frac{1-w}{c(cw-1)} - \mathrm{e}^{(1-c)w}\Big).$$ Thus, $g'(w) > 0$ for $w \in (\frac{1}{c}, d_1)$, $g'(w) < 0$ for $w\in (d_1, d_2)$, $g'(w) > 0$ for $w\in (d_2, d_3)$ and $g'(w) < 0$ for $w\in (d_3, 1)$. Also, clearly $g'(w) > 0$ for $w\in (0, \frac{1}{c}]$ and $g'(w) < 0$ for $[1, \infty)$. Thus, $g'(w)>0$ for $w\in (0, d_1)$, $g'(w) < 0$ for $w\in (d_1, d_2)$, $g'(w) > 0$ for $w\in (d_2, d_3)$ and $g'(w) < 0$ for $w\in (d_3, \infty)$. Thus, $g(w)$ is strictly increasing on $(0, d_1)$, strictly decreasing on $(d_1, d_2)$, strictly increasing on $(d_2, d_3)$, and strictly decreasing on $(d_3, \infty)$. Thus, $g(w)$ has exactly two local maxima on $(0, \infty)$. This completes the proof of Lemma 1.