Solution verification: Divergence of gradient descent (recursive sequence)

104 Views Asked by At

Consider the recursion: $$x_{k+1} = x_k - \frac{\alpha_0}{k^p} \left( e^{x_k} - e^{-x_k} \right)$$

for finite $p$ and $\alpha_0 > 0$.

Show that when $x_1$ is large enough, $x_k$ grows super-exponentially, for all $k$:

$$\log\left(\left| \frac{x_{k+1}}{x_k} \right |\right) \geq 2^k$$

What I've Tried

It's actually not clear to me that the argument in the log above is always positive, for large $x_1$ we have:

$$x_2 = x_1 - \alpha_0\left(e^{x_1} - e^{-x_1}\right)$$

For large enough $x_1$ I'd expect the second term to be larger than the first term, so how can $x_1$ and $x_2$ have the same sign?

EDIT: I updated the above to include an absolute value in the argument of the logarithm.

I initially tried bounding the exponentials with linear terms, but that is too coarse.

First Ill try to show that $x_k$ is an increasing sequence.

Checking the condition for $|x_2|$:

$$|x_2| = |x_1 - \alpha_0(e^{x_1}-e^{-x_1})|$$

is larger than $|x_1|$ when $x_1 \leq 0$ or $\alpha_0/1^p(e^{x_1}-e^{-x_1}) \geq 2x_1$ or $\alpha_0/1^p(e^{x_1}-e^{-x_1}) \leq 2x_1$. These conditions can be met when $x_1$ is large enough in magnitude. For $|x_k|$ we get the same expression in terms of $|x_{k-1}|$ instead of in terms of $x_1$. Since our condition is met when $|x_{k-1}|$ is large enough, we can ignore the signs and simply choose a $x_1$ such that $|x_k|$ is large enough. This can be done for any $k$.

Since $x_k = O(e^{x_{k-1}})$ we can iterate this to arrive at $x_k = O(e^{e^{x_{k-2}}})$ which shows that $|x_k|$ grows superexponentially.

I could use some feedback on my arguments.

This is example 3 in [1]

References

[1] Asi, H., Duchi, J. (2019). The importance of better models in stochastic optimization arXiv https://arxiv.org/abs/1903.08619

1

There are 1 best solutions below

2
On BEST ANSWER

$\color{brown}{\textbf{Preliminary notes.}}$

Since the derivative $$(z\ln a - c\ln z)' = \ln a -\dfrac cx$$ has the single root $\;z_m = \dfrac c{\ln a},\;$ then the function $\;a^z - z^c\;$ increases when $z\in(z_m, \infty).$

Let $$b=4\left(\ln\,\frac{11}{10}\right)^{-1}\,\max\left(1,\,\ln\,\dfrac2{\alpha_0},\,p \right),\tag1$$ then under the condition $\;y \ge \max(b,k)\;$ are valid the inequalities

  • $\left(\frac{11}{10}\right)^y > \dfrac2{\alpha_0};$
  • $\left(\frac{11}{10}\right)^y > y;$
  • $\left(\frac{11}{10}\right)^y > k^p;$
  • $\;e^y - e^{-y} > \left(\dfrac83\right)^y > 2^y\left(\dfrac{11}{10}\right)^{3y} > \dfrac{2y\, k^p}{\alpha_0}\,2^y.\tag2$

$\color{brown}{\textbf{Recurrence relations for the majorants.}}$

Assuming $\;x_1 = b,\;$ one can get $$x_{2} = x_1 - \dfrac{\alpha_0}{1^p}\left(e^{x_1}-e^{-x_1}\right) < x_1 - 2x_1\,2^{x_1},$$ $$|x_2| > |x_1|\,2^{|x_1|} > \max(b,2),\tag3$$ $$x_{3} = x_2 + \dfrac{\alpha_0}{2^p}\left(e^{-x_2}-e^{x_2}\right) > -|x_2| + 2|x_2|\,2^{|x_2|},$$ $$|x_3| > |x_2|\,2^{|x_2|} > \max(b,3),\dots,\tag4$$ $$|x_{k+1}| > |x_k|\,2^{|x_k|} > \max(b,k+1).\tag5$$

From $(5)$ should $$\ln|x_{k+1}| - \ln |x_k| > |x_k| \ln2 > |x_{k-1}|\,2^{|x_{k-1}|} \ln2 > k\,2^k\,\ln2,$$

$$\color{brown}{\mathbf{\ln\left|\dfrac{x_{k+1}}{x_k}\right| > 2^k.}}$$