So we have gradient descent: $$x^{(i+1)} = x^{(i)} - \tau \nabla f(x^{(i)})$$
And we need to show that $$\left|\nabla f\left(x^{(j)}\right)\right| \to 0 \quad \text{as } j \to \infty$$
The conditions are:
$f: \mathbb R^n \to \mathbb R$ is differentiable
$\nabla f: \mathbb R^n \to \mathbb R^n$ is Lipschitz continuous, so there exists an $L > 0$ such that: $$|\nabla f(x)-\nabla f(y)| \leq L|x-y|,\ \forall x,y \in \mathbb R^n$$
$0 < \tau < \frac{1}{L}$
I've been trying to solve this for like an hour and can't get any further, can someone point me in the right direction?
I've been trying the following:
Assume there exists an $\epsilon > 0$ such that $|\nabla f(x^{(j)})| \geq \epsilon$ for infinitely many $j$. But I haven't been able to derive any contradiction with the assumptions.
Start by proving that:
$$f(y) \le f(x) + \nabla f(x)^\intercal (y-x) + \frac L2 \left\|x-y\right\|^2$$
and then, taking $x = x^{(t)}$ and $y = x^{(t+1)} = x^{(t)} - \tau \nabla f\left(x^{(t)}\right)$, you will have:
\begin{align} f\left(x^{(t+1)}\right) \le f\left(x^{(t)}\right) - \tau \left(1 - \frac12\tau L\right)\left\|\nabla f\left(x^{(t)}\right)\right\|^2 \end{align}
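For reference, here is one way to get both inequalities, sketched under the Lipschitz assumption on $\nabla f$. Since $\nabla f$ is continuous, the fundamental theorem of calculus applies to $s \mapsto f(x + s(y-x))$ on $[0,1]$, and Cauchy-Schwarz gives:
\begin{align} f(y) - f(x) - \nabla f(x)^\intercal (y-x) &= \int_0^1 \left(\nabla f(x + s(y-x)) - \nabla f(x)\right)^\intercal (y-x)\, ds\\ &\le \int_0^1 \left\|\nabla f(x + s(y-x)) - \nabla f(x)\right\| \left\|y-x\right\| ds\\ &\le \int_0^1 L s \left\|y-x\right\|^2 ds = \frac L2 \left\|x-y\right\|^2 \end{align}
With $x = x^{(t)}$ and $y - x = -\tau \nabla f\left(x^{(t)}\right)$, the two terms on the right of the first inequality become:
\begin{align} \nabla f\left(x^{(t)}\right)^\intercal (y-x) + \frac L2 \left\|x-y\right\|^2 &= -\tau \left\|\nabla f\left(x^{(t)}\right)\right\|^2 + \frac{L\tau^2}{2} \left\|\nabla f\left(x^{(t)}\right)\right\|^2\\ &= -\tau \left(1 - \frac12\tau L\right)\left\|\nabla f\left(x^{(t)}\right)\right\|^2 \end{align}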
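If $f$ is also bounded below by some $\mu$ (this is needed: without it the claim can fail, e.g. for a nonzero linear $f$), then summing over $t = 0, \dots, T$ gives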
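\begin{align} \tau \left(1 - \frac12\tau L\right) \sum_{t=0}^T \left\|\nabla f\left(x^{(t)}\right)\right\|^2 &\le \sum_{t=0}^T \left(f\left(x^{(t)}\right) - f\left(x^{(t+1)}\right)\right)\\ &= f\left(x^{(0)}\right) - f\left(x^{(T+1)}\right)\\ &\le f\left(x^{(0)}\right) - \mu < \infty \end{align}
Since $0 < \tau < \frac1L$, the factor $\tau \left(1 - \frac12\tau L\right)$ is positive (it is larger than $\frac\tau2$), and the right-hand side does not depend on $T$. So the series $\sum_{t=0}^\infty \left\|\nabla f\left(x^{(t)}\right)\right\|^2$ converges, and its terms must go to zero, i.e. $\left\|\nabla f\left(x^{(t)}\right)\right\| \to 0$.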
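If you want to sanity-check the two inequalities numerically, here is a minimal sketch. The test function $f(x) = \cos(x_1) + \frac12 \sin(2x_2)$, the starting point, the step size, and all helper names are my own choices for illustration, not part of the problem. This $f$ is non-convex, bounded below by $\mu = -1.5$, and its Hessian is diagonal with entries of absolute value at most $2$, so $L = 2$ works.

```python
import math

# Illustrative (assumed) test function: f(x) = cos(x[0]) + 0.5*sin(2*x[1]).
# Its gradient is Lipschitz with L = 2 and f is bounded below by MU = -1.5.
def f(x):
    return math.cos(x[0]) + 0.5 * math.sin(2.0 * x[1])

def grad_f(x):
    return [-math.sin(x[0]), math.cos(2.0 * x[1])]

def sq_norm(v):
    return sum(c * c for c in v)

def gradient_descent(x0, tau, steps):
    """Return the iterates x^(0), ..., x^(steps) of x <- x - tau * grad f(x)."""
    xs = [list(x0)]
    for _ in range(steps):
        x = xs[-1]
        g = grad_f(x)
        xs.append([xi - tau * gi for xi, gi in zip(x, g)])
    return xs

L = 2.0
MU = -1.5
tau = 0.4                           # satisfies 0 < tau < 1/L = 0.5
c = tau * (1.0 - 0.5 * tau * L)     # the factor tau * (1 - tau*L/2)

xs = gradient_descent([1.0, 0.3], tau, 200)
grad_sq = [sq_norm(grad_f(x)) for x in xs]

# Per-step decrease: f(x^(t+1)) <= f(x^(t)) - c * ||grad f(x^(t))||^2
# (a small tolerance absorbs floating-point rounding).
descent_ok = all(
    f(xs[t + 1]) <= f(xs[t]) - c * grad_sq[t] + 1e-12
    for t in range(len(xs) - 1)
)

# Summed bound: c * sum_t ||grad f(x^(t))||^2 <= f(x^(0)) - MU
sum_ok = c * sum(grad_sq[:-1]) <= f(xs[0]) - MU + 1e-12

final_grad_norm = math.sqrt(grad_sq[-1])
print(descent_ok, sum_ok, final_grad_norm)
```

Both checks should come out true and the final gradient norm should be tiny. This is only a consistency check on one example, not a substitute for the proof.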