Does this lecture note contain an incorrect inequality about convergence of Gradient Descent?

123 Views Asked by At

I'm reading a lecture note about convergence of Gradient Descent. My problem lies in Theorem 6.1

enter image description here

and its proof

enter image description here enter image description here enter image description here enter image description here

Could you please explain how the author gets the last inequality $$f\left(x^{(k)}\right)-f\left(x^{*}\right) \leq \frac{1}{k} \sum_{i=1}^{k} \left (f\left(x^{(i)}\right)-f\left(x^{*}\right) \right)$$

It seems to me that the correct inequality is actually the opposite one, i.e., $$f(x^{(k)})-f(x^{*}) \ge\frac{1}{k} \sum_{i=1}^{k} \left (f(x^{(i)})-f(x^{*}) \right)$$

My reasoning is as follows: Because $f$ decreasing on every iteration, $$f(x^{(k)})-f(x^{*}) \ge f(x^{(i)})-f(x^{*}) \ge$$ for all $i \le k$. It follows that $$\sum_{i=1}^{k} (f(x^{(k)})-f(x^{*})) \ge \sum_{i=1}^{k} (f(x^{(i)})-f(x^{*}))$$ Thus $$k(f(x^{(k)})-f(x^{*})) \ge \sum_{i=1}^{k} (f(x^{(i)})-f(x^{*}))$$

Thank you so much for your help!

1

There are 1 best solutions below

0
On BEST ANSWER

I just formulate what @PhoemueX's explanation in the comment to answer this question.


It follows from $f$ decreases after each step that $$f(x^\star) \le f(x_{k+1}) \le f(x_k) \le \cdots \le f(x_0)$$

Then $$f(x_{k+1})-f(x^\star) \le f(x_n)-f(x^\star)$$ for all $n \le k$. Consequently, $$f(x_{k+1})-f(x^\star) \le \frac{1}{k+1} \sum_{n=0}^k (f(x_{n+1})-f(x^\star)) \le \frac{L\|x_0-x^\star\|^2}{2(k+1)}$$ Finally, we get $$f(x_{k+1})-f(x^\star) \le \frac{L\|x_0-x^\star\|^2}{2(k+1)}$$