Minimizing univariate quadratic via gradient descent — choosing the step size


I'm learning the gradient descent method and I have seen different (and contradictory) advice in my references.

I have the following function

$$f(x) = 2x^2 - 5x$$

and I have to calculate some iterations of gradient descent from $x_0 = 1$. So, I calculate the function at $x_0$, the derivative of the function at $x_0$ and now I have to apply the formula

$$x_1 = x_0 - \alpha \cdot f'(x_0)$$

Is $\alpha$ chosen arbitrarily, or do I have to choose it so that the update drives the derivative to $0$? I'm quite confused.
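To make the update concrete, here is a minimal sketch of a single step from $x_0 = 1$, using an illustrative step size $\alpha = 0.1$ (the choice of $\alpha$ is exactly what the answers below discuss):

```python
# One gradient-descent step on f(x) = 2x^2 - 5x from x0 = 1.
# alpha = 0.1 is only an illustrative choice here.
def f(x):
    return 2 * x**2 - 5 * x

def df(x):
    return 4 * x - 5  # derivative of f

alpha = 0.1
x0 = 1.0
x1 = x0 - alpha * df(x0)  # x1 = 1 - 0.1 * (-1) = 1.1
print(x1)
```

Since $f'(1) = -1 < 0$, the step moves $x$ to the right, toward the minimizer.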


Best answer:

The way you choose $\alpha$ depends, in general, on the information you have about your function. For the function in your example, the derivative is

$$ f'(x) = 4x - 5 $$

and $f''(x) = 4$, so $f'$ is Lipschitz continuous with Lipschitz constant $L = 4$. You should then choose $\alpha$ to be smaller than $1/L$, so, in this case, $\alpha < 0.25$.
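As a sketch of this rule in action, the following runs gradient descent with $\alpha = 0.2 < 1/L = 0.25$. The minimizer can be checked by hand: $f'(x) = 4x - 5 = 0$ gives $x^* = 5/4 = 1.25$.

```python
# Gradient descent on f(x) = 2x^2 - 5x with alpha = 0.2 < 1/L = 0.25.
def df(x):
    return 4 * x - 5

alpha = 0.2
x = 1.0
for _ in range(50):
    x = x - alpha * df(x)
print(x)  # approaches the minimizer x* = 1.25
```

Each update here is $x \mapsto x - 0.2(4x - 5) = 0.2x + 1$, a contraction with rate $0.2$, so convergence to $1.25$ is fast.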

In general, you might not know $L$. Then you have to resort to a line search method (e.g., exact line search or the Armijo rule).
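A minimal backtracking (Armijo) line search can be sketched as follows; the constants `c = 1e-4` and `rho = 0.5` are common illustrative choices, not prescribed by the answer:

```python
# Backtracking (Armijo) line search: shrink alpha until the
# sufficient-decrease condition f(x - a*g) <= f(x) - c*a*g^2 holds.
def f(x):
    return 2 * x**2 - 5 * x

def df(x):
    return 4 * x - 5

def armijo_step(x, c=1e-4, rho=0.5, alpha=1.0):
    g = df(x)
    while f(x - alpha * g) > f(x) - c * alpha * g * g:
        alpha *= rho
    return alpha

x = 1.0
for _ in range(30):
    x = x - armijo_step(x) * df(x)
print(x)  # close to the minimizer 1.25
```

The point of the backtracking loop is that no knowledge of $L$ is needed: the step size adapts until the trial point actually decreases $f$ enough.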

You can read Chapter 3 of Nocedal and Wright, Numerical Optimization.

Second answer:

According to the Wikipedia article on gradient descent, $\alpha$ is a positive real number. You should choose a small $\alpha$, such as $\alpha = 0.1$ in your case, to avoid stepping past the minimum.
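For this particular quadratic one can check the overshooting claim directly: the update is $x \mapsto (1 - 4\alpha)x + 5\alpha$, which converges iff $|1 - 4\alpha| < 1$, i.e. $\alpha < 2/L = 0.5$. A sketch contrasting a small and a too-large step:

```python
# Compare alpha = 0.1 (converges) with alpha = 0.6 > 0.5 (diverges)
# on f(x) = 2x^2 - 5x, whose minimizer is x* = 1.25.
def step(x, alpha):
    return x - alpha * (4 * x - 5)

x_small = 1.0
for _ in range(100):
    x_small = step(x_small, 0.1)

x_large = 1.0
for _ in range(20):
    x_large = step(x_large, 0.6)

print(x_small)              # close to 1.25
print(abs(x_large - 1.25))  # large: each step overshoots further
```

With $\alpha = 0.6$ the error is multiplied by $|1 - 2.4| = 1.4$ every iteration, so the iterates bounce across the minimum with growing amplitude.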