A basic question on stochastic gradient descent


Consider a stochastic gradient iteration:

$$\theta_{k+1} = \theta_{k} - \gamma_k F(\theta_k)$$

where $F$ is a noisy estimate of the gradient $\nabla f$.
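For concreteness, here is a minimal sketch of that iteration in Python; the one-dimensional quadratic objective $f(\theta) = \tfrac12\theta^2$ and the Gaussian gradient noise are purely illustrative assumptions, not part of the question:

```python
import numpy as np

rng = np.random.default_rng(0)

def grad_f(theta):
    # true gradient of the illustrative objective f(theta) = 0.5 * theta**2
    return theta

def F(theta):
    # noisy gradient estimate: true gradient plus zero-mean Gaussian noise
    return grad_f(theta) + rng.normal(scale=0.1)

theta = 5.0
for k in range(1, 1001):
    gamma_k = 1.0 / k              # diminishing step size gamma_k
    theta = theta - gamma_k * F(theta)
```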

Now, a book says that it converges in the following sense: $f(\theta_k)$ converges and $\nabla f(\theta_k)$ converges to zero, and it adds that this is the strongest possible result for gradient-related stochastic approximation.

What does this mean? Why doesn't it show convergence of the iterates $\theta_k$ themselves?


There is 1 answer below.

Best answer:

The statement is phrased this way because, with noisy gradient estimates, the iterates $\theta_k$ need not settle at the minimizer: they may keep fluctuating in a small neighborhood around it rather than converging to the point itself. That is why the guarantee is given in terms of $f(\theta_k)$ and $\nabla f(\theta_k)$ rather than convergence of the iterates.
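A quick way to see this numerically is to run the same iteration with a constant step size and look at the late iterates: $f(\theta_k)$ flattens out and the gradient is small on average, but $\theta_k$ keeps jittering in a band around the minimizer. A minimal sketch, again assuming the illustrative quadratic $f(\theta) = \tfrac12\theta^2$ with Gaussian gradient noise (my assumption, not the book's setting):

```python
import numpy as np

rng = np.random.default_rng(1)

def noisy_grad(theta, noise=0.1):
    # gradient of f(theta) = 0.5 * theta**2 plus zero-mean Gaussian noise
    return theta + rng.normal(scale=noise)

theta = 5.0
gamma = 0.05                       # constant step size
tail = []                          # record the late iterates
for k in range(5000):
    theta -= gamma * noisy_grad(theta)
    if k >= 4000:
        tail.append(theta)

tail = np.array(tail)
# the iterates do not settle at theta = 0; they hover in a small band around it
print("mean of late iterates:", tail.mean())
print("std  of late iterates:", tail.std())
```

With a diminishing step size satisfying the usual conditions ($\sum_k \gamma_k = \infty$, $\sum_k \gamma_k^2 < \infty$), that band shrinks, which is exactly the regime the book's convergence statement addresses.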

Take a look at this video on stochastic gradient descent and it should clear things up: https://class.coursera.org/ml-005/lecture/105