Consider a stochastic gradient iteration:
$$\theta_{k+1} = \theta_{k} - \gamma_k F(\theta_k)$$
where $F(\theta_k)$ is a noisy estimate of the gradient $\nabla f(\theta_k)$.
Now, a book says that the iteration converges in the following sense: $f(\theta_k)$ converges and $\nabla f(\theta_k) \to 0$. It then claims that this is the strongest possible result for gradient-related stochastic approximation.
What does this mean? Why does it not show convergence of the iterates $\theta_k$ themselves?
The result is stated this way because the iterates need not converge to a single point. Since $\nabla f(\theta_k) \to 0$, the iterates approach the set of stationary points, but the noise can keep them wandering within that set; with a constant step size, they typically do not settle exactly on a minimizer at all, but keep bouncing around in a small neighborhood of it.
Take a look at this video on stochastic gradient descent and it should clear things up: https://class.coursera.org/ml-005/lecture/105
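To make the "hovers in a neighborhood" behavior concrete, here is a small sketch (my own illustration, not from the book) on $f(\theta) = \theta^2/2$, whose exact gradient is $\theta$. The noisy estimate $F(\theta_k)$ is taken to be the gradient plus Gaussian noise, which is an assumption for illustration. With a constant step size the iterate keeps jittering around the minimizer $0$; with diminishing step sizes $\gamma_k = 1/(k+1)$ (which satisfy $\sum_k \gamma_k = \infty$ and $\sum_k \gamma_k^2 < \infty$) the noise is averaged out and the iterate homes in on $0$:

```python
import random

random.seed(0)

def noisy_grad(theta):
    # exact gradient of f(theta) = theta**2 / 2 is theta;
    # add zero-mean Gaussian noise to mimic a stochastic estimate
    return theta + random.gauss(0.0, 1.0)

# Constant step size: the iterate reaches a "noise ball" around 0
# and keeps fluctuating there instead of converging to 0 exactly.
theta = 5.0
for k in range(10000):
    theta -= 0.1 * noisy_grad(theta)
const_final = theta

# Diminishing step sizes gamma_k = 1/(k+1): the shrinking steps
# average out the noise, so the iterate settles very close to 0.
theta = 5.0
for k in range(10000):
    theta -= noisy_grad(theta) / (k + 1)
dim_final = theta

print("constant step, final theta:", const_final)
print("diminishing step, final theta:", dim_final)
```

Running this, the constant-step iterate typically ends up a few tenths away from $0$ (its fluctuation scale is roughly $\sqrt{\gamma/(2-\gamma)}$ here), while the diminishing-step iterate is orders of magnitude closer. Note that even in the diminishing-step case the general theorem only guarantees $\nabla f(\theta_k) \to 0$, not that $\theta_k$ converges to one point; convergence of the iterates needs extra assumptions (e.g. isolated minimizers or strong convexity).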