Take the rather simple and restrictive setting where we have an objective function $f: \mathbb{R}^n \rightarrow \mathbb{R}$ that is twice differentiable, strongly convex, and has a Lipschitz continuous gradient, i.e., $f$ is such that $$ \ell I \le \nabla^2 f(x) \le L I $$ where $I$ is the identity matrix and $0 < \ell < L$. Call its unique minimizer $x^*$.
Suppose we were only given a sequence of independent random variables $\{ X_1, X_2, \dots, X_n \}$ such that $E_{X_1}[ X_1 ] = \nabla f( x_1)$, $E_{X_2}[ X_2 ] = \nabla f( x_2)$, and so forth, and we were to apply stochastic gradient descent from an arbitrary point $x_0$ to the objective function $f$, that is, recursively computing
$$x_{k+1} = x_k - \alpha X_k $$
and we want to prove convergence of this scheme, could we just proceed as follows:
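Before the proof attempt, here is a minimal numerical sketch of the scheme, under an assumed one-dimensional toy problem ($f(x) = \frac{c}{2} x^2$, so $\nabla f(x) = c x$, $x^* = 0$, and $\ell = L = c$; the curvature $c$, noise level, and stepsize are all illustrative choices, not from the setting above):

```python
import random

# Assumed toy problem: f(x) = 0.5 * c * x**2 in one dimension, so
# grad f(x) = c * x and x* = 0, with ell = L = c.  The stochastic
# gradient X_k is the true gradient plus zero-mean noise, so
# E[X_k] = grad f(x_k) as required by the setting above.
random.seed(0)
c = 2.0       # curvature; here ell = L = c
alpha = 0.25  # stepsize satisfying 0 < alpha < 2/L
x = 5.0       # arbitrary starting point x_0

for k in range(1000):
    X_k = c * x + random.gauss(0.0, 1.0)  # unbiased: E[X_k] = c * x_k
    x = x - alpha * X_k                   # x_{k+1} = x_k - alpha * X_k

# With a constant stepsize the iterate does not converge to x* exactly;
# it fluctuates around x* = 0 at a noise floor of order alpha.
print(abs(x))
```

Note that a single run only hovers near $x^*$ because of the gradient noise, which is why the argument below is phrased in terms of the expectation $E[x_k]$ rather than $x_k$ itself.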
Notice that from the mean value theorem we have $$\nabla f(x_k) = \nabla f(x_k) - \nabla f( x^*) = \nabla^2 f(c) ( x_k - x^* ) $$ where $c$ is a point between $x_k$ and $x^*$. So, taking expectations with respect to the joint distribution of $X_1, X_2, \dots, X_n$, we obtain $$ E[\nabla f(x_k)] = E[\nabla^2 f(c)] ( E[x_k] - x^* ) $$
where $\ell I \le E[\nabla^2 f(c)] \le LI$. At this point, notice that
$$ || E[x_{k+1}] - x^* || = || E[x_{k}] - \alpha E[ X_k ] - x^* || = || E[x_{k}] - \alpha E[\nabla f(x_k) ] - x^* || \le || I - \alpha E[\nabla^2 f(c)] || \, || E[x_{k}] - x^* || $$ and utilizing the fact that $|| I - \alpha E[\nabla^2 f(c)] || = \max_i |1 - \alpha \lambda_i | \le \max\{ |1- \alpha \ell |, |1- \alpha L | \}$, where the $\lambda_i$ are the eigenvalues of $E[\nabla^2 f(c)]$, we can choose the stepsize $0 < \alpha < 2/L$ so that this maximum is strictly less than $1$, and conclude that $$ \lim_{k \rightarrow \infty} || E[x_{k}] - x^* || = 0$$
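The contraction of the mean iterate can be checked numerically by averaging many independent runs, again on an assumed one-dimensional toy problem ($f(x) = \frac{c}{2} x^2$ with $x^* = 0$; all constants here are illustrative). The Monte Carlo average approximates $E[x_k]$, which should shrink geometrically at rate $|1 - \alpha c|$:

```python
import random

# Assumed toy problem: f(x) = 0.5 * c * x**2, grad f(x) = c * x, x* = 0.
# Average many independent SGD runs to approximate E[x_k], and compare
# ||E[x_k] - x*|| against the predicted geometric decay |1 - alpha*c|^k.
random.seed(1)
c, alpha, x0 = 2.0, 0.25, 5.0   # stepsize satisfies 0 < alpha < 2/L, L = c
n_runs, n_steps = 20000, 30

mean_x = [0.0] * (n_steps + 1)
for _ in range(n_runs):
    x = x0
    mean_x[0] += x
    for k in range(n_steps):
        x = x - alpha * (c * x + random.gauss(0.0, 1.0))  # x - alpha * X_k
        mean_x[k + 1] += x
mean_x = [m / n_runs for m in mean_x]   # Monte Carlo estimate of E[x_k]

rate = 1.0 - alpha * c                  # contraction factor, here 0.5
print(abs(mean_x[10]))                  # should be close to x0 * rate**10
print(x0 * rate ** 10)
```

While each individual trajectory only reaches a noise floor, the averaged iterate $E[x_k]$ keeps contracting toward $x^*$, which is exactly (and only) what the argument above establishes.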
Would this be a standard proof for the convergence of stochastic gradient descent? I recall it being longer.