This arises from the proof of Stochastic (Online) Gradient Descent (SGD) convergence in L. Bottou, "Online Learning and Stochastic Approximations" (1998).
There he shows that a discrete-time stochastic process $X_t$ with increments $D_t = X_t - X_{t-1}$ satisfying $$ \mathbf{E}[D_t \mid \mathcal{F}_t] < C_1 C_2 \alpha^2_t \qquad (1) $$ is convergent, because $$ \sum_{t=1}^{\infty} \mathbf{E}[D_t \mid \mathcal{F}_t] < C_1 C_2 \sum_t \alpha^2_t < \infty. $$ Here $\alpha_t$ is the learning rate and $X_t$ is the loss function in a neural network/MLP framework, i.e. $$ 0 < X_t \overset{a.s.}{\to_t} X^{\infty} < \infty \qquad \text{(Equation 4.30)}. $$ This is then used to prove that the gradient of the loss function converges, $\nabla X_t \overset{a.s.}{\to_t} 0$ (Equations 5.16–5.21), i.e. the algorithm converges to a local minimum.
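As a quick side check (not part of the paper), the summability driving condition (1) can be verified numerically for a typical learning-rate schedule; the choice $\alpha_t = 1/t$ here is my own illustrative assumption:

```python
import math

# With the common schedule alpha_t = 1/t, the series sum_t alpha_t^2
# converges (to pi^2/6), so the bound C1*C2*sum_t alpha_t^2 in Bottou's
# argument is finite for any constants C1, C2.
partial = sum((1.0 / t) ** 2 for t in range(1, 1_000_001))
print(partial)            # partial sum up to t = 10^6
print(math.pi ** 2 / 6)   # the exact limit of the series
```

By contrast, the schedule $\alpha_t = 1/\sqrt{t}$ would make $\sum_t \alpha_t^2$ diverge, which is why the paper's conditions on $\alpha_t$ matter.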
My question is: what if we replace (1), for example, with
$$
\mathbf{E}[X_{t+1}-X_{t}|\mathcal{F}_t] \sim N(0,1)
$$
Will the convergence of $\nabla X_t$ still hold? What I understand is that
$$
\sum_{t=1}^{T}\mathbf{E}[X_{t+1}-X_{t}|\mathcal{F}_t] = 0
$$
since we do not need to worry about the conditions of the Fubini–Tonelli theorem, the sum of infinitely many zero-mean expectations is still $0$. If so, then by the same argument as in the paper, $X_t \overset{a.s.}{\to_t} X^{\infty} < \infty$ and $\nabla X_t \overset{a.s.}{\to} 0$. Is this reasoning correct?
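For intuition, here is a toy simulation under one particular (and much stronger) way of realizing the proposed condition: take the increment $X_{t+1}-X_t$ to be $\mathcal{F}_t$-measurable and drawn $N(0,1)$, so that $\mathbf{E}[X_{t+1}-X_t \mid \mathcal{F}_t] \sim N(0,1)$ exactly. This modeling choice is my assumption, not something the condition forces:

```python
import random

random.seed(0)

def simulate(T):
    # Toy process: each increment is an N(0,1) draw made before the step,
    # so the conditional mean of the increment is that N(0,1) value itself.
    # The resulting X_T is a Gaussian random walk with variance T.
    x = 0.0
    for _ in range(T):
        x += random.gauss(0.0, 1.0)
    return x

# Spread of endpoints across independent runs: it grows like sqrt(T)
# rather than shrinking, i.e. the paths do not settle toward a limit.
samples = [simulate(10_000) for _ in range(200)]
spread = max(samples) - min(samples)
print(spread)
```

Under this realization the process behaves like a random walk rather than a convergent sequence, which is the behavior the summability argument in the paper is designed to rule out.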