Gradient norm in a neural network is bounded?


Consider a fully connected neural network with a single hidden layer, $f(x,w) = w^T_2 \sigma(w^T_1 x)$, where $w = [ w_2, w_1 ]$ are the network's parameters and $\sigma$ is an activation function (e.g. tanh, sigmoid, ReLU). Let $l(f(x,w), y)$ be the binary cross-entropy loss. Is it true that, for a particular data point $x$, the gradient norm of the loss is bounded over all choices of model parameters $w$, i.e. does there exist a constant $C$ such that $\| \nabla_w l(f(x,w),y)\|_2 \leq C$ for all $w$?
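As a quick empirical probe of the claim (not a proof either way), here is a numpy sketch that computes the gradient analytically for this one-hidden-layer model with tanh activation and a sigmoid + binary cross-entropy head, then scales $w_2$ by increasingly large factors $t$ while keeping $w_1$ and $x$ fixed. The dimensions, seed, and the choice $y = 0$ are arbitrary assumptions made for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
d, h = 3, 4                       # input and hidden dimensions (arbitrary)
x = rng.normal(size=d)            # a fixed data point
w1 = rng.normal(size=(d, h))      # first-layer weights, held fixed
w2_base = rng.normal(size=h)      # second-layer weights, to be rescaled

z = w1.T @ x                      # hidden pre-activations
a = np.tanh(z)                    # hidden activations, sigma = tanh
# Flip the sign so the logit is positive; with y = 0 the error term
# p - y then stays bounded away from zero as t grows.
if w2_base @ a < 0:
    w2_base = -w2_base
y = 0.0

def grad_norm(t):
    """Norm of the gradient of BCE loss w.r.t. (w1, w2) at w2 = t * w2_base."""
    w2 = t * w2_base
    f = w2 @ a                            # network output (logit)
    p = 1.0 / (1.0 + np.exp(-f))          # sigmoid probability for BCE
    dldf = p - y                          # dl/df for binary cross entropy
    g_w2 = dldf * a                       # gradient w.r.t. w2 (stays bounded)
    g_w1 = dldf * np.outer(x, w2 * (1.0 - np.tanh(z) ** 2))  # grows with t
    return np.sqrt(np.sum(g_w2 ** 2) + np.sum(g_w1 ** 2))

norms = [grad_norm(t) for t in (1.0, 10.0, 100.0)]
print(norms)
```

In this configuration the $w_1$-gradient carries a factor of $w_2$, so with $y = 0$ and a positive logit the error term $p - y$ does not vanish and the norm grows roughly linearly in $t$, suggesting no single constant $C$ can work for all $w$.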