I understand how sequential gradient descent works, but I fail to understand the update equation itself, i.e. how the next, better weight $w_j$ is calculated. I can't visualize it graphically.
What exactly does the second term in the equation do? Is it the slope at the selected point, the tangent at that point, or something else?

To get an intuitive idea of what's going on, consider the simple example of minimising the convex single-variable function $J(w) = (w-4)^2$, where $w \in \mathbb{R}$. Clearly, this function attains its minimum $J^*=0$ at $w^* = 4$. Note that $\frac{dJ}{dw} = 2w - 8$. Hence, our update equation reads:
$$ w_{i+1} := w_{i} - \alpha (2w_{i}-8) $$
Now, let's imagine two scenarios: in the first, you have some $w_i$ which is bigger than 4. Note that, in such a case, $\frac{dJ}{dw}>0$. Hence, the update formula will make sure that $w_{i+1}<w_{i}$, in order to head towards the minimiser ($w=4$), assuming $\alpha$ is positive of course. In the second scenario, i.e., $w_i <4$, we'll have $\frac{dJ}{dw}<0$, and the update formula will make sure that $w_{i+1}>w_{i}$, again heading towards the minimiser.
Finally, $\alpha$ controls how big the difference between consecutive steps is. The following two pictures depict the objective function we're trying to minimise, and the $w_i$'s computed by the iteration formula for $w_0 = 7$ and two different values, $\alpha = 0.8$ and $\alpha = 0.01$, for the upper and lower pictures, respectively. Note that, with a bigger $\alpha$, you'll probably bounce back and forth around the minimiser.
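You can also reproduce these two behaviours numerically. Here is a minimal sketch of the iteration $w_{i+1} := w_i - \alpha\,(2w_i - 8)$ (the function name and parameters are just illustrative choices, not part of any library):

```python
def gradient_descent(w0, alpha, steps):
    """Iterate w_{i+1} = w_i - alpha * dJ/dw for J(w) = (w - 4)^2
    and return the whole sequence of iterates."""
    ws = [w0]
    for _ in range(steps):
        w = ws[-1]
        ws.append(w - alpha * (2 * w - 8))  # dJ/dw = 2w - 8
    return ws

# Large step size: each update overshoots w* = 4 and bounces
# back and forth around it (7 -> 2.2 -> 5.08 -> ...).
print(gradient_descent(7.0, 0.8, 5))

# Small step size: small, monotone steps towards w* = 4
# (7 -> 6.97 -> 6.9406 -> ...), but many more iterations needed.
print(gradient_descent(7.0, 0.01, 5))
```

Since $w_{i+1} - 4 = (1 - 2\alpha)(w_i - 4)$ here, the iterates alternate sides of the minimiser whenever $1 - 2\alpha < 0$ (e.g. $\alpha = 0.8$) and approach it from one side when $0 < 1 - 2\alpha < 1$ (e.g. $\alpha = 0.01$).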