Can gradient descent be written without time step?


I am trying to learn gradient descent for machine learning. In this highly cited research paper https://arxiv.org/pdf/1609.04747.pdf, the author presents gradient descent as

$$\theta = \theta - \eta \nabla_\theta J(\theta)$$

I have never seen this expression before. Is this some analytical formula for calculating the variables $\theta$? Wouldn't the $\theta$ be cancelled out? I am confused, please help.

As @CogitoErgoCogitoSum mentioned in the comments, the expression is an update rule, not an equation to solve, and the iteration is more clearly written with an explicit index: $$ \theta^{k+1} = \theta^k - \eta \nabla J(\theta^k). $$ Starting at the point $\theta^k$, we take a step in the direction of steepest descent (that is, the negative gradient direction), which moves us to a new point $\theta^{k+1}$ where, for a sufficiently small step size $\eta$, the value of $J$ has been reduced. The paper's notation $\theta = \theta - \eta \nabla_\theta J(\theta)$ is programming-style assignment (overwrite $\theta$ with the right-hand side), so nothing cancels.
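The iteration can be sketched in a few lines of code. This is a minimal illustration, not the paper's implementation: the quadratic objective $J(\theta) = \|\theta - t\|^2$, the target $t$, the learning rate, and the iteration count are all illustrative choices.

```python
import numpy as np

def grad_J(theta, target):
    # Gradient of the stand-in objective J(theta) = ||theta - target||^2.
    return 2.0 * (theta - target)

target = np.array([3.0, -1.0])  # minimizer of J (illustrative)
theta = np.zeros(2)             # theta^0: the starting point
eta = 0.1                       # learning rate (step size)

for k in range(100):
    # The update rule: theta^{k+1} = theta^k - eta * grad J(theta^k).
    # Writing "theta = theta - ..." overwrites theta, matching the
    # paper's assignment-style notation.
    theta = theta - eta * grad_J(theta, target)

print(theta)  # close to [3.0, -1.0]
```

Each pass of the loop performs one step of the iteration; after enough steps, `theta` converges to the minimizer of this convex objective.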