Gradient Descent Update Rule


I've written a function of my own, and I'm trying to optimize it with gradient descent, but I don't know how to update each variable, since one variable may control an angle while another controls an area.

I've tried to optimize my parameters each step by:

vars = vars - (learning_rate * error * vars)

But it doesn't seem to work, as the magnitudes of the parameters may differ. I've also tried:

vars = vars - (learning_rate * error * derivative_of_vars)

What should I change my update rule to? Should I take the derivative of the error function with respect to every parameter?


BEST ANSWER

I think you are looking for:

vars = vars - (learning_rate * gradient_of_error(vars))

In the one-dimensional case, the gradient of the error function is simply its derivative; in general, it is the vector of partial derivatives of the error with respect to each parameter, so every variable gets its own update direction regardless of what it controls.
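A minimal sketch of this update rule, using a finite-difference estimate of the gradient so it works for any differentiable error function. The error function below is a made-up placeholder (distance of the parameters from an arbitrary target); swap in your own:

```python
import numpy as np

def error(vars):
    # Hypothetical error function for illustration only:
    # squared distance of the parameters from a fixed target.
    target = np.array([0.5, 2.0])
    return np.sum((vars - target) ** 2)

def numerical_gradient(f, vars, eps=1e-6):
    # Central-difference estimate of the partial derivative of f
    # with respect to each parameter, one coordinate at a time.
    grad = np.zeros_like(vars)
    for i in range(len(vars)):
        step = np.zeros_like(vars)
        step[i] = eps
        grad[i] = (f(vars + step) - f(vars - step)) / (2 * eps)
    return grad

# Gradient descent: vars = vars - learning_rate * gradient_of_error(vars)
vars = np.array([0.0, 0.0])
learning_rate = 0.1
for _ in range(200):
    vars = vars - learning_rate * numerical_gradient(error, vars)
```

Note that the update uses the gradient of the error, not `error * vars` or `error * derivative_of_vars`: multiplying by the error value itself rescales the step arbitrarily and is why parameters of different magnitudes misbehave.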

Source (wiki)