I've written a function of my own, and I'm trying to optimize it with gradient descent, but I don't know how to update each variable, since one variable may control an angle while another controls an area.
I've tried to optimize my parameters each step by:
vars = vars - (learning_rate * error * vars)
But it doesn't seem to work, since the magnitudes of the parameters differ. I've also tried:
vars = vars - (learning_rate * error * derivative_of_vars)
What should I change my update rule to? Should I take the partial derivative of the error function with respect to each parameter?
I think you are looking for:
vars = vars - (learning_rate * gradient_of_error(vars))
In the one-dimensional case, the gradient of the error function is simply its derivative.
Source (wiki)
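To make this concrete, here is a minimal sketch in Python with NumPy. The error function and its gradient are made-up examples (not from your code): two parameters with very different sensitivities, which is exactly the situation you describe. The gradient vector holds the partial derivative of the error with respect to each parameter, so each variable gets its own update size automatically.

```python
import numpy as np

# Hypothetical error function with two parameters of very different scale:
# error(x, y) = (x - 3)^2 + (10*y - 5)^2
def error(v):
    x, y = v
    return (x - 3.0) ** 2 + (10.0 * y - 5.0) ** 2

def gradient_of_error(v):
    # Partial derivative of the error with respect to each parameter.
    x, y = v
    return np.array([2.0 * (x - 3.0),           # d(error)/dx
                     20.0 * (10.0 * y - 5.0)])  # d(error)/dy

vars = np.array([0.0, 0.0])
learning_rate = 0.004

for _ in range(2000):
    vars = vars - learning_rate * gradient_of_error(vars)

print(vars)  # converges toward the minimum at x = 3, y = 0.5
```

Note that the learning rate has to be small enough for the steepest direction (here `y`, whose gradient is scaled by 20) to stay stable, which is why methods with per-parameter step sizes (e.g. Adagrad, Adam) are popular when parameter scales differ a lot.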