Improving the convergence of the gradient descent method


I am looking for methods to improve the gradient descent algorithm. In particular, I'm trying to minimise the sum of squared errors between some observed data and the output of my model (which is based on non-linear differential equations), using gradient descent to find a better fit. The problem is that the errors are very large because of the magnitudes of the inputs involved, which leads to both very slow convergence and overflow of the variables. So my questions are:
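For context, here is a stripped-down sketch of what I'm doing, with a made-up exponential model standing in for my actual ODE-based one (all names and numbers below are illustrative, not my real setup):

```python
import numpy as np

# Made-up "observed" data; in my real problem this comes from measurements.
t = np.linspace(0.0, 1.0, 50)
y_obs = 3.0 * np.exp(-2.0 * t)

def model(theta, t):
    # Stand-in for my ODE-based model: y = a * exp(-b * t)
    a, b = theta
    return a * np.exp(-b * t)

def sse(theta):
    # Sum of squared errors between model output and observations
    r = model(theta, t) - y_obs
    return np.sum(r ** 2)

def grad_sse(theta, eps=1e-6):
    # Central finite-difference gradient (in the real problem the
    # gradient comes from the model's sensitivities)
    g = np.zeros_like(theta)
    for i in range(len(theta)):
        d = np.zeros_like(theta)
        d[i] = eps
        g[i] = (sse(theta + d) - sse(theta - d)) / (2.0 * eps)
    return g

# Plain gradient descent with a fixed step size
theta = np.array([1.0, 1.0])
lr = 1e-2
for _ in range(2000):
    theta -= lr * grad_sse(theta)
```

With this toy problem the iteration recovers the true parameters (3, 2), but in my real problem the loss and its gradient are orders of magnitude larger, which forces a tiny step size.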

Is there any way to avoid dealing with such large numbers, such as scaling the data or taking logarithms?
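To make this concrete, these are the kinds of transformations I had in mind (the magnitudes here are made up for illustration):

```python
import numpy as np

# Made-up magnitudes; my real data comes from the ODE model.
y_obs = np.array([1.0e6, 5.0e6, 2.0e7, 8.0e7])

# (a) Rescale by a characteristic magnitude so residuals are O(1).
scale = np.max(np.abs(y_obs))
y_scaled = y_obs / scale

# (b) Average instead of summing, so the loss (and its gradient)
#     does not grow with the number of data points.
def mse(pred, obs):
    return np.mean((pred - obs) ** 2)

# (c) Fit in log space when the data are positive and span several
#     orders of magnitude; squared log-residuals penalise relative
#     rather than absolute error.
y_log = np.log(y_obs)
```

Would any of these change the minimiser in a way I should worry about, or only the conditioning of the problem?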

Are there other ways to get faster convergence from the algorithm?
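One thing I've experimented with is adding a momentum (heavy-ball) term. On a toy ill-conditioned quadratic it clearly beats plain gradient descent, but I don't know how well this carries over to my actual problem (everything below, including the step sizes, is a made-up stand-in):

```python
import numpy as np

# Toy quadratic with a 100:1 curvature ratio, mimicking the kind of
# ill-conditioning that makes plain gradient descent crawl.
def loss(x):
    return 0.5 * (100.0 * x[0] ** 2 + x[1] ** 2)

def grad(x):
    return np.array([100.0 * x[0], x[1]])

def descend(lr, beta, iters=100):
    """Heavy-ball gradient descent; beta=0 recovers plain GD."""
    x = np.array([1.0, 1.0])
    v = np.zeros(2)
    for _ in range(iters):
        v = beta * v + grad(x)  # velocity accumulates past gradients
        x = x - lr * v
    return x

x_plain = descend(lr=0.019, beta=0.0)  # near the stability limit 2/L
x_mom = descend(lr=0.03, beta=0.67)    # rough heavy-ball tuning
```

After 100 iterations the momentum run is many orders of magnitude closer to the minimum than plain descent. Is this the right direction, or should I be looking at line search or Gauss-Newton-type methods instead for a least-squares objective?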

Thanks for the help.