Math behind various optimizers in deep learning


I am looking to learn the math behind the various optimizers used in neural networks, such as Adam, SGD, etc. Through this, I want to understand what makes one optimizer better than another for a particular problem. Is anyone aware of resources that explain the math behind the various optimizers in deep learning?


1 answer below

Accepted answer:

These algorithms are based on linear algebra, multivariable calculus, and probability theory (when the algorithms are stochastic). Rather than attempting to learn generic results from those branches first, I would recommend starting with reviews of different optimisation algorithms, such as 1 or the more in-depth 2, and referring back to Khan Academy/Wikipedia whenever some of the mathematical formalism is unclear.
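To make the update rules concrete, here is a minimal sketch comparing plain gradient descent with Adam on a toy one-dimensional quadratic. The objective, hyperparameters, and iteration counts are my own illustrative choices, not from any of the cited reviews; the Adam update follows the standard formulation with bias correction.

```python
import math

def grad(x):
    # Gradient of the toy objective f(x) = (x - 3)^2, minimised at x = 3.
    return 2.0 * (x - 3.0)

# --- plain gradient descent (SGD without the "stochastic" part) ---
x = 0.0
lr = 0.1
for _ in range(100):
    x -= lr * grad(x)                  # step against the gradient
x_gd = x

# --- Adam: adaptive step sizes from running moment estimates ---
x, m, v = 0.0, 0.0, 0.0
lr, b1, b2, eps = 0.01, 0.9, 0.999, 1e-8
for t in range(1, 5001):
    g = grad(x)
    m = b1 * m + (1 - b1) * g          # first-moment (mean) estimate
    v = b2 * v + (1 - b2) * g * g      # second-moment (uncentered variance) estimate
    m_hat = m / (1 - b1 ** t)          # bias correction for the zero-initialised moments
    v_hat = v / (1 - b2 ** t)
    x -= lr * m_hat / (math.sqrt(v_hat) + eps)
x_adam = x

print(x_gd, x_adam)  # both end up close to the minimum at 3
```

The contrast already hints at why one optimizer can suit a problem better than another: gradient descent's step shrinks with the gradient, while Adam normalises by the gradient's recent magnitude, which helps on badly scaled problems but can oscillate near a minimum.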

Once you have gained an overview and some intuition for a few basic algorithms, you should start reading the relevant literature on optimisation in deep learning, cf. 3, 4 for reviews, and then follow the relevant references from those articles.