References on how the variance of gradient oracles affects step sizes


I heard the following claim in a seminar: "High-variance gradient oracles result in a smaller step size that is inversely proportional to the variance" (for stochastic gradient descent). However, after some Googling, I could not find a specific reference that demonstrates this point. Could someone please point me to some references? Many thanks!
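
For concreteness, here is a rough sketch of the kind of statement I have in mind. The setup is my own assumption and not something stated in the seminar: $f$ is $L$-smooth, and the oracle $g(x)$ is unbiased, $\mathbb{E}[g(x)] = \nabla f(x)$, with variance bounded by $\sigma^2$. All of the symbols ($L$, $\sigma^2$, $\eta$, $g$) are introduced here only for illustration. One SGD step $x^{+} = x - \eta\, g(x)$ then satisfies

$$
\mathbb{E}\left[f(x^{+})\right] \le f(x) - \eta \,\|\nabla f(x)\|^2 + \frac{L\eta^2}{2}\left(\|\nabla f(x)\|^2 + \sigma^2\right),
$$

and minimizing the right-hand side over $\eta$ gives

$$
\eta^{*} = \frac{\|\nabla f(x)\|^2}{L\left(\|\nabla f(x)\|^2 + \sigma^2\right)} \approx \frac{\|\nabla f(x)\|^2}{L\,\sigma^2} \quad \text{when } \sigma^2 \gg \|\nabla f(x)\|^2 .
$$

Under these assumptions the step size that minimizes the one-step bound does scale like $1/\sigma^2$ once the noise dominates the gradient, but I do not know whether this is the result the speaker meant, which is why I am looking for a proper reference.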