Why do bellman error gradients become big?


I read these notes on deep Q-learning (DQN), which said that Bellman error gradients can become pretty big. In the lecture video for that slide, the speaker said that we take the gradient of a squared Bellman error, and the derivative of a quadratic can become a large value. I'm a little lost as to why that is the case. DQN is basically performing regression, and we don't usually worry about regression gradients being large.

Here's the typical DQN loss function:

$$L(\theta) = \frac{1}{2}\Big(R_{t+1} + \gamma[\![ \max_{a} q_{\theta}(S_{t+1}, a)]\!] - q_{\theta}(S_{t}, A_t)\Big)^2$$

Here, $R_{t+1}$, $S_{t}$, and $A_t$ denote the reward, state, and action at the current time step, $S_{t+1}$ denotes the next state, and $q_{\theta}$ the parameterized q-values. Also, we do not take the gradient of the term within $[\![\,]\!]$, since this is a semi-gradient method rather than a true gradient method.
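To make the question concrete, here is a minimal sketch of what I understand the semi-gradient to be, assuming a linear q-function $q_\theta(s,a) = \theta^\top \phi(s,a)$ and made-up numbers (the feature vector `phi_sa` and the reward/target values are hypothetical, just for illustration). The gradient's magnitude scales linearly with the TD error, which is where I assume the "large gradient" concern comes from:

```python
def q(theta, phi):
    """Linear q-value: dot product of weights and features."""
    return sum(t * f for t, f in zip(theta, phi))

def semi_gradient(theta, phi_sa, reward, gamma, max_q_next):
    """Semi-gradient of 0.5 * (target - q(s,a))^2 w.r.t. theta.

    The target (reward + gamma * max_q_next) is treated as a constant,
    i.e. no gradient flows through the max term, matching the loss above.
    """
    td_error = reward + gamma * max_q_next - q(theta, phi_sa)
    # d/dtheta [ 0.5 * (target - theta . phi)^2 ] = -(td_error) * phi
    return [-td_error * f for f in phi_sa]

theta = [0.0, 0.0]          # untrained weights: q is zero everywhere
phi_sa = [1.0, 2.0]         # hypothetical features for (s, a)

# Early in training q(s,a) is near zero, so a large target makes the
# TD error (and hence every gradient component) large.
g_big = semi_gradient(theta, phi_sa, reward=100.0, gamma=0.99, max_q_next=50.0)
g_small = semi_gradient(theta, phi_sa, reward=1.0, gamma=0.99, max_q_next=0.5)
print(g_big)    # TD error is 149.5, so components are -149.5 * phi
print(g_small)  # TD error is 1.495, gradient is 100x smaller
```

So the gradient is (TD error) × (feature/activation gradient), and nothing in the squared loss bounds the TD error itself, unlike, say, a regression problem with normalized targets.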