I am currently working on reinforcement learning, and I need the Bellman equation so that I can minimize the loss function computed for my neural network. When we calculate the loss, we take the Q-value q(s,a) produced by my neural network and subtract it from the optimal Q-value q*(s,a). I don't understand the difference between q* and q: if we already have the optimal Q-value, why do I even bother to compute q(s,a)? The same goes for Q-learning, where I can look in my Q-table to get max_a q(s',a) to update my table. I don't understand the difference between the two, because right now the way I get my q(s,a) is the same as the way I get q*(s,a). Help is really appreciated. I googled the whole weekend and couldn't find any solution.
2026-03-28 04:22:49.1774671769
Bellman Equation loss function optimal Q-Value
99 Views Asked by Bumbble Comm https://math.techqa.club/user/bumbble-comm/detail At
1
There is 1 best solution below
Okay, I just realized the mistake I made when I programmed it. It's actually quite obvious: the Q-value q(s,a) should be equal to the value we compute with the Bellman equation.
As you can see, q_optimal is calculated differently than q_now, so they are definitely not the same. I didn't get that at first.
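For anyone landing here with the same confusion, here is a minimal sketch of the distinction in the tabular case. It is not my original code. The Q-table, the transition, `gamma`, `alpha` and the helper `td_target` are all made-up assumptions for illustration. The point is that q_now is read directly from the current estimate for (s,a), while q_optimal is built from the observed reward plus the discounted best value of the next state, r + gamma * max_a' q(s',a'). Note that q_optimal here is only an estimate of q*(s,a) based on the current table, not the true optimal value, which is why the update has to be repeated.

```python
# Hypothetical tiny Q-table: Q[state][action]. All numbers are made up.
Q = {
    "s0": {"left": 1.0, "right": 2.0},
    "s1": {"left": 0.5, "right": 4.0},
}

gamma = 0.9  # discount factor (assumed value)
alpha = 0.5  # learning rate (assumed value)


def td_target(Q, reward, next_state, gamma, done=False):
    """Bellman target: r + gamma * max_a' Q(s', a'); just r if the episode ended."""
    if done:
        return reward
    return reward + gamma * max(Q[next_state].values())


# One observed transition (s, a, r, s'), also made up.
state, action, reward, next_state = "s0", "right", 1.0, "s1"

q_now = Q[state][action]                             # current estimate q(s, a)
q_optimal = td_target(Q, reward, next_state, gamma)  # estimate of q*(s, a)
loss = (q_optimal - q_now) ** 2                      # squared difference to minimize

# Tabular Q-learning update: move q(s, a) a step towards the target.
Q[state][action] = q_now + alpha * (q_optimal - q_now)
```

With a neural network the idea is the same, except that q_now comes from a forward pass on s and the update is a gradient step on the loss instead of a direct table write.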