Given that a policy is a function from state-action pairs to probabilities, the set of policies for an MDP forms a poset (the partial order is induced by the value function of each policy). Why should there be a maximal element in that poset? Are all the maximal elements the same?
2026-03-25 04:44:02
A doubt on Markov decision processes
149 views. Asked by Bumbble Comm (https://math.techqa.club/user/bumbble-comm/detail)
There is 1 best solution below
I agree that the existence of a maximal element isn't obvious. Indeed, a form of your first question (i.e. the existence of an optimal policy) has occupied researchers in MDPs since the field got its start in the mid-20th century.
It was shown early on (by Ronald Howard in 1960 and David Blackwell in 1962) that for discounted and average-cost MDPs with finite state and action sets, a greatest element always exists and can be constructed in a finite number of steps - see e.g. Chapters 6, 8, and 9 in the book Markov Decision Processes: Discrete Stochastic Dynamic Programming by Martin Puterman.
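To make "constructed in a finite number of steps" concrete, here is a minimal sketch of policy iteration for a finite discounted MDP. The two-state, two-action MDP (the arrays `P` and `R`, and `gamma = 0.9`) is a made-up example, not taken from the books cited above, and the function names are only illustrative. The brute-force loop at the end checks that the returned policy dominates every deterministic stationary policy in every state, which is what being a greatest element means in this small case.

```python
import itertools
import numpy as np


def evaluate(P, R, gamma, policy):
    """Value of a deterministic stationary policy: solve (I - gamma * P_pi) v = r_pi."""
    n_states = R.shape[0]
    states = np.arange(n_states)
    P_pi = P[policy, states, :]   # row s is P[policy[s], s, :]
    r_pi = R[states, policy]      # reward for the action chosen in each state
    return np.linalg.solve(np.eye(n_states) - gamma * P_pi, r_pi)


def policy_iteration(P, R, gamma):
    """P[a, s, t] = Pr(next state t | state s, action a); R[s, a] = expected reward."""
    n_states = R.shape[0]
    states = np.arange(n_states)
    policy = np.zeros(n_states, dtype=int)  # arbitrary starting policy
    while True:
        v = evaluate(P, R, gamma, policy)
        # One-step lookahead values q[s, a] under the current value function.
        q = R + gamma * np.einsum("ast,t->sa", P, v)
        best = q.argmax(axis=1)
        # Switch actions only on strict improvement, so ties cannot cause cycling.
        improved = q[states, best] > q[states, policy] + 1e-12
        new_policy = np.where(improved, best, policy)
        if np.array_equal(new_policy, policy):
            return policy, v
        policy = new_policy


# Hypothetical MDP used only for illustration.
P = np.array([
    [[0.9, 0.1], [0.2, 0.8]],   # transitions under action 0
    [[0.1, 0.9], [0.7, 0.3]],   # transitions under action 1
])
R = np.array([[1.0, 0.0],
              [0.0, 2.0]])
gamma = 0.9

policy_star, v_star = policy_iteration(P, R, gamma)

# Brute force: the result should dominate all deterministic stationary policies.
for pi in itertools.product(range(2), repeat=2):
    assert np.all(v_star >= evaluate(P, R, gamma, np.array(pi)) - 1e-9)
```

The loop terminates because each iteration strictly improves the value function and there are only finitely many deterministic stationary policies.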
But if either the state set or the action sets are infinite, a maximal element might not exist without some additional assumptions; see, e.g., Section 6.6 in Applied Probability Models with Optimization Applications by Sheldon Ross.
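A simple illustration of how existence can fail with infinitely many actions (this is a standard-style toy example, not necessarily the one in Ross's book): take a single state $s$, actions $a_1, a_2, \dots$, rewards $r(s, a_n) = 1 - 1/n$, and a discount factor $\gamma \in [0, 1)$, all chosen here purely for illustration.

```latex
% Value of the stationary policy that always plays a_n:
V^{(n)}(s) \;=\; \sum_{t=0}^{\infty} \gamma^{t}\left(1 - \frac{1}{n}\right)
          \;=\; \frac{1 - 1/n}{1 - \gamma},
\qquad
\sup_{n} V^{(n)}(s) \;=\; \frac{1}{1 - \gamma}.
```

Every policy, randomized or not, earns an expected reward strictly below $1$ at each step, so no policy attains the supremum. Since there is only one state, the policies are totally ordered by value, and for any policy another one does strictly better; hence there is no maximal element at all.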
Regarding your second question, I'm not aware of any examples where there is no greatest element but more than one maximal element. It might be possible to construct one using one of the examples of MDPs with no optimal policy.
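For the case where a greatest element does exist, the second question has a short answer, sketched below. The notation ($\succeq$, $V^{\pi}$, $\pi^{*}$) is introduced here for illustration and assumes the usual state-wise ordering by value function.

```latex
% Ordering on policies (state-wise comparison of value functions):
\pi \succeq \pi' \iff V^{\pi}(s) \ge V^{\pi'}(s) \ \text{ for all states } s.

% Greatest element: \pi^{*} \succeq \pi for every policy \pi.
% Maximal element:  there is no \pi' with \pi' \succeq \pi and V^{\pi'} \ne V^{\pi}.

% If \pi^{*} is greatest and \pi is maximal, then
\pi^{*} \succeq \pi
\quad\text{and}\quad
\neg\bigl(V^{\pi^{*}} \ne V^{\pi}\bigr)
\;\Longrightarrow\;
V^{\pi} = V^{\pi^{*}}.
```

So whenever a greatest element exists, all maximal policies share the same value function, although they need not be the same policy: two different actions can tie in some state. Strictly speaking, this makes the ordering a partial order on value functions and only a preorder on policies.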