How to solve the optimality equation? (Markov decision process)


I'm struggling with the problem attached below. I have done some similar questions, but in those I was given simple values for 'a' and 's'. If someone could help, that would be appreciated.

[image: problem statement]

Thank you.


There is 1 best solution below


To start, consider $t=2$, where the terminal value is $V_3\equiv 0$:
\begin{align}
V_2(s) &=\min_{a\in A} \{c(s,a)+E[V_3(Y)|s,a]\}\\
&=\min_{a\in A} \{a^2+s^2+E[0]\}\\
&=\min_{a\in A} \{a^2+s^2\}
=0^2+s^2
=s^2,
\end{align}
with minimizer $a_3^*(s)=0$.

Next, using $Y=s+a+\xi$ with $E[\xi]=0$ and $E[\xi^2]=1$,
\begin{align}
V_1(s) &=\min_{a\in A} \{c(s,a)+E[V_2(Y)|s,a]\}\\
&=\min_{a\in A} \{a^2+s^2+E[V_2(s+a+\xi)]\}\\
&=\min_{a\in A} \{a^2+s^2+E[(s+a+\xi)^2]\}\\
&=\min_{a\in A} \{a^2+s^2+E[(s+a)^2+2(s+a)\xi+\xi^2]\}\\
&=\min_{a\in A} \{a^2+s^2+(s+a)^2+2(s+a)E[\xi]+E[\xi^2]\}\\
&=\min_{a\in A} \{a^2+s^2+(s+a)^2+2(s+a)\cdot 0+1\}\\
&=\min_{a\in A} \{2a^2+2as+2s^2+1\}
=3s^2/2+1,
\end{align}
with minimizer $a_2^*(s)=-s/2$, obtained by setting the derivative $4a+2s$ to zero.
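The last minimization is over a convex quadratic in $a$, so it can be checked with exact arithmetic. The sketch below is only an illustration: the helper name `minimize_quadratic` is hypothetical, and it assumes the action set $A$ is all of $\mathbb{R}$ (no constraints on $a$), which the problem statement may or may not allow.

```python
from fractions import Fraction


def minimize_quadratic(A, B, C, D):
    """Minimize A*a^2 + B*a*s + C*s^2 + D over real a (assumes A > 0).

    Returns (gain, coeff, const) such that the minimizer is a = gain*s
    and the minimum value is coeff*s^2 + const.
    Assumption: a is unconstrained, i.e. the action set is the real line.
    """
    A, B, C, D = (Fraction(x) for x in (A, B, C, D))
    if A <= 0:
        raise ValueError("need A > 0 for a unique minimizer")
    gain = -B / (2 * A)          # first-order condition: 2*A*a + B*s = 0
    coeff = C - B * B / (4 * A)  # s^2 coefficient after substituting a = gain*s
    return gain, coeff, D


# Objective inside V_1: 2a^2 + 2as + 2s^2 + 1
gain, coeff, const = minimize_quadratic(2, 2, 2, 1)
print(gain, coeff, const)  # -1/2 3/2 1
```

The output reproduces the minimizer $-s/2$ and the value $3s^2/2+1$ from the derivation.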

Finally, again using $E[\xi]=0$ and $E[\xi^2]=1$,
\begin{align}
V_0(s) &=\min_{a\in A} \{c(s,a)+E[V_1(Y)|s,a]\}\\
&=\min_{a\in A} \{a^2+s^2+E[V_1(s+a+\xi)]\}\\
&=\min_{a\in A} \{a^2+s^2+E[3(s+a+\xi)^2/2+1]\}\\
&=\min_{a\in A} \{a^2+s^2+3E[(s+a)^2+2(s+a)\xi+\xi^2]/2+1\}\\
&=\min_{a\in A} \{a^2+s^2+3(s+a)^2/2+3(s+a)E[\xi]+3E[\xi^2]/2+1\}\\
&=\min_{a\in A} \{a^2+s^2+3(s+a)^2/2+3(s+a)\cdot 0+3\cdot 1/2+1\}\\
&=\min_{a\in A} \{5 a^2/2 + 3 a s + 5 s^2/2 + 5/2\}
=8s^2/5+5/2,
\end{align}
with minimizer $a_1^*(s)=-3s/5$, obtained by setting the derivative $5a+3s$ to zero.
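All three steps follow one pattern: if $V_{t+1}(s)=p\,s^2+q$, then minimizing $a^2+s^2+p(s+a)^2+p\,E[\xi^2]+q$ over $a$ gives $a=-\tfrac{p}{1+p}s$ and $V_t(s)=\bigl(1+\tfrac{p}{1+p}\bigr)s^2+q+p\,E[\xi^2]$. The following is a minimal sketch of that recursion, not a general MDP solver. The function name `backward_induction` and its parameters are hypothetical, and it assumes $c(s,a)=a^2+s^2$, $Y=s+a+\xi$ with $E[\xi]=0$, $V_3\equiv 0$, and an unconstrained action set $A=\mathbb{R}$. Results are keyed by the index $t$ of $V_t$, not by the subscript used for the minimizers above.

```python
from fractions import Fraction


def backward_induction(horizon=3, noise_second_moment=1):
    """Return {t: (gain, p, q)} with V_t(s) = p*s^2 + q and minimizer a = gain*s.

    Assumptions (not checked here): cost a^2 + s^2, next state s + a + xi,
    E[xi] = 0, E[xi^2] = noise_second_moment, unconstrained real actions.
    """
    p, q = Fraction(0), Fraction(0)  # terminal condition V_horizon(s) = 0
    m2 = Fraction(noise_second_moment)
    stages = {}
    for t in range(horizon - 1, -1, -1):
        gain = -p / (1 + p)                     # from 2a + 2p(s + a) = 0
        p, q = 1 + p / (1 + p), q + p * m2      # right-hand side uses the old p
        stages[t] = (gain, p, q)
    return stages


for t, (gain, p, q) in sorted(backward_induction().items(), reverse=True):
    print(f"V_{t}(s) = {p}*s^2 + {q}, minimizer a = {gain}*s")
# V_2(s) = 1*s^2 + 0, minimizer a = 0*s
# V_1(s) = 3/2*s^2 + 1, minimizer a = -1/2*s
# V_0(s) = 8/5*s^2 + 5/2, minimizer a = -3/5*s
```

Using `Fraction` keeps the coefficients exact, so they can be compared directly with the hand computation.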