The Mabinogion sheep problem

539 Views Asked by At

$\textbf{The Mabinogion sheep problem}$ By David Williams

There is a magical flock of sheep, some black, some white. We sacrifice poetry for precision in specifying its behaviour. At each of times 1,2,3,… a sheep (chosen randomly from the entire flock, independently of previous events) bleats; if this bleating sheep is white, one black sheep (if any remain) instantly becomes white; if the bleating sheep is black, then one white sheep (if any remain) instantly becomes black. No births or deaths occur.

The controlled system.
Suppose now that the system can be controlled in that just after time 0 and just after every magical transition time 1,2,…, any number of white sheep may be removed from the system. (White sheep may be removed on numerous occasions.) The object is to maximize the expected final number of black sheep.

Policy $\bf{A}$ : at each time of decision, do nothing if there are more black sheep than white sheep or if no blakc sheep remain; otherwise, immediately reduce the white population to one less than the black population.

We define the value function $V : Z^{+} × {Z} ^{+} → [0, ∞)$ so that $V (w, b)$ is the expected final number of black sheep under Policy $\bf{A}$ if there are $w$ white sheep and $b$ black sheep at the start. As a result, V has the following properties,

a1) $V(0,b)=b, V(w,0)=0$ .
a2) $V(w,b)=V(w-1,b)$ whenever $w\ge b>0$
a3) $V(w,b) = \frac{w}{w+b}V(w+1,b-1)+\frac{b}{w+b}V(w-1,b+1)$ whenever $0<w<b$

$\textbf{Question (a)}$ : Show that under the policy $\textbf{A}$, $V(W_n,B_n)$ is a martingale w.r.t. filtration $\mathcal{F}$, where $\{W_n,B_n\}$ is number of white and black sheep after $n_{th}$ time.

$\textbf{Question (b)}$ : Suppose $V(w,b) \ge \frac{w}{w+b}V(w+1,b-1)+\frac{b}{w+b}V(w-1,b+1)$ and $V(w,b) \ge V(w-1,b)$ whenever $w\ge0$ and $b \ge 0$.
Show that for any other policies, $V(W_n,B_n)$ is a super-martingale w.r.t. filtration $\mathcal{F}$.

Recently, I am studying martingale and its application. And I am very curious about this question in the textbook. This is a famous stochastic control problem by David Williams, called The Mabinogion sheep problem. The following is described in (Williams, David (1991), Probability with martingales, Cambridge Mathematical Textbooks, Cambridge University Press)

In the David Williams's book, he just mentioned about a1)-a3) three properties but not proved them.

$\textbf{My attempt}$:

$\textbf{The property a1)-a3) are what we have known, they are very easy to show.}$

$\textbf{For my proof to Question (a)}$

$E\{V(W_{n+1},B_{n+1})|\mathcal{F_n}\} = \frac{W_n}{W_n+B_n}V(W_{n}+1,B_{n}-1) + \frac{B_n}{W_n+B_n}V(W_{n}-1,B_{n}+1) =V(W_{n},B_{n})$.
Therefore, $V(W_{n},B_{n})$ is a martingale w.r.t. $\mathcal{F_n}$

$\textbf{For my proof to Question (b)}$

$E\{V(W_{n+1},B_{n+1})|\mathcal{F_n}\} = \frac{W_n}{W_n+B_n}V(W_{n}+1,B_{n}-1) + \frac{B_n}{W_n+B_n}V(W_{n}-1,B_{n}+1) < V(W_{n},B_{n})$.
Therefore, $V(W_{n},B_{n})$ is a super-martingale w.r.t. $\mathcal{F_n}$

Do I miss some points for proving (a) and (b)? Could you help me to check my answer?

$\textbf{Edit:}$ .

$\textbf{This question has not been answered for Question (a) and (b)}!$ .
$\textbf{Could I get some helps from these two question? Thank you so much!}$

1

There are 1 best solutions below

9
On

There isn't really anything to prove. The properties of $V$ come from the description of Policy A.

There's also no need to consider an arbitrary $n.$ For this kind of first-step analysis you only need to consider the case $n=0$ and use time-invariance. But I'll use $(W_n,B_n)$ to refer to the number of white and black sheep just after time $n,$ after a sheep has changed color if possible ($n>0$) but before any sheep are removed by the control policy.

a1 - immediate from the definition of $V$ as the expected final number of black sheep, for the obvious cases where all the sheep are black or all are white

a2 - by the Policy A rule "immediately reduce the white population to one less than the black population", a system with $(W_0,B_0)=(w,b)$ with $w\geq b>0$ has the same behavior as the system with $(W_0,B_0)=(w',b),$ for any $w'\geq b-1$

a3 - this is the expected value of $V(W_1,B_1)$ when the control rule "do nothing if there are more black sheep than white sheep" is used, using the distribution of $(W_1,B_1)$ as you state in the rough idea at the end of your question, with $a_n=0$