I am trying to follow an example in which dynamic programming (DP) is applied to a stock option. I'm familiar with option theory but I'm new to DP.
$F_s(x)$ is the value of the American call option with $s$ days to go, $x$ is the stock price and $p$ is the strike price, so at maturity:
$F_0(x)=\max\{x-p,0\}$
and the DP equation is:
$F_s(x) = \max\{x-p, E[F_{s-1}(x + \epsilon)] \}$, $s=1,2,\ldots$
We are asked to show that:
1.) $F_s(x)$ is non-decreasing in $s$
2.) $F_s(x) - x$ is non-increasing in $x$
3.) $F_s(x)$ is continuous in $x$, and deduce that there is a non-decreasing sequence $\{a_s\}$ such that the optimal policy is to exercise the option the first time that $x \geq a_s$.
The first part is easy to prove and is in line with option theory. Since $F_0 \geq 0$, the expectation below is non-negative, so:
$F_1(x)=\max\{ x-p, E[F_0(x+\epsilon)]\} \geq \max\{ x-p, 0\} = F_0(x)$
Then, proceeding inductively and assuming $F_{s-1} \geq F_{s-2}$:
$F_s(x)=\max\{ x-p, E[F_{s-1}(x+\epsilon)]\} \geq \max\{ x-p,E[F_{s-2}(x+\epsilon)]\} = F_{s-1}(x) $
This shows that $F_s(x)$ is non-decreasing in $s$.
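As a numerical sanity check of part 1, here is a minimal sketch of the recursion. The strike `P = 10`, the integer price lattice and the three-point distribution of $\epsilon$ in `EPS` are my own illustrative assumptions, not part of the example I am following.

```python
from functools import lru_cache

P = 10  # strike price p (assumed value, for illustration only)
# Assumed distribution of the daily move epsilon: (value, probability) pairs.
EPS = ((-1, 0.5), (0, 0.25), (1, 0.25))


@lru_cache(maxsize=None)
def F(s, x):
    """Value F_s(x) with s days to go, for an integer stock price x."""
    if s == 0:
        return float(max(x - P, 0))
    # Continuation value E[F_{s-1}(x + epsilon)].
    cont = sum(q * F(s - 1, x + e) for e, q in EPS)
    # Either exercise now (x - p) or hold for another day.
    return max(float(x - P), cont)


# Part 1: check F_s(x) >= F_{s-1}(x) on a small range of s and x.
nondecreasing_in_s = all(
    F(s, x) >= F(s - 1, x) for s in range(1, 11) for x in range(0, 31)
)
print(nondecreasing_in_s)
```

This only checks the claim for one assumed distribution and a finite range of prices, so it illustrates the inductive argument rather than replacing it.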
The next step is to proceed inductively again for part 2:
$\underbrace{(F_s(x) - x)} = \max\{-p, \underbrace{E[F_{s-1}(x+\epsilon) - (x+\epsilon)]} + E[\epsilon]\}$
The inductive proof then proceeds as follows: the left-hand underbraced term inherits the non-increasing character of the right-hand underbraced term.
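Spelled out, the identity appears to come from subtracting $x$ inside the max and then adding and subtracting $\epsilon$ inside the expectation. This is my own reconstruction, assuming $\epsilon$ has a finite mean and its distribution does not depend on $x$:

$$
\begin{aligned}
F_s(x) - x &= \max\{x-p,\ E[F_{s-1}(x+\epsilon)]\} - x \\
&= \max\{-p,\ E[F_{s-1}(x+\epsilon)] - x\} \\
&= \max\{-p,\ E[F_{s-1}(x+\epsilon) - (x+\epsilon)] + E[\epsilon]\}
\end{aligned}
$$

For the base case, $F_0(x) - x = \max\{-p, -x\}$, which is non-increasing in $x$. If $F_{s-1}(y) - y$ is non-increasing in $y$, then for each fixed value of $\epsilon$ the quantity $F_{s-1}(x+\epsilon) - (x+\epsilon)$ is non-increasing in $x$. Taking the expectation, adding the constant $E[\epsilon]$ and taking the max with the constant $-p$ all preserve that property.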
I am really confused by this part.
1a.) First off, I don't think I understand the mathematics of the proof. My take is that if you take expectations you get:
$\underbrace{(F_s(x) - x)} = \max\{-p, \underbrace{[F_{s-1}(x) - x]} + 0\}$
The rhs will be less than the lhs by part 1 (but this is due to the function being non-decreasing in $s$), and as you are subtracting $x$ from it, it will get smaller with $x$. Is this all that the proof is saying?
1b.) I know from option theory that $F_s(x)$ is increasing in $x$: it continues to increase before plateauing at a value equal to $x$. $F_s(x) - x$ is equivalent to being long the call and short the stock, and from option theory I know that its maximum value is zero when $x$ is very high and that it is negative otherwise. However, this would mean that $F_s(x) - x$ is increasing in $x$. Where have I gone wrong in my logic? Moreover, I fail to see what this portfolio has to do with the behavior of an American call in isolation. Can someone please explain the connection: how does the behavior of $F_s(x) - x$ relate to the behavior of $F_s(x)$ as $x$ varies?
2.) To prove the policy, the proof then states that "from (ii) and (iii) (although they never show how to prove (iii)) and the fact that $F_s(x) \geq x-p$ it follows that there exists an $a_s$ such that $F_s(x) > x-p$ if $x < a_s$ and $F_s(x) = x-p$ if $x \geq a_s$. It follows from part 1 that $a_s$ is non-decreasing in $s$. The constant $a_s$ is the smallest $x$ for which $F_s(x) = x-p$."
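To see what the quoted statement looks like in numbers, here is a small sketch that computes $a_s$ as the smallest lattice price at which $F_s(x) = x - p$, and checks (ii) along the way. As before, `P = 10`, the integer price lattice, the search bound `x_max` and the distribution `EPS` (which has a negative mean) are assumptions I made up for illustration.

```python
from functools import lru_cache

P = 10  # strike price p (assumed value, for illustration only)
# Assumed law of epsilon as (value, probability) pairs; its mean is -0.25.
EPS = ((-1, 0.5), (0, 0.25), (1, 0.25))


@lru_cache(maxsize=None)
def F(s, x):
    """Value F_s(x) with s days to go, for an integer stock price x."""
    if s == 0:
        return float(max(x - P, 0))
    cont = sum(q * F(s - 1, x + e) for e, q in EPS)
    return max(float(x - P), cont)


def threshold(s, x_max=200):
    """Smallest integer price x with F_s(x) == x - p, or None if not found."""
    # For x < p we have x - p < 0 <= F_s(x), so the search can start at p.
    for x in range(P, x_max + 1):
        if F(s, x) == x - P:
            return x
    return None


thresholds = [threshold(s) for s in range(0, 11)]

# (ii): F_s(x) - x should not increase when x increases by one lattice step.
gap_nonincreasing = all(
    F(s, x + 1) - (x + 1) <= F(s, x) - x
    for s in range(0, 11)
    for x in range(0, 41)
)
print(thresholds, gap_nonincreasing)
```

The exact equality test in `threshold` relies on the assumed probabilities being exactly representable in floating point; with other probabilities a small tolerance would be safer.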
Q2a.) This optimal policy seems odd to me: surely if the price declines after conversion you are worse off than you would have been with the option? Indeed, standard option theory states that it is never optimal to exercise an American call option prior to expiry.
Q2b.) Maybe it is optimal in the DP framework, but I'm not even clear on why it would be optimal there.
Thanks
Baz