From the book Reinforcement Learning: An Introduction page 108
In the final sum I can see where the $0.1$ in front and the $0.9^k$ inside the sum come from, but I can't see where the $2^k$ and the $2$ come from in the expression above it.
It comes from multiplying out the factors. For example, with $k=2$:
$$\frac 12 \cdot \frac 12 \cdot \frac 12 \left(\frac{1}{0.5}\frac{1}{0.5}\frac{1}{0.5}\right)^2 = 2^3 = 2^2 \cdot 2$$
And, then, in general,
$$\left(\frac 12 \right)^{k+1} \left[\left(\frac{1}{0.5}\right)^{k+1}\right]^2 = \frac 12 \left(\frac 12 \right)^k 2^{2k+2}= 2^k \cdot 2$$
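The general identity can be checked with exact rational arithmetic. This is only a small sketch: the helper name `episode_factor` is made up for illustration, and it assumes, as in the formula above, that index $k$ corresponds to $k+1$ factors of $\frac 12$ and $k+1$ factors of $\frac{1}{0.5}$.

```python
from fractions import Fraction


def episode_factor(k: int) -> Fraction:
    """Factor from the 1/2 and 1/0.5 terms for index k (hypothetical helper)."""
    steps = k + 1                              # number of 1/2 and 1/0.5 factors
    half_power = Fraction(1, 2) ** steps       # (1/2)^(k+1)
    ratio = (1 / Fraction(1, 2)) ** steps      # (1/0.5)^(k+1) = 2^(k+1)
    return half_power * ratio ** 2             # the ratio is squared in the sum


# Compare against the closed form 2^k * 2 for the first few indices.
for k in range(6):
    assert episode_factor(k) == 2 ** k * 2
    print(k, episode_factor(k))
```

Using `Fraction` instead of floats keeps the comparison exact, so the equality check is not affected by rounding.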
If it were up to me, I wouldn't skip steps.
The summation seems to be: $\sum_{k=1}^{\infty} (\frac 12)^k*(0.9)^{k-1}*(0.1)*([\frac 1{0.5}]^k)^2$
Now $\frac 1{0.5} = 2$ so that is
$\sum_{k=1}^{\infty} (\frac 12)^k*(0.9)^{k-1}*(0.1)*2^{2k}$
And $(\frac 12)^k*2^{2k} = 2^{2k-k} = 2^k$,
so that is
$\sum_{k=1}^{\infty} (0.9)^{k-1}*(0.1)*2^{k}$
Now we can put the $0.1$ constant in front
$0.1 \sum_{k=1}^{\infty}(0.9)^{k-1}*2^{k}$
Reindex to start from $k=0$ (replace $k$ with $k+1$):
$0.1\sum_{k=0}^{\infty}(0.9)^{k}*2^{k+1}=$
$0.1\sum_{k=0}^{\infty}(0.9)^{k}*2^{k}*2$.
And we can combine the $(0.9)^k*2^{k}$ to get $1.8^k$ and bring the constant $2$ in front.
$0.2\sum_{k=0}^{\infty}(1.8)^k$.
....
It's a bit strange what other people consider easy to do in one's head.
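The chain of steps above can be sanity-checked numerically. The sketch below is only an illustration: the function names are invented here, and it assumes the starting sum is exactly the one written at the top of this answer (indexed from $k=1$), with the simplified form indexed from $k=0$.

```python
import math


def original_term(k: int) -> float:
    """Term of the starting sum, k = 1, 2, ... (hypothetical helper)."""
    return 0.5 ** k * 0.9 ** (k - 1) * 0.1 * ((1 / 0.5) ** k) ** 2


def simplified_term(k: int) -> float:
    """Term of the final sum 0.2 * 1.8^k, k = 0, 1, ... (hypothetical helper)."""
    return 0.2 * 1.8 ** k


def partial_sum(n_terms: int) -> float:
    """Sum of the first n_terms terms of the simplified series."""
    return sum(simplified_term(k) for k in range(n_terms))


# After reindexing, original term k + 1 should match simplified term k.
for k in range(20):
    assert math.isclose(original_term(k + 1), simplified_term(k))

# The partial sums keep growing because the ratio 1.8 is greater than 1.
for n in (5, 10, 20, 40):
    print(n, partial_sum(n))
```

Since the common ratio $1.8$ exceeds $1$, the printed partial sums grow without bound, which matches the geometric series diverging.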
The $k$-th term in the sum corresponds to the length $k+1$ episode.
Note that in the length-$(k+1)$ episode the $\frac{1}{2}$ and $\frac{1}{0.5}$ factors each appear $k+1$ times, and the product of the $\frac{1}{0.5}$ factors is squared. Therefore, the $k$-th term in the sum contains the following multiplicative factor:
\begin{align} \left(\frac{1}{2}\right)^{k+1} \cdot \left(2^{k+1}\right)^2 &= \frac{1}{2^{k+1}} \cdot 2^{2(k+1)}\\ &= \frac{2^{2(k+1)}}{2^{k+1}}\\ &= 2^{2(k+1)-(k+1)}\\ &= 2^{k+1}\\ &= 2^k \cdot 2 \end{align}