How to calculate the expected value of a function of the 'history' of a Markov chain?


Suppose I have a system with two states, 0 and 1, and a transition probability matrix T, for example

[[ 0.8  0.2]
 [ 0.1  0.9]]

Define S(N) as the cumulative sum of the state indices visited after N transitions, given that the initial state is 0. I would like to calculate the expected value of S(N).

For example, enumerating the four length-2 paths from state 0 (0→0→0, 0→0→1, 0→1→0, 0→1→1), $S(2) = 0.8^{2}\times0 + 0.8\times0.2\times1 + 0.2\times0.1\times1 + 0.2\times0.9\times2 = 0.54$.

What would be a general formula/algorithm to calculate this? I could simply keep track of all the 'branches' of the tree, but their number increases exponentially with N and this doesn't seem like an efficient method.
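For concreteness, the branch enumeration I have in mind can be sketched as follows (variable names are my own illustration):

```python
import itertools
import numpy as np

# Transition probability matrix from the question
T = np.array([[0.8, 0.2],
              [0.1, 0.9]])
N = 2

# Enumerate all 2**N paths of visited states, weighting each
# path's cumulative sum by its probability.
expected = 0.0
for path in itertools.product(range(2), repeat=N):
    prob, state = 1.0, 0  # start in state 0
    for nxt in path:
        prob *= T[state, nxt]
        state = nxt
    expected += prob * sum(path)
print(expected)  # ≈ 0.54
```

This reproduces the hand calculation above, but the number of paths grows as $2^N$.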

I realized that this can be solved using the recursive relation

$$S(i,n) = \sum_{j} T_{ij}\left[i + S\left(j, n-1\right)\right]$$

where $S(i, n)$ is the expected value of the cumulative sum $S$ given an initial state $i$ after $n$ transitions. Note that with the base case $S(i, 0) = i$ used below, the initial state is itself counted in the sum; since the chain starts in state 0, this contributes nothing, and $S(0, n)$ is exactly the quantity asked for. The recursion can be evaluated with bottom-up dynamic programming. For example, in Python,

import numpy as np

# Transition probability matrix
T = np.array([[0.8, 0.2],
              [0.1, 0.9]])

N = 2
# S[i, n] holds the expected cumulative sum after n transitions from state i
S = np.zeros((2, N + 1))
S[0, 0] = 0  # base case: the sum starts at the initial state index
S[1, 0] = 1
for n in range(1, N + 1):
    for i in range(2):
        S[i, n] = sum(T[i, j] * (i + S[j, n - 1]) for j in range(2))
print(S)

which prints

[[ 0.    0.2   0.54]
 [ 1.    1.9   2.73]]

from which we can read off that $S(0, 2) = 0.54$. (The bottom-row value $S(1, 2) = 2.73$ includes the initial state index 1.) This can be checked with a Monte Carlo simulation:

def simulate_S(initial_state, N):
    """Sample one realization of the cumulative sum S(N)."""
    old_state = initial_state
    S = 0
    for n in range(N):
        # Draw the next state according to the current row of T
        new_state = np.random.choice(range(2), p=T[old_state, :])
        S += new_state
        old_state = new_state
    return S

N_simulations = 10000
S_estimate = np.mean([simulate_S(initial_state=0, N=2) for _ in range(N_simulations)])
print(S_estimate)

which (for one instance) prints 0.5426, in good agreement with the analytical result.
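As a further cross-check, the same number follows from linearity of expectation: $\mathbb{E}[S(N)] = \sum_{n=1}^{N} p_0 T^{n} s$, where $p_0$ is the initial distribution and $s = (0, 1)^{T}$ is the vector of state indices. A minimal sketch (function name is my own):

```python
import numpy as np

T = np.array([[0.8, 0.2],
              [0.1, 0.9]])

def expected_S(p0, T, N):
    """E[S(N)]: sum over steps of (distribution after n steps) dot (state indices)."""
    s = np.arange(T.shape[0])      # state index vector (0, 1)
    p = np.array(p0, dtype=float)  # current state distribution
    total = 0.0
    for _ in range(N):
        p = p @ T       # distribution after one more transition
        total += p @ s  # expected state index visited at this step
    return total

print(expected_S([1.0, 0.0], T, N=2))  # ≈ 0.54
```

This runs in $O(N)$ matrix-vector products, matching the dynamic-programming result without tracking per-state sums.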