Hidden Markov Model

225 Views Asked by At

I am reading "Bayesian Reasoning And Machine Learning" and I'm doing exercise 23.3 (a) on p.490.

Here's the exercise:

Consider a HMM with 3 states $(M=3)$ and $2$ output symbols, with a left-to-right state transition matrix

$A = \begin{pmatrix}0.5 & 0 & 0 \\ 0.3 & 0.6 & 0 \\ 0.2 & 0.4 & 1 \end{pmatrix}$

where $A_{i,j}=p(h_{t+1}=i|h_t=j)$, emission matrix $B_{i,j}=p(v_t=i|h_t=j)$

$B=\begin{pmatrix} 0.7 & 0.4 & 0.8 \\ 0.3 & 0.6 & 0.2 \end{pmatrix}$

and initial state probability vector $a=(0.9, 0.1,0)^T$. Given the observed symbol sequence is $v_{1:3}=(1,2,1)$ compute $p(v_{1:3})$.

1

There are 1 best solutions below

6
On BEST ANSWER

Let $\ a_{ij}, b_{jv}\ $ be the entries in row $\ i\ $ and colummn $\ j\ $ of $\ A^T\ $, and row $\ j\ $ and column $\ v\ $ of $\ B^T\ $, respectively, and for $\ v=1,2\ $, let $\ C(v)\ $ the $\ 3\times3\ $ matrix whose entry in row $\ i\ $ and column $\ j\ $ is $\ a_{ij}b_{jv}\ $, and $\ \alpha(v)\ $ the $\ 3\times1\ $ column vector whose $\ i^\text{th}\ $ entry is $ a_ib_{iv}\ $. Then the probability that $\ v_1,v_2,\dots v_m\ $ are the first $\ m\ $ observed outputs is \begin{align} p(v_1,v_2,\dots,v_m)&=\sum_{h_1=1}^3\sum_{h_2=1}^3\dots\sum_{h_m=1}^3a_{h_1}b_{h_1v_1}a_{h_1h_2}b_{h_2v_2}a_{h_2h_3}\dots a_{h_{m-1}h_m}b_{h_mv_m}\\ &=\alpha(v_1)^TC(v_2)C(v_3)\dots C(v_m)\mathbb{1}\ , \end{align} where $\ \mathbb{1}\ $ is the $\ 3\times1\ $ column vector all of whose entries are $\ 1\ $.

For $\ t=2,\dots, m\ $, the row vectors $\ \alpha_t=\alpha(v_1)^T\prod _\limits{i=2}^tC(v_i)\ $ can be computed iteratively by \begin{align} \alpha_2&=a(v_1)^TC(v_2)\\ \alpha_t&=\alpha_{t-1}C(v_t)\ , \end{align} each step of which can be performed by multiplying a row vector by one of the matrices $\ C(v)\ $. The required probability can then be obtained by computing the dot product $\ \alpha_m\mathbb{1}\ $. For your matrices $\ A\ $, and $\ B\ $, you have \begin{align} C(1)&=\pmatrix{0.5\times0.7&0.3\times0.4&0.2\times0.8\\ 0&0.6\times0.4&0.4\times0.8\\ 0&0&1\times0.8}\\ &=\pmatrix{0.35&0.12&0.16\\ 0&0.24&0.32\\0&0&0.8}\ ,\\ C(2)&=\pmatrix{0.5\times0.3&0.3\times0.6&0.2\times0.2\\ 0&0.6\times0.6&0.4\times0.2\\ 0&0&1\times0.2}\\ &=\pmatrix{0.15&0.18&0.04\\ 0&0.36&0.08\\0&0&0.2}\ ,\\ \alpha(1)^T&=\pmatrix{0.9\times0.7&0.1\times0.4&0}\\ &=\pmatrix{0.63&0.04&0}\ ,\text{ and}\\ \alpha(2)^T&=\pmatrix{0.9\times0.3&0.1\times0.6&0}\ . \end{align} The probability that the first $\ 3\ $ observed outputs are $\ v_1=1, v_2=2 $ and $\ v_3=1\ $ is therefore \begin{align} &\alpha(1)^TC(2)C(1)\mathbb{1}\\ &=\pmatrix{0.63&0.04&0}\pmatrix{0.15&0.18&0.04\\ 0&0.36&0.08\\0&0&0.2}\pmatrix{0.35&0.12&0.16\\ 0&0.24&0.32\\0&0&0.8}\pmatrix{1\\1\\1}\\ &=\pmatrix{0.0945&0.128&0.0284}\pmatrix{0.63\\0.56\\0.08}\\ &=0.153823\ . \end{align}