How to calculate the probability of observing a sequence using a hidden Markov model


So a hidden Markov model consists of hidden states $H_i \in \{1,2,...,n\}, i \in \{1,...,\infty\}$, observable states $O_j \in \{1,...,p\}, j \in \{1,...,\infty \}$, transition probabilities $P(H_i=x|H_{i-1}=y), \forall i\in\{2,...,\infty\}, \forall x,y \in \{1,...,n\}$, and emission probabilities $P(O_j=k|H_j=m), \forall k\in\{1,...,p\}, \forall m \in \{1,...,n\}$.

The probability of observing a sequence $x_1,...,x_k$ is $P(x_1,...,x_k)=\sum_{i_1}P(O_1=x_1|H_1=i_1)P(H_1=i_1)\sum_{i_2}P(O_2=x_2|H_2=i_2)P(H_2=i_2|H_1=i_1)...\sum_{i_k}P(O_k=x_k|H_k=i_k)P(H_k=i_k|H_{k-1}=i_{k-1})$

Assuming that we have the transition matrix $T_{i,j}=P(H_k=j|H_{k-1}=i)$, emission probability matrix $O_{i,j}=P(O=i|H=j)$, and initial distribution $\pi_i=P(H_1=i)$, we can rewrite the above as:

$P(x_1,...,x_k)=\sum_{i_1}\pi_{i_1}O_{x_1,i_1}\sum_{i_2}T_{i_1,i_2}O_{x_2,i_2}...\sum_{i_k}T_{i_{k-1},i_k}O_{x_k,i_k}$.
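As a sanity check, this nested sum can also be evaluated directly by enumerating every hidden-state path. A minimal sketch in Python/NumPy, using a made-up two-state, three-symbol model (all numbers below are illustrative, not from the question):

```python
import itertools
import numpy as np

# Hypothetical 2-hidden-state, 3-symbol HMM (all values are made up).
T = np.array([[0.7, 0.3],
              [0.4, 0.6]])          # T[i, j] = P(H_k = j | H_{k-1} = i)
O = np.array([[0.5, 0.1],
              [0.4, 0.3],
              [0.1, 0.6]])          # O[x, j] = P(O_k = x | H_k = j)
pi = np.array([0.6, 0.4])           # pi[i]   = P(H_1 = i)

def prob_brute_force(obs):
    """Sum P(observations, hidden path) over all hidden-state paths."""
    n = T.shape[0]
    total = 0.0
    for path in itertools.product(range(n), repeat=len(obs)):
        p = pi[path[0]] * O[obs[0], path[0]]
        for t in range(1, len(obs)):
            p *= T[path[t - 1], path[t]] * O[obs[t], path[t]]
        total += p
    return total
```

This is exponential in the sequence length ($n^k$ paths), which is exactly why one wants the matrix-product form asked about below.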

Now how do we get this as matrix product?

There are 2 answers below.

BEST ANSWER

If $A(x_2),A(x_3),\dots,A(x_k)$ is an arbitrary sequence of square matrices of the same dimensions, $b(x_1)$ and $\mathbb{1}$ are column vectors of the same dimension as the rows and columns of the $A(x_i)$, with all entries of $\mathbb{1}$ being $1$, and $p=b(x_1)^TA(x_2)A(x_3)\dots A(x_k)\mathbb{1}$, then $$ p=\sum_{i_1}b_{i_1}(x_1) \sum_{i_2} A_{i_1i_2}(x_2) \sum_{i_3}A_{i_2i_3}(x_3)\dots \sum_{i_k}A_{i_{k-1}i_k}(x_k)\ . $$ Comparing this with your expression for $P(x_1, x_2, \dots,x_k)$, we see that if we put $b_i(x_1)=\pi_iO_{x_1,i}$ and $A_{ij}(x_n)=T_{ij} O_{x_n,j}$, then we have $$ P(x_1, x_2, \dots,x_k)= b(x_1)^TA(x_2)A(x_3)\dots A(x_k) \mathbb{1}\ . $$
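A minimal sketch of this factorization in Python/NumPy, with an illustrative two-state, three-symbol model (all numbers are assumptions). Note that $A(x)$ is just $T$ with column $j$ scaled by $O_{x,j}$, which NumPy broadcasting expresses as `T * O[x]`:

```python
import numpy as np

# Hypothetical 2-hidden-state, 3-symbol HMM (all values are made up).
T = np.array([[0.7, 0.3],
              [0.4, 0.6]])          # T[i, j] = P(H_k = j | H_{k-1} = i)
O = np.array([[0.5, 0.1],
              [0.4, 0.3],
              [0.1, 0.6]])          # O[x, j] = P(O = x | H = j)
pi = np.array([0.6, 0.4])           # pi[i]   = P(H_1 = i)

def prob_matrix(obs):
    """P(x_1,...,x_k) = b(x_1)^T A(x_2) ... A(x_k) 1, where
    b_i(x_1) = pi_i * O[x_1, i]  and  A_ij(x_n) = T[i, j] * O[x_n, j]."""
    v = pi * O[obs[0]]              # b(x_1)
    for x in obs[1:]:
        v = v @ (T * O[x])          # multiply by A(x): column j of T scaled by O[x, j]
    return v.sum()                  # final product with the all-ones vector
```

In contrast to summing over all hidden paths, this costs only $O(kn^2)$, and it is exactly the forward algorithm written as a chain of matrix products.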

ANSWER

I will not give the proof (it is awful in any notation), but I'll introduce the "standard" notation and the expression.

Let $\{X_t\}$ be the observed chain and $\{C_t\}$ the hidden $m$-states chain. Define $p_{i}(x)=P(X_t=x|C_t=i)$ (the chain is homogeneous in $t$). The diagonal matrix $P(x)$ is defined as $$ P(x)=\text{diag}(p_1(x),...,p_m(x)). $$ Let $\Gamma$ be the $m\times m$ transition probability matrix of $\{C_t\}$. If $\delta$ is the initial probability row vector of $\{C_t\}$, then $$ P(X_1=x_1,...,X_T=x_T)=\delta P(x_1)\Gamma P(x_2)\Gamma P(x_3)\cdots \Gamma P(x_T)e^{\top}, $$ where $e$ is the row vector of ones.
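A minimal sketch of this expression in Python/NumPy (the two-state, three-symbol model below is made up for illustration):

```python
import numpy as np

# Illustrative 2-state model in the book's notation (all numbers are made up).
Gamma = np.array([[0.7, 0.3],
                  [0.4, 0.6]])       # transition matrix of the hidden chain {C_t}
delta = np.array([0.6, 0.4])         # initial distribution, as a row vector
# p[x] = (p_1(x), ..., p_m(x)) for 3 observable symbols
p = np.array([[0.5, 0.1],
              [0.4, 0.3],
              [0.1, 0.6]])

def likelihood(obs):
    """delta P(x_1) Gamma P(x_2) ... Gamma P(x_T) e^T."""
    v = delta @ np.diag(p[obs[0]])       # delta P(x_1)
    for x in obs[1:]:
        v = v @ Gamma @ np.diag(p[x])    # right-multiply by Gamma P(x_t)
    return v @ np.ones(len(delta))       # final product with e^T
```

This is the same computation as the accepted answer's $b(x_1)^T A(x_2)\cdots A(x_k)\mathbb{1}$, just with the emission probabilities factored out into diagonal matrices.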

I took this from the book "Hidden Markov Models for Time Series", 2nd ed., by Zucchini et al. (Proposition 1).