I am trying to understand the derivation of the Viterbi algorithm for hidden Markov models. I understand that the motivation is to find the maximum probability path estimate, i.e.,
\begin{equation} \text{Choose } (\hat{X}_{k})_{k\le n}, \text{ where each } \hat{X}_{k} \text{ is a function of the observations } Y_{0:n}, \text{ such that } \mathbf{P}[X_{k}=\hat{X}_{k} \text{ for all } k\le n] \text{ is maximized}. \end{equation}
We can use this lemma:
Let $H: \mathbb{R}\to [0,\infty)$ be a loss function, $X$ be a real-valued random variable such that $\mathbf{E}[H(X)]<\infty$, and $Y$ be a $(B,\mathcal{B})$-valued random variable. Suppose there exists a $\mathcal{B}$-measurable function $g: B\to \mathbb{R}$ such that \begin{equation*} g(y)=\displaystyle \arg\min_{\hat{x}\in \mathbb{R}} \int H(x-\hat{x})P_{X|Y}(y,dx) \text{ for all } y\in B', \end{equation*} where $B'\in \mathcal{B}$ is such that $\mathbf{P}[Y\in B']=1$. Then, $f=g$ minimizes $\mathbf{E}[H(X-f(Y))]$ over all $\mathcal{B}$-measurable functions $f: B\to \mathbb{R}$.
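For concreteness, here is the scalar case of how I understand the lemma being applied (assuming $I_{0}$ denotes the indicator function of $\{0\}$): taking the 0--1 loss $H(x)=1-I_{0}(x)$ gives
\begin{equation*}
g(y)=\arg\min_{\hat{x}} \int \bigl(1-I_{0}(x-\hat{x})\bigr)\,P_{X|Y}(y,dx) = \arg\max_{\hat{x}} \mathbf{P}[X=\hat{x}\mid Y=y],
\end{equation*}
i.e., the posterior mode (MAP estimate), which is nontrivial whenever $X$ takes values in a discrete set so that the posterior has atoms.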
Thus, if we choose the loss function to be $H(x_{0:n})=1-\prod_{k=0}^{n} I_{0}(x_{k})$, where $I_{0}$ is the indicator of $\{0\}$ (so that minimizing the expected loss amounts to maximizing the probability that every coordinate matches), then to find the maximum probability path estimate, we choose the functions $f_{k}$ such that \begin{equation*} (f_{0}(y_{0: n}),\ldots,f_{n}(y_{0: n}))= \displaystyle \arg\max_{(\hat{x}_{0: n})} \int \prod_{k=0}^{n} I_{0}(x_{k}-\hat{x}_{k}) P_{X_{0: n}|Y_{0: n}}(y_{0: n}, dx_{0: n}). \end{equation*} However, the product in this last expression confuses me: since each factor $I_{0}(x_{k}-\hat{x}_{k})$ vanishes unless $x_{k}=\hat{x}_{k}$ exactly, it seems like the entire integral should simply be $0$. Is there a reasonable interpretation of why there's a product, or is it just an artifact of how we formulated the derivation?
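To make my confusion concrete, here is a toy numerical sanity check (a sketch, assuming a finite state space with a made-up posterior, so the "integral" is a sum over atoms; the names `posterior` and `indicator_integral` are my own):

```python
import numpy as np

# Toy 2-step chain with 2 states. A made-up joint posterior p(x0, x1 | y);
# since the state space is finite, the posterior is purely atomic.
posterior = np.array([[0.1, 0.2],
                      [0.3, 0.4]])  # rows index x0, columns index x1; sums to 1

def indicator_integral(x0_hat, x1_hat, p=posterior):
    """Evaluate sum over (x0, x1) of I0(x0 - x0_hat) * I0(x1 - x1_hat) * p(x0, x1 | y)."""
    total = 0.0
    for x0 in range(p.shape[0]):
        for x1 in range(p.shape[1]):
            # The product of indicators is 1 only on the single atom
            # (x0_hat, x1_hat), so the sum picks out that atom's mass.
            total += (x0 == x0_hat) * (x1 == x1_hat) * p[x0, x1]
    return total

print(indicator_integral(1, 1))  # P[X0=1, X1=1 | y] = 0.4, not 0
```

So at least in the discrete case the product seems to collapse the integral to the joint posterior probability of the candidate path, which is what I would want to maximize; my question is whether that is the intended reading.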