I’m trying to come up with a simple general formula for a product of sliding conditional probabilities in a sequence of terms.
Take a sequence s consisting of 4 terms:
\begin{equation} s = x_1, x_2, x_3, x_4 \end{equation}
To calculate the product of sliding conditional probabilities of sequence $s$, you condition each term in the sequence on the preceding terms and then take the product of all these conditional probabilities. For sequence s the product is:
\begin{equation} P(x_1) \times P(x_2|x_1) \times P(x_3|x_1, x_2) \times P(x_4|x_1, x_2, x_3) \end{equation}
The point where I'm struggling is coming up with a general formula for a sequence $S$ of $n$ terms. The equation I came up with (shown below) has an issue.
\begin{equation} \prod_{i=1}^{n} P(x_i|x_{i-1}, x_{i-2}, ..., x_{k}) \end{equation}
where:
- $n$ is the number of terms in $S$ and
- $k = i-1$ and denotes the size of the conditioning environment
The first issue is that k seems to spring out of nowhere. Ideally, I would like to come up with a function $f(n_S,k_S)$ that yields the product of conditional probabilities for sequence S of $n$ terms. How how do I define such a function?
This brings to the second, more problematic, issue, namely that if $k = i - 1$ then $P(x_i|x_{i-1}, x_{i-2}, ..., x_{k})$ because it makes the conditioning context fold back upon itself. Where can I specify $k$ in a way that specifies the number of terms in the conditioning context. Would using recursion help?
It sounds like you're trying to formulate a generalized chain rule: $$ \Pr \left(\bigcap_{i=1}^n x_i \right) = \prod_{i=1}^n \Pr \left( x_i ~~\middle|~~ \bigcap_{i=1}^{n-1} x_i \right) $$
which is equivalent to:
$$ \Pr(x_1, x_2, \ldots, x_n) = \prod_{i=1}^n \Pr(x_i \mid x_1, x_2, \dots, x_{i-1}) $$
which is equivalent to:
$$ \Pr(x_1) \times \Pr(x_2 \mid x_1) \times \Pr(x_3 \mid x_1, x_2) \times \cdots \times \Pr(x_n \mid x_1, x_2, \ldots, x_{n-1}) $$