In machine learning I often see expressions of the form
$$P(x_1, x_2, \ldots, x_n) = \prod_{i=1}^{n} P(x_i | x_{i-1}, x_{i-2}, ..., x_2, x_1)$$
(e.g. here, page 2) when modeling sequences (where $P$ is a probability-valued function, i.e. a function that maps "things" to probabilities). I don't understand what these expressions mean (anyone care to explain?), and it makes me wonder whether the following is true:
$$ P(x_n|x_{n-1}) \cdot \ldots \cdot P(x_3 | x_2) \cdot P(x_2 | x_1) = P(x_n | x_{n-1}, \ldots, x_2, x_1) $$
and why (not)? And, if not, under what conditions is it true?
The first expression is just the recursive application of the definition of conditional probability.
Let $x, y, z$ be events.
By definition, the conditional probability $p[x | y]$ (which people read as "$x$ given $y$", but which you can think of as "the probability of $x$ in the $y$-world", or "$x$ under the $y$-lens") is:
$$p[x|y] := {p[x \cap y] \over p[y]}.$$
Now:
$$p[x \cap y] = p[x|y] \cdot p[y]$$ $$p[x \cap y] = p[y|x] \cdot p[x]$$
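Both factorizations can be checked numerically (a sketch using numpy, with a hypothetical discrete joint distribution given as a 3×4 probability table — the variable names are mine):

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical joint distribution over two discrete variables:
# joint[i, j] = p[x = i ∩ y = j].
joint = rng.random((3, 4))
joint /= joint.sum()

p_x = joint.sum(axis=1)                # marginal p[x]
p_y = joint.sum(axis=0)                # marginal p[y]

p_x_given_y = joint / p_y              # p[x|y] = p[x ∩ y] / p[y] (columnwise)
p_y_given_x = joint / p_x[:, None]     # p[y|x] = p[x ∩ y] / p[x] (rowwise)

# Both factorizations recover the joint:
assert np.allclose(p_x_given_y * p_y, joint)
assert np.allclose(p_y_given_x * p_x[:, None], joint)
```

The assertions hold by construction: each conditional is the joint divided by the corresponding marginal, so multiplying the marginal back in returns the joint.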
Therefore:
$$p[x \cap y \cap z] = p[x \cap (y \cap z)] = p[x|y \cap z] \cdot p[y \cap z] = p[x|y \cap z] \cdot p[y|z] \cdot p[z]$$ $$p[x \cap y \cap z] = p[y \cap (z \cap x)] = p[y|z \cap x] \cdot p[z \cap x] = p[y|z \cap x] \cdot p[z|x] \cdot p[x]$$ $$p[x \cap y \cap z] = p[z \cap (x \cap y)] = p[z|x \cap y] \cdot p[x \cap y] = p[z|x \cap y] \cdot p[x|y] \cdot p[y].$$
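As for the equality you ask about: it fails in general. A numerical sanity check (a sketch using numpy with a small, randomly generated joint distribution over three binary variables $x_1, x_2, x_3$ — all names here are mine) shows that the chain rule holds exactly, while the product $p[x_3|x_2] \cdot p[x_2|x_1]$ does not recover $p[x_3|x_1 \cap x_2]$ for a generic joint:

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical joint distribution over three binary variables:
# joint[i, j, k] = p(x1 = i ∩ x2 = j ∩ x3 = k).
joint = rng.random((2, 2, 2))
joint /= joint.sum()

p_x1 = joint.sum(axis=(1, 2))                  # p(x1)
p_x2 = joint.sum(axis=(0, 2))                  # p(x2)
p_x1x2 = joint.sum(axis=2)                     # p(x1 ∩ x2)
p_x2x3 = joint.sum(axis=0)                     # p(x2 ∩ x3)

p_x2_given_x1 = p_x1x2 / p_x1[:, None]         # p(x2|x1)
p_x3_given_x2 = p_x2x3 / p_x2[:, None]         # p(x3|x2)
p_x3_given_x1x2 = joint / p_x1x2[:, :, None]   # p(x3|x1 ∩ x2)

# Chain rule: p(x1 ∩ x2 ∩ x3) = p(x3|x1 ∩ x2) · p(x2|x1) · p(x1) — exact.
chain = p_x3_given_x1x2 * p_x2_given_x1[:, :, None] * p_x1[:, None, None]
assert np.allclose(chain, joint)

# The question's product p(x3|x2) · p(x2|x1), as a function of (x1, x2, x3):
product = p_x3_given_x2[None, :, :] * p_x2_given_x1[:, :, None]
# For a generic joint it does NOT equal p(x3|x1 ∩ x2):
print(np.allclose(product, p_x3_given_x1x2))   # → False for this joint
```

Even in the special case of a Markov chain, where $P(x_i | x_{i-1}, \ldots, x_1) = P(x_i | x_{i-1})$, the product of adjacent conditionals $P(x_n|x_{n-1}) \cdots P(x_2|x_1)$ equals $P(x_n, \ldots, x_2 | x_1)$ (by the chain rule above), not the single conditional $P(x_n | x_{n-1}, \ldots, x_1)$ on the right-hand side of your equation.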