Confused about notation: difference between $\prod_{i=1}^np(x_i)$ and $\prod_{i=1}^np(x)$

In my information theory book by Cover and Thomas, at the beginning of the channel coding theorem, it's written:

"Each entry in this matrix" (the matrix of the randomly generated code)

"is generated i.i.d according to p(x). Thus, the probability that we generate a particular code $C$ is "

$Pr(C)=\prod_{w=1}^{2^{nR}} \prod_{i=1}^np(x_i(w))$

Now, I get what they're saying, but the notation is very confusing to me.

1. First of all, since we are generating the entries i.i.d., why can't we simplify $\prod_{i=1}^np(x_i)$ to $\prod_{i=1}^np(x)=p(x)^n$?

2. Also, instead of writing $p(x_i(w))$ couldn't we just write $p(x_{i,w})$?

It seems as if there's something about the notation that's crucial that I'm not getting..

Any help would really be appreciated!!

Thanks in advance

There are 1 best solutions below

On BEST ANSWER

You want to generate $m=2^{nR}$ codewords, indexed by $w=1, \dots, 2^{nR}$. Each codeword (each row of the matrix in eq. 7.61) is a row of $n$ bits: ${\bf x}^{(w)} = (x_1^{(w)},x_2^{(w)}, \cdots, x_n^{(w)})$.

Each element of this row is indexed by $i=1, \dots, n$. Because of the construction, these rows are probabilistically equivalent, and the probability of any particular row realization is (omitting the $(w)$ superscript for simplicity) $p({\bf x})=\prod_{i=1}^n p(x_i)$, because the elements are also independent across columns.
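To connect this with the formula in the question: the rows are themselves independent, so the row probabilities multiply. As a sketch, written in the superscript notation of this answer rather than the book's:

$$\Pr(C)=\prod_{w=1}^{2^{nR}} p\big({\bf x}^{(w)}\big)=\prod_{w=1}^{2^{nR}}\prod_{i=1}^{n} p\big(x_i^{(w)}\big)$$

The index $i$ has to stay inside $p(\cdot)$ because each factor is evaluated at the actual entry found in row $w$, column $i$.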

I hope this answers why you can't simplify $\prod_{i=1}^n p(x_i)$ to $p(x)^n$.

Cover & Thomas use $p(x^n)$ for what I wrote as $p({\bf x})$: the joint probability of the row, not to be confused with $p(x)$. In your tentative formula, $p(x)^n$ is wrong because it has no meaning (what is $x$?): the row contains $n$ possibly different values $x_1, \dots, x_n$, not a single value $x$.

(This notation suffers from an ambiguity common to many probability textbooks: when one sees a formula such as "$ p(X) = \text{blah blah} \, p(Y) \text{blah blah}$", one is supposed to guess whether both $p(\cdot)$ represent the same function evaluated at different values of its argument, or two different probability functions, which should rather be written as $p_X(\cdot)$ and $p_Y(\cdot)$. This ambiguity is unfortunate, but typical. In the above example, you must understand that in $p(x^n)=\prod_{i=1}^n p(x_i)$ the $p(\cdot)$ on the left side is not the same function as the $p(\cdot)$ on the right side. The $n$ factors that form the product, however, are the same function evaluated at $n$ different values of the variable.)
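To make the distinction explicit, here is the same identity with subscripts naming each function. The symbols $p_X$ and $p_{X^n}$ are labels introduced here for illustration, not notation taken from the book:

$$p_{X^n}(x_1,\dots,x_n)=\prod_{i=1}^{n} p_X(x_i)$$

Read this way, $p_X(x)^n$ does have a meaning, but a narrow one: it is the probability of the constant row $(x,x,\dots,x)$ only.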

Suppose, for example, that $p(0)=0.1$ and $p(1)=0.9$ and ${\bf x}=(0, 1, 0, 0)$. Then $p({\bf x})=\prod_{i=1}^n p(x_i)= p(0) \, p(1) \, p(0) \, p(0) = 0.1^3 \times 0.9 = 0.0009$.
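The same computation can be checked numerically. Below is a minimal Python sketch, assuming the binary distribution from the example; the function names `row_probability` and `codebook_probability` and the two-row toy codebook are made up for illustration.

```python
from math import prod

# Assumed single-symbol distribution, taken from the example: p(0)=0.1, p(1)=0.9.
p = {0: 0.1, 1: 0.9}

def row_probability(row, p):
    """Probability of one row: the product of p(x_i) over its entries."""
    return prod(p[x_i] for x_i in row)

def codebook_probability(codebook, p):
    """Probability of a whole codebook: product over rows w and entries i."""
    return prod(row_probability(row, p) for row in codebook)

x = (0, 1, 0, 0)
print(row_probability(x, p))   # approximately 0.1**3 * 0.9 = 0.0009

# p(x)^n only matches when every entry of the row equals the same x.
print(p[0] ** len(x))          # approximately 0.0001, the probability of (0, 0, 0, 0)

# Hypothetical toy codebook with two rows, for illustration only.
toy_codebook = [(0, 1, 0, 0), (1, 1, 1, 1)]
print(codebook_probability(toy_codebook, p))
```

Note that `p` is looked up once per entry, so the result depends on which values actually appear in the row, which is exactly why the index $i$ cannot be dropped.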

Also, instead of writing $p(x_i(w))$ couldn't we just write $p(x_{i,w})$?

Yes. Or $p(x_i^{(w)})$, as I've written above.