Understanding Asymptotic Equipartition Property

717 Views Asked by At

I have some problems in understanding the precise meaning of the Asymptotic Equipartition Property, related to a large number n of independent and identically distributed random variables (X1, X2, ..., Xn). Here an example is shown (here you may find the complete material):

enter image description here

enter image description here

enter image description here

I do not understand the precise difference between the sentences:

  • "Clearly, it is not true that all 2n sequences of length n have the same probability."

  • "One might be able, however, to predict the probability of the sequence that is actually observed. The question is: what is the probability p(X1,X2,...,Xn) of the outcomes X1,X2,..., Xn, where X1,X2,..., Xn are i.i.d. according to p(x)"

  • "the typical set has probability close to 1, all the elements of the typical set have same probability"

Then I read this sentence, which increases the confusion in my mind. enter image description here

1

There are 1 best solutions below

0
On BEST ANSWER

Clearly, it is not true that all $2^n$ sequences of length n have the same probability.

This can be seen by considering binary iid random variables $X_1, X_2, ..., X_n$ which have a probability of $1$ of $p$ and $0$ being $(1 - p)$. Now sample each one of these random variables to get an $n$ bit binary sequence. There are $2^n$ different binary sequences and they will equiprobable only if $p = (1 - p)$. If, for instance, $p = 1$, then the probability of getting the all one sequence is $1$ and the probability of getting the all zero sequence is $0$.

The question is: what is the probability $p(X_1,X_2,...,X_n)$ of the outcomes $X_1,X_2,..., X_n$, where $X_1,X_2,..., X_n$ are i.i.d. according to $p(x)?$

The Asymptotic Equipartition Property (AEP) proves that i) every sequence in the strong typical set is approximately equally likely and ii) the probability that the observed sequence will be a member of the typical set is tends to $1$ as the number of samples, $n$, tends to infinity. Let $\mathsf{x}^n = x_1x_2...x_n$ be the observed sequence, $a = \min \{p, (1-p)\}$ and $b = \max \{p, (1-p)\}$, AEP implies

$$p(\mathsf{x}^n) = \begin{cases} k \in [a^n , b^n] & \mathsf{x}^n \not\in \text{Typical Set} \\ \approx 2^{-nH} & \mathsf{x}^n \in \text{Typical Set} \end{cases}$$

and $\mathbb{P}(\mathsf{x}^n \in \text{Typical Set}) \approx 1$.

The typical set has probability close to $1$, all the elements of the typical set have same probability

If you want to prove the above fact you have to show that the number of elements in the typical set is approximately $2^{nH}$ and the probability of each of the sequences is $2^{-nH}$. It will then follow that the typical set has probability close to $1$.