Doubt about coding and channels in "Elements of Information Theory" by Cover and Thomas

94 Views Asked by At

I'll refer to the book by Cover and Thomas "Elements of Information Theory". In particular the definition of discrete channel, at page 183. I think I am making some confusion about what's going on here.

So, basically one wants to be able to transmit a message $W$ from a sender $A$ t a receiver $B$. A channel is a triple $(\mathcal{X}, p(y|X), \mathcal{Y})$ where $\mathcal{X}, \mathcal{Y}$ are a collections of sympols and $p(y|x)$ is a probability matrix such giving the conditional probability of receiving the signal $y$ if the signal $x$ was sent. So far so good. After this he defines the code for a canal $(\mathcal{X}^n, p(y^n|x^n), \mathcal{Y}^n)$ ($n-$tuples of symbols) as:

  • A set of indices $A=\{1, \dots, M\}$ , i.e. the message you want to transmit;
  • An encoding function $X^n : A \to \mathcal{X}^n$
  • A decoding function $g :\mathcal{Y}^n \to A$

Now, he talks about distributions of $X^n, Y^n$ etc. And mathematically speaking, I get it. But I think I am missing the basic idea here. My intuition is that you have a message $W$, you encode it in some DETERMINISTIC way, you transfer it (and here there might be issues/errors/randomness) and you decode it again deterministically. What is the meaning of $X^n (1)$ for instance? You should have that $X^n (1)= x^n (1)$, i.e you are assigning a particular string of n letters from your alphabet $\mathcal{X}$ to the message $1$. But then what does it mean the probability ditribution of $X^n$? Where's the randomness?

Also, in Chapter 5 he talks about compression of data with "source codes", which is basically a mapping from $\mathcal{X}$ to $\mathcal{D}^*$, some strings of another alphabet of symbols (typically $\{0,1\}$). Now, how does in practice this come into play here? Because in the chapter about channels it is not mentioned. If I understood correctly, what happens "in reality" is:

  • You want to send a message $ W \in A$

  • You encode the message with $X^n : A \to \mathcal{X}^n$

  • You compress the encoded message with a deterministic function $C^* : \mathcal{X}^n \to D^*$

  • You transmit the compressed data through the channel

  • Decompress and decode the received signal

and maybe in the book the compressing/decompressing part was skipped to make things clearer. Am I right?