I'll refer to the book by Cover and Thomas "Elements of Information Theory". In particular the definition of discrete channel, at page 183. I think I am making some confusion about what's going on here.
So, basically one wants to be able to transmit a message $W$ from a sender $A$ t a receiver $B$. A channel is a triple $(\mathcal{X}, p(y|X), \mathcal{Y})$ where $\mathcal{X}, \mathcal{Y}$ are a collections of sympols and $p(y|x)$ is a probability matrix such giving the conditional probability of receiving the signal $y$ if the signal $x$ was sent. So far so good. After this he defines the code for a canal $(\mathcal{X}^n, p(y^n|x^n), \mathcal{Y}^n)$ ($n-$tuples of symbols) as:
- A set of indices $A=\{1, \dots, M\}$ , i.e. the message you want to transmit;
- An encoding function $X^n : A \to \mathcal{X}^n$
- A decoding function $g :\mathcal{Y}^n \to A$
Now, he talks about distributions of $X^n, Y^n$ etc. And mathematically speaking, I get it. But I think I am missing the basic idea here. My intuition is that you have a message $W$, you encode it in some DETERMINISTIC way, you transfer it (and here there might be issues/errors/randomness) and you decode it again deterministically. What is the meaning of $X^n (1)$ for instance? You should have that $X^n (1)= x^n (1)$, i.e you are assigning a particular string of n letters from your alphabet $\mathcal{X}$ to the message $1$. But then what does it mean the probability ditribution of $X^n$? Where's the randomness?
Also, in Chapter 5 he talks about compression of data with "source codes", which is basically a mapping from $\mathcal{X}$ to $\mathcal{D}^*$, some strings of another alphabet of symbols (typically $\{0,1\}$). Now, how does in practice this come into play here? Because in the chapter about channels it is not mentioned. If I understood correctly, what happens "in reality" is:
You want to send a message $ W \in A$
You encode the message with $X^n : A \to \mathcal{X}^n$
You compress the encoded message with a deterministic function $C^* : \mathcal{X}^n \to D^*$
You transmit the compressed data through the channel
Decompress and decode the received signal
and maybe in the book the compressing/decompressing part was skipped to make things clearer. Am I right?