Generalization of Shannon's source coding theorem with a posteriori entropies


This question is with reference to Section 5-5 of "Information Theory and Coding" by Prof. Norman Abramson. Under the topic "A Generalization of Shannon's First Theorem", the text discusses how knowledge of the received symbol influences the average codeword length. In particular, it discusses "coding of a source which selects a new set of source statistics after each output symbol."

The a priori source entropy is $H(A)$. Depending on the symbol observed at the output, the statistics of the source change. With this a posteriori knowledge, we have the a posteriori entropy $H(A|b_{j})$, where $b_{j}$ is the jth possible output symbol.
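For concreteness, the a posteriori entropies $H(A|b_j)$ and their average $H(A|B)$ can be computed from a joint distribution $P(a_i, b_j)$. The numbers below are made up purely for illustration; they are not from the text.

```python
import math

# Hypothetical joint distribution P(a_i, b_j): a 2-symbol source observed
# through a noisy binary channel (illustrative numbers only).
P_joint = [
    [0.45, 0.05],   # a_0
    [0.10, 0.40],   # a_1
]

# Marginal P(b_j), then a posteriori distributions P(a_i | b_j)
P_b = [sum(row[j] for row in P_joint) for j in range(2)]
P_a_given_b = [[P_joint[i][j] / P_b[j] for i in range(2)] for j in range(2)]

def entropy(probs):
    """Shannon entropy in bits, skipping zero-probability terms."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

# A posteriori entropy H(A | b_j) for each possible output symbol
H_A_given_bj = [entropy(P_a_given_b[j]) for j in range(2)]

# Averaging over outputs gives the equivocation H(A|B) = sum_j P(b_j) H(A|b_j)
H_A_given_B = sum(P_b[j] * H_A_given_bj[j] for j in range(2))
```

Here the a priori entropy is $H(A) = 1$ bit (the source marginal is uniform), while $H(A|B) \approx 0.60$ bits: observing the output reduces the uncertainty about the source symbol, which is what makes a shorter code conceivable.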

The text says: "Since a compact code for one set of source statistics will not, in general, be a compact code for another set of source statistics, we take advantage of our knowledge of $b_{j}$ for each transmitted symbol to construct s binary codes, one for each of the possible received symbols $b_{j}$. When the output of our channel is $b_{j}$, we use the jth binary code to encode the transmitted symbol $a_{i}$."

This is where my doubt lies. The above statement seems to indicate a non-causal system. To my knowledge, coding at the source occurs before reception of the symbol at the output. That is, $a_{i}$ should be transmitted before the corresponding output $b_{j}$ is received. If so, then how can we use the jth binary code to encode the transmitted symbol $a_{i}$ if $a_{i}$ is transmitted before $b_{j}$ is received? Even with some sort of a feedback mechanism, we cannot know ahead of time which output symbol will be received before $a_{i}$ is sent, right? Then how can we code $a_{i}$ accordingly?

The text goes on to show that $\sum_{B} H(A|b_{j})P(b_{j}) \leq \sum_{A,B}P(a_{i},b_{j})\,l_{ij} = \overline{L}$, where $l_{ij}$ is the length of the codeword for $a_{i}$ in the jth code and $\overline{L}$ is the average number of binits per symbol of the A alphabet. I do not understand the significance of this bound, since I cannot imagine a case where the source could be coded in the manner described.
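Setting the causality question aside, the bound itself can be checked numerically: build one Huffman code per possible output symbol $b_j$ from the a posteriori distribution $P(a_i|b_j)$, and compare $\overline{L}$ against $H(A|B)$. The joint distribution below is invented for illustration; nothing here is from the text.

```python
import heapq
import itertools
import math

def entropy(probs):
    """Shannon entropy in bits, skipping zero-probability terms."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

def huffman_lengths(probs):
    """Codeword lengths of a binary Huffman code for distribution `probs`."""
    counter = itertools.count(len(probs))      # tiebreaker so tuples compare
    heap = [(p, i, [i]) for i, p in enumerate(probs)]
    heapq.heapify(heap)
    lengths = [0] * len(probs)
    while len(heap) > 1:
        p1, _, s1 = heapq.heappop(heap)
        p2, _, s2 = heapq.heappop(heap)
        for i in s1 + s2:
            lengths[i] += 1                    # every merge adds one bit
        heapq.heappush(heap, (p1 + p2, next(counter), s1 + s2))
    return lengths

# Hypothetical joint distribution P(a_i, b_j): 3 source symbols, 2 outputs
# (illustrative numbers only, not from the text).
P_joint = [
    [0.30, 0.05],
    [0.15, 0.15],
    [0.05, 0.30],
]
n_a, n_b = 3, 2

P_b = [sum(P_joint[i][j] for i in range(n_a)) for j in range(n_b)]

# One Huffman code per possible received symbol b_j, built from P(a_i | b_j)
lengths_per_b = []
H_A_given_B = 0.0
for j in range(n_b):
    post = [P_joint[i][j] / P_b[j] for i in range(n_a)]   # P(a_i | b_j)
    lengths_per_b.append(huffman_lengths(post))
    H_A_given_B += P_b[j] * entropy(post)

# Average length over the joint distribution: L_bar = sum_ij P(a_i,b_j) l_ij
L_bar = sum(P_joint[i][j] * lengths_per_b[j][i]
            for i in range(n_a) for j in range(n_b))
```

With these numbers, $\overline{L} = 1.4$ binits per source symbol while $H(A|B) \approx 1.30$ bits, so the bound $\sum_B H(A|b_j)P(b_j) \leq \overline{L}$ holds, and $\overline{L}$ beats any single fixed code matched only to the marginal of $A$. The calculation only presumes that the encoder somehow knows $b_j$ when choosing the code; whether that is physically realizable is exactly the question above.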