The following example, although somewhat unrealistic, is a case in which exact matching to a noisy channel is possible. There are two channel symbols, 0 and 1, and the noise affects them in blocks of seven symbols. A block of seven is either transmitted without error, or exactly one symbol of the seven is incorrect. These eight possibilities are equally likely. Thus we have
\begin{align} C(\text{capacity})&=\max\bigl(H(y)-H_x(y)\bigr)\\ &=\frac{1}{7}\left(7+\frac{8}{8}\log_2\frac{1}{8}\right)=\frac{4}{7}\ \text{bits/symbol} \end{align} I don't understand this solution. What I can see is that, for a given transmitted block, the possible received blocks are:
- 0000000 (No errors)
- 1000000 (1st symbol error)
- 0100000 (2nd symbol error)
- 0010000 (3rd symbol error)
- 0001000 (4th symbol error)
- 0000100 (5th symbol error)
- 0000010 (6th symbol error)
- 0000001 (7th symbol error)
Therefore,
\begin{align*} H(Y|X) &= -\sum_{i=1}^{2} \sum_{j=1}^{8} P(x_i, y_j) \log_2 P(y_j|x_i) \\ &= -\left(\frac{1}{2} \sum_{j=1}^{8} \frac{1}{8} \log_2\frac{1}{8} + \frac{1}{2} \sum_{j=1}^{8} \frac{1}{8} \log_2\frac{1}{8}\right) \\ &= -8 \cdot \frac{1}{8} \log_2\frac{1}{8} \\ &= -\log_2 \frac{1}{8} \\ &= 3 \text{ bits} \end{align*} Can someone explain the rest of the solution? Thanks in advance!
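As a quick sanity check of the computation above (a sketch, not part of the original problem): for any fixed input block, the output is one of 8 equally likely patterns, so the conditional entropy is just the entropy of a uniform distribution on 8 outcomes.

```python
from math import log2

# Given any fixed input block x, the received block is one of 8 equally
# likely possibilities (no error, or one of the 7 single-symbol errors),
# so H(Y|X) = H(1/8, ..., 1/8) = 3 bits, whatever the input distribution.
p = [1 / 8] * 8
h_y_given_x = -sum(q * log2(q) for q in p)
print(h_y_given_x)  # 3.0
```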
Let $X^n$ and $Y^n$ denote the input and output blocks of length $n=7$. Consider them as macro-symbols, which correspond to the inputs and outputs of our channel. In this setting
$$I(X^n;Y^n)=H(Y^n)-H(Y^n|X^n)$$
with $H(Y^n|X^n)=H(X^n+Z^n|X^n)=H(Z^n|X^n)=H(Z^n)=3$ bits, where $Z^n$ is the error pattern, independent of the input and uniform over its $8$ possible values.
Hence $H(Y^n|X^n)$ does not depend on the input distribution, and we are left with maximizing $H(Y^n)$. We have the bound $H(Y^n) \le 7$ bits, attained exactly when $Y^n$ is uniform. The question is whether there is an input distribution that produces that output distribution. And there is: a uniform input. Because $Y^n=X^n+Z^n$ (addition mod $2$) implies $X^n=Y^n+Z^n$, each output block arises from exactly eight equally likely (input, error-pattern) pairs, so a uniform input yields a uniform output.
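This can be checked by brute force (a small sketch, not part of the original answer): enumerate all $128$ input blocks and all $8$ error patterns, and count how often each output block occurs.

```python
from collections import Counter

# Error patterns: the all-zero block plus the 7 single-symbol flips,
# each 7-bit block represented as an integer.
patterns = [0] + [1 << i for i in range(7)]

# With X uniform over all 128 seven-bit blocks and Z uniform over the
# 8 patterns, count how often each output Y = X XOR Z occurs.
counts = Counter(x ^ z for x in range(128) for z in patterns)

# Every one of the 128 possible outputs occurs exactly 8 times, so Y is
# uniform and H(Y^n) attains its maximum of 7 bits.
print(len(counts), set(counts.values()))  # 128 {8}
```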
Hence $C = \max I(X^n;Y^n) = 7 - 3 = 4$ bits.
In this (very) particular case we can verify that this is correct by exhibiting an ideal code: use the $16$ codewords of a Hamming $(7,4)$ code, which corrects any single error. In this way, you can transmit 4 bits of information in each block without error.
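A minimal sketch of such a Hamming $(7,4)$ code (one common construction, with parity bits at positions $1,2,4$; the function names are mine, not from any particular library): the syndrome computed at the decoder is exactly the position of the flipped symbol, so every single error is corrected.

```python
import itertools

# Hamming (7,4): bit positions 1..7, parity bits at the power-of-two
# positions 1, 2, 4, data bits at positions 3, 5, 6, 7.
def encode(data):  # data: list of 4 bits
    c = [0] * 8          # index 0 unused; positions 1..7
    c[3], c[5], c[6], c[7] = data
    for p in (1, 2, 4):  # parity bit p covers positions whose index has bit p set
        c[p] = sum(c[i] for i in range(1, 8) if i & p and i != p) % 2
    return c[1:]

def decode(word):  # word: list of 7 bits, at most one of them flipped
    c = [0] + list(word)
    # Failing parity checks sum to the 1-based position of the error.
    syndrome = sum(p for p in (1, 2, 4)
                   if sum(c[i] for i in range(1, 8) if i & p) % 2)
    if syndrome:
        c[syndrome] ^= 1
    return [c[3], c[5], c[6], c[7]]

# Every 4-bit message survives every single-symbol error (and no error).
ok = all(decode([b ^ (i == e) for i, b in enumerate(encode(list(d)))]) == list(d)
         for d in itertools.product((0, 1), repeat=4)
         for e in list(range(7)) + [-1])   # e = -1 means no error
print(ok)  # True
```

So all $16 = 2^4$ messages get through every one of the $8$ channel outcomes intact, matching the $4$ bits per block found above.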
Now, the above is the capacity of the "block" channel: $4$ bits per $n$-block. To get the capacity of the original channel we need to divide by $7$. Hence we get $C = 4/7$ bits per channel use (binary transmission).