Expected code length Proof


I need some help with the following proof:

Let $C$ be an optimal binary code for a source with alphabet $A = \{a_1, \dots, a_n\}$. Show that the inequality $E(C) \le \lceil \log_2(n) \rceil$ holds, where $E(C)$ denotes the expected codeword length.

What I've got so far:

If $n=2^m$ with $m \in \mathbb{N}$ and $|c_i| = m$ for all $c_i \in C$, we get equality, since $E(C) = m = \lceil \log_2(2^m) \rceil = \log_2(2^m) = m \log_2(2) = m$.

What is not clear to me is why all codewords of an optimal code with $2^m$ codewords should have the same length.
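The equality case above can be checked numerically: if all $n = 2^m$ codewords have length $m$, the expected length is $m$ no matter what the source probabilities are, since the probabilities sum to $1$. A minimal sketch (the distribution below is a made-up example):

```python
import math

m = 3
n = 2 ** m                       # n = 8 symbols
# Hypothetical source distribution over the 8 symbols (sums to 1).
probs = [0.4, 0.2, 0.1, 0.1, 0.1, 0.05, 0.03, 0.02]
lengths = [m] * n                # every codeword has length m

# E(C) = sum_i p_i * |c_i| = m * sum_i p_i = m
expected = sum(p * l for p, l in zip(probs, lengths))
print(expected, math.ceil(math.log2(n)))  # both equal m = 3
```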


BEST ANSWER

Let $ m = \lceil \log_2(n)\rceil$.

Let $C_0$ be the "trivial" binary code, in which $c_i$ is the $m$-bit binary representation of the number $i-1$ (we subtract $1$ so that we start at zero).

Because $m \ge \log_2(n) \implies n \le 2^m$, this works: there are enough binary $m$-tuples to encode all $n$ symbols with no repetitions. And, being fixed-length, the code is instantaneous, with expected length $E(C_0) = m$.

Because $C$ is optimal, its average length is at most that of $C_0$, hence $E(C) \le E(C_0) = \lceil \log_2(n) \rceil$.
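The construction of $C_0$ can be sketched in a few lines of Python (the function name is mine, not from the answer). Since all codewords have the same length $m$, no codeword can be a proper prefix of another, which is why the code is instantaneous:

```python
import math

def trivial_code(n):
    """Sketch of the answer's C_0: the codeword for symbol i
    (1-based) is the m-bit binary representation of i - 1,
    where m = ceil(log2(n))."""
    m = math.ceil(math.log2(n))
    return [format(i, '0{}b'.format(m)) for i in range(n)]

code = trivial_code(5)           # n = 5, so m = ceil(log2 5) = 3
print(code)                      # ['000', '001', '010', '011', '100']

# Fixed-length codewords are distinct and prefix-free.
assert len(set(code)) == len(code)
assert all(not a.startswith(b) for a in code for b in code if a != b)
```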

ANOTHER ANSWER

See first three slides here: https://cs.nyu.edu/~roweis/csc310-2005/notes/lec4x.pdf

For any uniquely decodable code, the average code length is at least the entropy $H(X)$ of the source, and Shannon's source coding theorem guarantees a prefix code with average length less than $H(X) + 1$. Thus the expected length of an optimal code is within one bit of the entropy of the source.

Finally, using the fact that if $X$ is a source supported on the alphabet $\{x_1,\dots,x_n\}$, then the entropy of $X$ is upper bounded by $\log_2 n$, with equality for the uniform distribution, we conclude.
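The entropy bound in the last step is easy to verify numerically. A short sketch (the skewed distribution is an arbitrary example of mine) showing $H(X) \le \log_2 n$, with equality exactly for the uniform distribution:

```python
import math

def entropy(probs):
    """Shannon entropy H(X) in bits (terms with p = 0 contribute 0)."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

n = 6
uniform = [1 / n] * n
skewed = [0.5, 0.2, 0.1, 0.1, 0.05, 0.05]  # hypothetical non-uniform source

print(entropy(uniform), math.log2(n))   # equal: log2(6) ≈ 2.585
print(entropy(skewed))                  # strictly smaller than log2(6)
```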