Word lengths of optimal binary code

1.5k Views Asked by At

Given an optimal binary code (ie the expected word length if as small as possible while the code is still decipherable) with word lengths $s_1, \ldots,s_m$, I'd like to show the following inequalities:

$$m\log m\leq s_1+\ldots+s_m\leq(m^2+m-2)/2$$

The first inequality basically says the average word length (not: expected word length) should be at least $\log m$ which seems very reasonable to me, no matter if the code is optimal or not. But how can I formally proof it?

For the second inequality I need the optimality property I suppose but I don't know how to approach it. Can someone give me a hint please?

1

There are 1 best solutions below

4
On

The lower bound is correct. Looking for the average word length is equivalent to assuming that all code words occur with equal probability. Then, a lower bound on the expected word length (which equals the average word length now) is the entropy of the code. Since all code words are equally probably, you get

$$ \log m \le \frac{1}{m}\sum_{i=1}^m s_i. $$

Similarly, every word can be enumerated by a binary string of length $\lceil \log m\rceil$ (the ceiling operation is used to get integer length). This gives the upper bound

$$ \frac{1}{m}\sum_{i=1}^m s_i \le \lceil \log m\rceil.$$