Huffman's Algortihm Probability Distributions

69 Views Asked by At

I am currently studying a bit of coding and information theory. I just learned Huffman's algorithm. I was solving some problems and came up with this one:

Given a code $C=\{0,10,110,111\}$ as a result of Huffman's algorithm construction, what are its probability distributions $P=\{p_1,p_2,p_3,p_4\}$?

My approach is to think of $C$ as a binary tree and analyze it from the bottom up using Huffman's algorithm. $110$ and $111$ have the same ascendant vertex. So, $$p_4\leq p_1,p_2\text{ and }p_3\leq p_1,p_2.$$ Also, $$p_2\leq p_1\text{ and }p_3+p_4\leq p_1.$$ My question(s) is(are): Is $p_3,p_4\leq p_2\leq p_1$ and $p_3+p_4\leq p_1$ enough of an answer? Is this correct? Is there more to explore? In advance, many thanks!

1

There are 1 best solutions below

6
On BEST ANSWER

I believe you're approach and result are both correct.

Edit: In general, if two symbols have Huffman codeword lengths $\ell_i$ and $\ell_j$, then

$$ \ell_i > \ell_j \implies p_i \le p_j$$ $$ \ell_i > \ell_j +1 \implies p_i < p_j$$

while $ \ell_i = \ell_j $ does not imply anything.

This also applies to grouped symbols. That is, if two codewors are equal except for the last bit, so that $\ell_j = \ell_k$, then you can combine them into a new pseudo-symbol $m$ with $\ell_m =\ell_j-1$ and $p_m=p_j+p_k$, and the same relationships above apply.