I see three facts about Huffman codes.
Two of them are true:
Fact A: The average length of an optimal code is always at most $\lceil \log n \rceil$, where $n$ is the number of codewords.
Fact B: The average codeword length in Huffman's algorithm is $\Omega(\log n)$.
and one fact that is said to be false:
Fact C: The average codeword length in Huffman's algorithm is $O(\log n)$.
I really can't see the difference among these three facts. Is there an example or a simple intuitive explanation? (I mean: what is the role of the word "optimal" in the first fact, and what, in a short sentence, distinguishes these three facts?)
The first fact speaks about the average length of an optimal code, while the second and third speak about the average codeword length of one specific code (Huffman's).
Given a probability distribution on $n$ symbols and some code $c$, the average codeword length is just the average of the lengths of the words in the code, $\frac{1}{n} \sum_i |c(i)|$ (an expectation with respect to the uniform distribution), while the average length is the expectation with respect to the given distribution, $\sum_i |c(i)| \cdot p_i$.
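To make the two quantities concrete, here is a small sketch that computes both for a made-up code on four symbols (the lengths and probabilities below are illustrative values, not taken from any particular source):

```python
# Hypothetical codeword lengths |c(i)| and probabilities p_i for 4 symbols.
lengths = [1, 2, 3, 3]
probs = [0.5, 0.25, 0.125, 0.125]

# Average codeword length: expectation wrt the uniform distribution.
avg_codeword_length = sum(lengths) / len(lengths)

# Average length: expectation wrt the given probability distribution.
avg_length = sum(l * p for l, p in zip(lengths, probs))

print(avg_codeword_length)  # 2.25
print(avg_length)           # 1.75
```

Note how the rare symbols' long codewords pull the first average up much more than the second.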
The first fact, informally, says that if we need to transmit many symbols sampled from some distribution, a good code needs at most $\lceil \log n\rceil$ bits per symbol on average.
It's also known that Huffman's code is optimal. However, the average codeword length can be significantly greater than the average length: the codeword lengths of rare symbols barely affect the average length, but they affect the average codeword length just as much as the codeword lengths of frequent symbols do.
Fact B says that Huffman's code necessarily has "many" "long" codewords (so the average codeword length is quite large).
Fact C claims that all words in Huffman's code are "short", but this is not necessarily true. For example, if we have symbols with probabilities $\frac{1}{2}$, $\frac{1}{4}$, $\frac{1}{8}$, ..., $\frac{1}{2^n}$, $\frac{1}{2^n}$ - for a total of $n + 1$ symbols - Huffman's code will assign them codewords with lengths $1$, $2$, $3$, ..., $n$, $n$ respectively - so the average codeword length will be $\frac{\frac{n(n + 1)}{2} + n}{n + 1} \approx \frac{n}{2} \gg \log n$.
The average length of the code, however, is $\sum\limits_{i=1}^n \frac{i}{2^i} + \frac{n}{2^n} = 2 - 2^{1 - n} = O(1)$, which is certainly $O(\log n)$.
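The example above can be checked by actually running Huffman's algorithm on that distribution. Below is a minimal sketch of the algorithm (repeatedly merging the two least probable subtrees with a heap; `huffman_lengths` is a name I made up for this illustration), showing that the two averages diverge exactly as claimed:

```python
import heapq

def huffman_lengths(probs):
    """Return the codeword lengths produced by Huffman's algorithm."""
    # Heap entries: (probability, tiebreak counter, symbol indices in subtree).
    heap = [(p, i, [i]) for i, p in enumerate(probs)]
    heapq.heapify(heap)
    lengths = [0] * len(probs)
    counter = len(probs)
    while len(heap) > 1:
        p1, _, s1 = heapq.heappop(heap)
        p2, _, s2 = heapq.heappop(heap)
        for sym in s1 + s2:
            lengths[sym] += 1  # every symbol in a merged subtree goes one level deeper
        heapq.heappush(heap, (p1 + p2, counter, s1 + s2))
        counter += 1
    return lengths

n = 20
# Probabilities 1/2, 1/4, ..., 1/2^n, 1/2^n  (n + 1 symbols in total).
probs = [2.0 ** -i for i in range(1, n + 1)] + [2.0 ** -n]

lengths = huffman_lengths(probs)
avg_codeword_length = sum(lengths) / len(lengths)        # approximately n / 2
avg_length = sum(l * p for l, p in zip(lengths, probs))  # 2 - 2^(1 - n), below 2
```

For $n = 20$ the average codeword length is already about $11$ bits, while the average length stays below $2$ bits.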