Interpreting Password Entropy Calculation – Property of Character Entropy


I was reading this explanation of how to calculate the entropy of a password. The article is great: it explains the idea so succinctly that even I understood it.

According to the site, a password that uses only lowercase letters draws from a pool of 26 possible characters, the English alphabet. Paraphrasing further:

Entropy is calculated by using the formula $\log_2 x$, where $x$ is the pool of characters used in the password. So a password using lowercase characters would be represented as $\log_2 26 \approx 4.7$ bits of entropy per character.
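As a quick sanity check (not from the article), the formula can be tried in a few lines of Python; the pool size of 26 and the example password length of 8 are just illustrative choices:

```python
import math

def entropy_bits(pool_size: int, length: int) -> float:
    """Total entropy of a uniformly random password: length * log2(pool_size)."""
    return length * math.log2(pool_size)

per_char = math.log2(26)              # lowercase-only pool
print(round(per_char, 2))             # -> 4.7 bits per character
print(round(entropy_bits(26, 8), 1))  # -> 37.6 bits for an 8-character password
```

The total is just the per-character figure times the length, because each character is assumed to be chosen independently and uniformly.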

If I remember correctly, this logarithmic expression can be rewritten algebraically as $2^x = 26$, with $x$ being the '4.7 bits of entropy per character'. Why? What property makes the exponent of the base equal the entropy of a character?

There are 2 answers below.

BEST ANSWER

From the comments above:

In binary, you have only two symbols, $0$ and $1$; thus encoding a character that can take 26 values requires $\lceil\log_2 26\rceil = 5$ bits, since the number of distinct elements that can be encoded using $b$ bits is $2^b$, and you need $2^b \geq 26$ to cover all 26 possible characters. This explains why the $\log_2$ shows up here: it measures how many bits are needed to cover the space of all possible elements (if the encoding alphabet had $a$ symbols instead of binary's $2$, it would be $\log_a 26$).
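The counting argument above can be checked directly; this small sketch (my own, not the answerer's) verifies that $5$ is the smallest number of bits covering 26 values:

```python
import math

# Smallest b with 2**b >= 26: this is ceil(log2(26)).
b = math.ceil(math.log2(26))
print(b)                             # -> 5
print(2 ** (b - 1) < 26 <= 2 ** b)   # -> True, since 16 < 26 <= 32
```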


The "entropy" is a measure of the amount of information in a message (the name comes from many of the formulas bearing an uncanny resemblance to those for entropy in statistical mechanics). Customarily, "information content" is measured in bits (2 options), hence the powers of 2 and base-2 logarithms that show up all over the place. If Martians measured information in trits (3 options), they would have formulas with 3 all over.
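To make the bits-versus-trits point concrete, here is a small sketch (my own illustration, assuming the same "smallest count of symbols covering the alphabet" reasoning as above) comparing the two units for a 26-character alphabet:

```python
import math

def symbols_needed(alphabet_size: int, base: int) -> int:
    """Smallest number of base-`base` symbols that can encode `alphabet_size` values."""
    return math.ceil(math.log(alphabet_size, base))

print(symbols_needed(26, 2))  # -> 5 bits,  since 2**5 = 32 >= 26
print(symbols_needed(26, 3))  # -> 3 trits, since 3**3 = 27 >= 26
```

The unit changes the numbers but not the underlying quantity: $\log_3 26 = \log_2 26 / \log_2 3$, so bits and trits differ only by a constant factor.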