If we have a system which can be in 16 possible states, then we need 4 bits of information to encode which of the states the system is in. The entropy of the system is $\log_2 {16}=4.$
Is there anything fundamental about using bits and log base 2? For example, if our method of encoding information allowed us to specify information like $(a,b)$ where $a, b \in \{0,1,2,3\},$ then we would only need two such "quads".
Of course, switching the base of the logarithm only changes the "amount of information" by a constant factor, so I imagine most of the math will be the same. Again, is there any reason one base is more "fundamental" and more precisely answers the question "how much information is needed to specify the state?"
There is nothing fundamental about using bits and log base 2, just as there is nothing fundamental about measuring distance in meters instead of feet. The two situations are closely analogous: using quads, or decimal digits, is a simple change of units, just like using feet instead of meters. One meter is 3.28 feet; to convert a measurement in feet to one in meters, we divide by 3.28. Likewise, one decimal digit carries $\log_2{10} \approx 3.32$ times as much information as one bit; to convert a measurement in bits to one in decimal digits, we divide by 3.32.
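As a quick numerical sanity check (a minimal Python sketch, not part of the original answer), the change-of-base rule $\log_b{x} = \log_2{x} / \log_2{b}$ reproduces the conversions above:

```python
import math

states = 16

# Entropy of a 16-state system in bits (base 2) and in "quads" (base 4).
bits = math.log2(states)        # 4 bits
quads = math.log(states, 4)     # 2 quads

# Change of base: dividing bits by log2(4) = 2 gives quads.
assert math.isclose(quads, bits / math.log2(4))

# One decimal digit carries log2(10) ≈ 3.32 bits, so converting
# bits to decimal digits means dividing by that factor.
digits = bits / math.log2(10)
print(f"{bits} bits = {quads} quads = {digits:.3f} decimal digits")
```

The same divide-by-a-constant pattern works for any pair of bases, which is why the choice of base is purely a choice of units.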