Unlike integers, decimal fractions cannot always be represented exactly in binary. So what is the procedure for finding how many bits are sufficient to express a given fraction in binary?
Basically, I am trying to understand, for a quantity with a given range and precision, how many bits are needed at minimum to store it. This is simple for integers, but appears more complex for fractions. I am talking about fixed-point representation.
I write VHDL, so I decide how many bits to use.
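One common way to frame this for fixed point: choose the smallest number of fractional bits n such that the step size 2**-n is no larger than the precision you need. A minimal sketch in Python (the function name and the resolution-based criterion are my own illustration, not from the question):

```python
import math

def frac_bits_needed(resolution):
    """Minimum number of fractional bits n so that the fixed-point
    step size 2**-n is no larger than the required resolution."""
    return math.ceil(-math.log2(resolution))

# To resolve steps of 0.01 you need 7 fractional bits, since
# 2**-7 = 0.0078125 <= 0.01 but 2**-6 = 0.015625 > 0.01.
print(frac_bits_needed(0.01))  # -> 7
print(frac_bits_needed(0.5))   # -> 1
```

The integer part is sized separately from the range, exactly as you would for a plain integer, and the two fields together give the total fixed-point width.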
Fractions can be stored in single- or double-precision floating-point format, so the number of bits used for their representation can vary:
Single precision (float): 32 bits -> 1 bit for the sign, 8 for the exponent, and 23 for the fraction.
Double precision: 64 bits -> 1 bit for the sign, 11 for the exponent, and 52 for the fraction.
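You can inspect these fields directly by reinterpreting a value's bit pattern. A small Python sketch (the helper name `float32_fields` is my own) that extracts the sign, exponent, and fraction of a single-precision encoding:

```python
import struct

def float32_fields(x):
    """Split a value's IEEE-754 single-precision encoding into
    sign (1 bit), biased exponent (8 bits), fraction (23 bits)."""
    bits = struct.unpack('>I', struct.pack('>f', x))[0]
    sign = bits >> 31
    exponent = (bits >> 23) & 0xFF
    fraction = bits & 0x7FFFFF
    return sign, exponent, fraction

# 0.5 = +1.0 * 2**-1: sign 0, biased exponent -1 + 127 = 126, fraction 0
print(float32_fields(0.5))  # -> (0, 126, 0)
```

The same decomposition works for double precision with a 64-bit pattern (`'>Q'`/`'>d'`), an 11-bit exponent field, and a 52-bit fraction.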