Why do 1000.5, 1/16 and 1.5/32 have an exact representation in an arbitrary (finite) normalized binary floating point number system but 123.4, 0.025 and 1/10 don't? How can this easily be seen without constructing the complete floating point number?
Exact representation of floating point numbers
4.3k Views · Asked by Bumbble Comm (https://math.techqa.club/user/bumbble-comm/detail)
There are 4 answers below.
On
The numbers that can be represented with a finite binary floating point representation are called the dyadic rationals. They comprise all numbers that can be written in the form $\frac{i}{2^j}$ where $i$ and $j$ are integers and $j \ge 0$. Numbers such as $123.4 = \frac{1234}{10} = \frac{617}{5}$ cannot be written in this form, because the denominator in lowest terms contains a prime factor other than $2$.
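The criterion above can be checked mechanically. The following is a minimal sketch, assuming Python's standard `fractions` module; the helper name `is_dyadic` is made up for illustration. It reduces each number to lowest terms and tests whether the denominator is a power of $2$.

```python
from fractions import Fraction

def is_dyadic(q: Fraction) -> bool:
    """True if q, in lowest terms, has a power-of-two denominator."""
    d = q.denominator           # Fraction always stores lowest terms, d >= 1
    return d & (d - 1) == 0     # bit trick: true exactly for d = 1, 2, 4, 8, ...

# The six numbers from the question, entered exactly (no binary floats involved).
examples = {
    "1000.5": Fraction("1000.5"),
    "1/16": Fraction(1, 16),
    "1.5/32": Fraction("1.5") / 32,
    "123.4": Fraction("123.4"),
    "0.025": Fraction("0.025"),
    "1/10": Fraction(1, 10),
}

for label, q in examples.items():
    print(f"{label} = {q}: dyadic = {is_dyadic(q)}")
```

Note that this only tests the denominator; it says nothing about whether the numerator fits into a particular mantissa width.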
On
Another way of stating what the other answers already have is that the numbers with an exact floating point representation are those with a terminating expansion in base $2$ (the binary analogue of a terminating decimal).
So, $1.5 = 1.1_2$, and $1.875 = 1.111_2$, but $1/10 = 0.00011001100110011 ... _2$.
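To see these expansions emerge digit by digit, here is a rough sketch that repeatedly doubles the fractional part and peels off the integer bit. The function name `binary_expansion` and the `max_bits` cutoff are illustrative choices, and the sketch assumes a non-negative input.

```python
from fractions import Fraction

def binary_expansion(q: Fraction, max_bits: int = 17) -> str:
    """Binary expansion of a non-negative rational, cut off after max_bits fractional bits."""
    whole = q.numerator // q.denominator
    frac = q - whole                      # exact fractional part, 0 <= frac < 1
    bits = []
    while frac != 0 and len(bits) < max_bits:
        frac *= 2                         # shift the binary point one place right
        bit = 1 if frac >= 1 else 0       # the bit that moved in front of the point
        bits.append(str(bit))
        frac -= bit
    if not bits:
        return f"{whole:b}"               # the number is an integer
    tail = "..." if frac != 0 else ""     # "..." marks a cut-off (non-terminating) expansion
    return f"{whole:b}." + "".join(bits) + tail

print(binary_expansion(Fraction(3, 2)))    # 1.1
print(binary_expansion(Fraction(15, 8)))   # 1.111
print(binary_expansion(Fraction(1, 10)))   # 0.00011001100110011...
```

For $1/10$ the remainder cycles through $1/5, 2/5, 4/5, 3/5$ and never reaches zero, which is why the block `0011` repeats forever.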
On
Besides the number being a dyadic rational, it is necessary that its binary representation use no more bits than the floating point mantissa provides. $1+2^{-64}$ has a denominator that is a power of $2$ but requires $65$ bits to represent. Unless the mantissa has at least $65$ bits (which means words longer than $64$ bits), it will be rounded to exactly $1$. All your examples have relatively short binary representations, the longest being $1000.5_{10}=1111101000.1_2$, requiring $11$ bits. $32$-bit (IEEE 754 single precision) floating point numbers have $24$ bits of mantissa precision. The standard $64$-bit (double precision) format has $53$ bits.
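The mantissa limit can be observed directly. The sketch below assumes that Python's `float` is an IEEE 754 double with a $53$-bit mantissa (true on common platforms, but not guaranteed by the language); `is_exact_double` is a hypothetical helper name. It converts an exact rational to `float` and back and checks whether anything was lost.

```python
import sys
from fractions import Fraction

# Mantissa width of this platform's float; 53 where float is an IEEE 754 double.
print(sys.float_info.mant_dig)

def is_exact_double(q: Fraction) -> bool:
    """True if q survives a round trip through float unchanged."""
    return Fraction(float(q)) == q

tiny = Fraction(1) + Fraction(1, 2**64)      # dyadic, but needs 65 bits
print(is_exact_double(tiny))                 # False with a 53-bit mantissa
print(float(tiny) == 1.0)                    # True: rounded to exactly 1

print(is_exact_double(Fraction(1) + Fraction(1, 2**52)))  # True: needs exactly 53 bits
print(is_exact_double(Fraction("1000.5")))                # True: needs only 11 bits
print(is_exact_double(Fraction(1, 10)))                   # False: not dyadic at all
```

The round-trip test combines both conditions (dyadic denominator and short enough mantissa) for this one specific format.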
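Written as fractions in lowest terms, the numbers that have a finite binary representation are exactly those whose denominator is a power of $2$.
So
$$1000.5 = \frac{2001}{2}, \qquad \frac{1}{16} = \frac{1}{2^4}, \qquad \frac{1.5}{32} = \frac{3}{64} = \frac{3}{2^6}$$
while
$$123.4 = \frac{617}{5}, \qquad 0.025 = \frac{1}{40} = \frac{1}{2^3 \times 5}, \qquad \frac{1}{10} = \frac{1}{2 \times 5}$$
all having non-powers of $2$ in the denominator.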
By comparison, for a decimal fraction to have a finite representation, the denominator of the fraction in lowest terms must be a power of $2$ times a power of $5$, since the prime factorisation of $10$ is $2 \times 5$.
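Both rules are instances of one test: a fraction in lowest terms terminates in a given base exactly when every prime factor of its denominator divides the base. A small sketch of that test follows; the name `terminates_in_base` is an illustrative choice, not a library function.

```python
from fractions import Fraction
from math import gcd

def terminates_in_base(q: Fraction, base: int) -> bool:
    """True if q has a finite expansion in the given base (base >= 2)."""
    d = q.denominator                 # lowest terms, d >= 1
    g = gcd(d, base)
    while g > 1:                      # strip every factor d shares with the base
        while d % g == 0:
            d //= g
        g = gcd(d, base)
    return d == 1                     # nothing left means only the base's primes occurred

for q in [Fraction(1, 40), Fraction(3, 64), Fraction(1, 3)]:
    print(q, "base 2:", terminates_in_base(q, 2), "base 10:", terminates_in_base(q, 10))
```

With base $2$ this reduces to the power-of-two test, and with base $10$ to the "power of $2$ times a power of $5$" rule.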