Converting $\frac{2}{7}$ to a binary number in a $32$ bit computer


I want to convert $\frac{2}{7}$ to a binary number in a $32$ bit computer. That is, $1$ bit is assigned to the sign of the number, $8$ bits are assigned to the exponent, and $23$ bits are assigned to the mantissa.

So $x = \pm q \times 2^{m}$ where $\frac{1}{2} \leq q < 1$ (if $x \neq 0$) and $m = e - 127$ is an integer. Suppose the leading binary digit $1$ is shifted just to the left of the binary point. In this case, the representation would be $q = (1.f)_{2}$ and $1 \leq q < 2$. So in effect, the machine has a $24$-bit mantissa.

The binary representation of $\frac{2}{7}$ is $\left ( 0.010 \overline{010} \right )_{2}$. In normalized notation, this is $ \left ( 0.10\overline{010} \right )_{2} \times 2^{-1}$.

I want to write out fully what this number would look like in the $32$-bit computer. So, I should write out $24$ bits for the mantissa.

$$x = \left ( 0.\underbrace{10010010010010010010010}_{23 \text{ bits}}\underbrace{\_}_{24\text{'th bit}} \right )_{2} \times 2^{-1}$$

For the $24$th bit, do I put a $0$? There is not enough room for the entire $3$-period of $\overline{010}$, so what do I do?


There are 2 best solutions below

4
On BEST ANSWER

There's no particular reason why your three-period has to be written $010.$ You have multiple choices, depending on where you choose to start looking for a repeating block. In particular, $$ 0.010\overline{010}_2 = 0.01\overline{001}_2 = 0.0\overline{100}_2. $$
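As a check, each of these forms sums to $\frac27$ as a geometric series with ratio $2^{-3} = \frac18$ (note that $0.010\overline{010}_2$ is the same string of digits as $0.\overline{010}_2$):

$$\begin{aligned}
0.\overline{010}_2 &= \frac{2^{-2}}{1-2^{-3}} = \frac14\cdot\frac87 = \frac27,\\
0.01\overline{001}_2 &= \frac14 + \frac14\cdot\frac{2^{-3}}{1-2^{-3}} = \frac14 + \frac14\cdot\frac17 = \frac27,\\
0.0\overline{100}_2 &= \frac12\cdot\frac{2^{-1}}{1-2^{-3}} = \frac12\cdot\frac47 = \frac27.
\end{aligned}$$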

And of course even if you did end up with three available bits at the end of the computer word, allowing you to write one copy of your three-period there, you would still have only an approximation, because all the other three-periods (there are infinitely many) don't fit in that space.

So you have to find out how the computer is set up to do rounding of floating-point numbers in this case. Using the default rule for IEEE-754 binary (round to nearest, ties to even--see the other answer), you can start to figure out which way to round (up to $1$ or down to $0$) by looking at the value of the binary digits that don't fit in the IEEE format. The least significant bit in the single-precision representation of $\frac27$ has place value $2^{-25},$ and the bits to the right of it have value $$ 0.10\overline{010} \times 2^{-25}. $$ Since this is greater than $\frac12 \times 2^{-25},$ you round up. (If it were less, you would round down, and if it were exactly equal you would look at the digit with place value $2^{-25}$ to figure out which way the "round to even" rule goes.)

By the way, notice that the IEEE-754 single-precision representation of $\frac27,$ as demonstrated in the other answer, has exponent bits $01111101,$ implying that $$m = e - 127 = 01111101_2 - 127 = -2,$$ not $-1$ as you seem to be assuming when you write, "$x = \pm q \times 2^{m}$ where $\frac{1}{2} \leq q < 1$ (if $x \neq 0$) and $m = e - 127$ is an integer."

3
On

C program:

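#include <stdio.h>
#include <stdint.h>
#include <string.h>

int main(void){
	float f = 2.0/7;
	uint32_t i;
	memcpy(&i, &f, sizeof i);	// copy the bit pattern; a pointer cast would break strict aliasing
	for(int a=31;a>=0;a--){
		printf("%u",(unsigned)((i>>a)&1));	// print bits from most to least significant
	}
	putchar('\n');
	return 0;
}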


This gives 00111110100100100100100100100101:

$$\underbrace{\color{blue}{0}}_{+}\underbrace{\color{green}{01111101}}_{2^{125-127}}\underbrace{\color{red}{00100100100100100100101}}_{\times\color{grey}{1.}00100100100100100100101}$$

So, the mantissa is 1.00100100100100100100101 (first 1 implicit).

$$\cdots010010\underline0\color{grey}{10010\cdots} \mapsto \cdots010010\underline1$$

Rounding to the nearest representable number.


The official standard is behind a paywall, so I can only quote a secondary source:

IEEE Standard 754 Floating Point Numbers, Steve Hollasch, 2015 Dec 2:

Algebraic operations covered by IEEE 754, namely + , - , · , / , √ and Binary <-> Decimal Conversion with rare exceptions, must be Correctly Rounded to the precision of the operation’s destination unless the programmer has specified a rounding other than the default. If it does not Overflow, a correctly rounded operation’s error cannot exceed half the gap between adjacent floating-point numbers astride the operation’s ideal ( unrounded ) result.

(Emphasis mine.)

A tertiary source:

IEEE 754, Wikipedia:

The standard defines five rounding rules. [...]

Round to nearest, ties to even – rounds to the nearest value; if the number falls midway it is rounded to the nearest value with an even (zero) least significant bit; this is the default for binary floating-point and the recommended default for decimal.

(Emphasis mine.)