When to discard 1 in binary substraction

897 Views Asked by At

When I want to add -9 + (-4) this is what should be done:

(-9) 1 0111
(-4) 1 1100
------------
1*   1 0011

The 1* should be discarded. Sign bit is 1, so result is negative. 10011 gives us -5 so its correct.

Now lets calculate -21 + (-22):

(-21) 1 01011
(-22) 1 01010
--------------
1*    0 10101

And again, 1* should be discarded. But now, the result is wrong. Sign bit is 0, so the result is not negative, but it should be.

If we treat discarded 1* as a sign bit the result is correct. But why we shouldn't discard 1* this time?

2

There are 2 best solutions below

2
On

Two's complement of $-43$ is $1010101$, so you're calculating things correctly.

You can't fit $43$ into five bits.

0
On

The first example

We will use eight bits, we'll use the first one to keep track of a sign. (Zero = positive numbers, one = negative numbers.) I will call this the sign flag (SF). (Maybe in computer assembly language, "sign bit" might be a more common term? I don't recall offhand.)

So to represent 9, you are using the two's compliment. Since positive nine would be:
SF=0 000 1001

and since transforming to two's complement is a two step process, you start by inverting the sign bit and the main bits, you get:

SF=1 111 0110
and then you add one to get the full two's compliment notation, you add one, so you get:
SF=1 111 0111

So, negative nine, written in two's complement notation, is 111 0111.

This is essentially the same as what you wrote. For negative numbers, leading ones are generally able to be simply ignored in this notation, for much the same reason that we ignore leading zeros in our "normal" notation. So it doesn't matter how many leading ones we have. If we were dealing with 16-bit numbers, it would look like:
SF=1 111 1111 1111 0111

But let's get back to the eight-bit example to keep our numbers looking smaller (and, thereby simpler).

Using the same process, positive 4 would be:
SF=0 000 0100

and negative 4, after performing the two's complement method, would be:

SF=1 111 1100

Now, when we add -9...
S=1 111 0111

to -4...
S=1 111 1100

Binary math (ignoring the flags) shows that 00111 1100 +0111 1100 =1111 0011

So the answer is 1111 0011. We ended up with an answer that has another bit. However, that's fine, because we're dealing with negative numbers in two's complement notation, so we can toss out all the leading ones. Essentially, this is the same as 0011.

So the result ends up looking like this:
SF=1 111 0011

What does that equal? The answer is -13. Check it: Take the two's complement of 1111 0011. So flip the bits to get 0000 1100 and then add one = 0000 1101 which is 13.

Note that we aren't discarding any bits besides leading ones of a negative number.

The second example

Take 21, which is:
SF=0 001 0101

Turn it negative using two's complement. First flip the bits: SF=1 110 1010

then add one:

SF=1 110 1011

Likewise, do so for 22. 22 is:
SF=0 001 0110

Flip the bits:
SF=1 110 1001

and add 1:

SF=1 110 1010

Note that if we ignore the leading ones, the important parts of these numbers are 01011 and 01010. (We can NOT ignore the leading zeros, as we are dealing with two's complement notation.)

So, let's see what happens when we do simple binary arithmetic. -21 = 110 1011 -22 = 110 1010 add =1101 0101

So your answer will be:
SF=1 1101 0101

The result (ignoring leading ones) is 010101. That should be -43. Let's check that. Take -43 = 1101 0101 and flip the bits to 0010 1010 then add 1 to get 0010 1011 which is positive 43.

So this proves that -21 -22 = -43 even when "adding" in two's compliment notation. You can see that two's compliment notation works as expected if you have unlimited bits, just like simple arithmetic works as expected when you can use unlimited leading zeros. The math is straightforward, working, with no complications.

Disarding a 1?

You talk about discarding a one. When you do that, it sounds like you are concerning yourself with some memory limitations. Now is when you may start to notice some problems, which is what I think you're really trying to better understand. This does deviate a bit from the topic of how the math works, and get into the topic of how computers are handling the memory. But since I do have some insight on that, I will also address this for you.

Now, since you're dealing with two's compliment notation, I suspect you may be trying to understand things from a computer "assembly programming language" point of view.

One concept for me to introduce is that your processor may be keeping track of more bits than what you've been taught about so far. In the above examples, the processor's overflow flag bit ("OF" bit) and underflow flag bit ("UF" bit) will have always been zero (indicating zero problems). As that's about to change, I will start showing the values of those bits as well. I also introduce a fictitious "OB" which stands for "Other flag bits", which I show as having a value of x. (x could be zero, or one, or a larger number of zeros or ones.) You can safely ignore the OB portion; I am simply pointing out that there are other flag bits, and there's no reason to generally believe that the overflow flag or underflow flag are stored right by the sign bit. (On the other hand, having a sign bit right next to the remaining numeric bits is common.)

A processor using arithmetic instructions that have enough bits will not have a problem. For instance, with an eight-bit processor (using one bit for a sign, and seven other bits) would take the mathematical results,
OF=0, UF=0, OB=x
SF=1 1101 0101

and toss out enough unnecessary leading ones to make the answer sufficiently small, and end up with:
OF=0, UF=0, OB=x
SF=1 101 0101

This method won't cause any problems for this particular problem as long as there are at least six bits for the numeric portion (not counting the sign bit).

So, what happens if we do exceed the available bits? I could show another example that exceeds seven bits, but to keep things briefer, let's just see how this would be handled by a theoretical processor that had arithmetic instructions that used six bits: one for a sign bit, and five more bits for the numeric value.

However, what would a five-bit processor do? It would see the result of:
OF=0, UF=0, OB=x
SF=1 1101 0101

Then, the processor needs to try to figure out how to cram the sign bit and the arithmetic results into the small amount of memory available for storing the result. The processor would realize that "010101" cannot fit into just five bits. One possible way a processor might handle this is to just use the right-most six bits, with one of those bits going into the sign bit, and the remaining five bits getting stored into the rest of the numerical result. The result would probably look something more like:
OF=0, UF=1, OB=x
SF=0 1 0101

Translated into English, this would look like: positive 21, but don't just blindly trust this answer completely, because there was an underflow condition! As a programmer, you could just ignore the underflow flag, but as you can see, doing so may be at your own peril.

Sometimes "assembly programming language" class instructors may not provide you with the complexities of some error handling until later, so you might not have yet learned about handling the overflow and underflow flags. In short, you should only trust the results if those flags are set to zero. (There are other flag bits that a real processor is likely to have, such as a "carry flag" which might also be relevant. There may also be naming differences: the "underflow flag bit" might be abbreviated "UE" (presumably "underflow error"). Some of the precise details likely vary based on hardware architecture, and possibly some may even vary based on which assembler software is used. A student in such a class may eventually become exposed to some more of those flags, but that continues to get off the topic of how the math works, so I won't elaborate further here.)