Why is $0.0180 = 0.0180$ false in MATLAB?

I am trying to write a small script in MATLAB. It loads data from a .txt file into memory. The data comes in several columns, and the script needs to work out how many there are. The data in the .txt file looks like this:

0.002   -0.224166870117    -0.021419727823  0.288848876953   
0.004   -0.224166870117    -0.021419727823  0.288848876953   
0.006   -0.224166870117    -0.021419727823  0.288848876953   
0.008   -0.174880981445    -0.0369136329737 0.280456542969   
0.01    -0.0822601318359   -0.0530614162946 0.273284912109   
0.012    0.0523986816406   -0.0658726954037 0.26481628418   
0.014    0.165390014648    -0.0715291356038 0.258865356445  
0.016    0.187057495117    -0.0682274548078 0.252838134766   
0.018    0.106491088867    -0.0576325433542 0.245590209961   
0.02    -0.0281677246094   -0.044342847708  0.239562988281  

My script looks like this:

function [dat units]=CheckColumns(filename)
fid=fopen(filename,'r');
tline1 = fgetl(fid); tline2 = fgetl(fid); tline3 = fgetl(fid); tline4 = fgetl(fid);    
tline5 = fgetl(fid); tline6 = fgetl(fid); tline7 = fgetl(fid); tline8 = fgetl(fid);    
data=fscanf(fid,'%f',[1,inf]);
fundet = false;
for i = 1:100 
    if (data(i) == (data(1)*2))
        for p = 2:10
           if  (data(1 + ((i-1)*p)) == (data(1)*(p+1)))
               fundet = true;
           else
               fundet = false;
               break
           end
        end 
        if fundet == true
            count = i
            break
        end
    end
end
fclose(fid);
units=tline6;
dat=count;

First I look for the value that is 2 times the first value (0.002), which marks the start of the next row and therefore tells me how many columns the data holds. To be sure the match is not just a coincidence with the other data, I check another 9 times. It all works fine right up to the point where it fails (8th check), where 0.0180 = 0.0180 is evaluated as false. What?! I have tried to run it with other data and got the same mistake (3rd check, 0.3000 = 0.3000 false).

I am quite new to MATLAB, so I must be overlooking something, but what? Why is 0.0180 = 0.0180 evaluated as false? Does it suddenly treat the value as a string, or is it something else?

There are 2 best solutions below

10
On BEST ANSWER

This is an important lesson for everyone in programming. When comparing floating point numbers, it is very risky to simply check with an equality (==).

This is because computers store floating point numbers in binary with a fixed number of bits, so most decimal fractions (such as 0.002 or 0.018) are stored only approximately.

Instead, one way to check that two numbers are equal is to check that they're sufficiently close, e.g. $a = b$ if $$|a-b| < \varepsilon_\mathrm{tol} $$

where $\varepsilon_\mathrm{tol}$ is some small but positive number, indicating the level of tolerance you accept before two numbers are considered equal. For example, you might set $\varepsilon_\mathrm{tol} = 0.000001$.

So instead of writing a == b, write abs(a-b) < 0.000001. You can change the value of $\varepsilon_\mathrm{tol}$ if it is too stringent, for example a very "loose" equality would be abs(a-b)<0.05. For your problem, it is probably fine to set $\varepsilon_{\mathrm{tol}} = 0.000001$ or something like that.
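A minimal sketch of the tolerance check, written in Python rather than MATLAB (both use IEEE 754 double precision by default, so the same values arise). The helper name `nearly_equal` and the default tolerance are illustrative assumptions, not part of the original script.

```python
def nearly_equal(a, b, tol=1e-6):
    """Return True when a and b differ by less than tol (absolute tolerance)."""
    return abs(a - b) < tol

first = 0.002          # first time stamp in the data file
product = first * 9    # what the script computes for the failing check
stored = 0.018         # the value actually read from the file

exact_match = (product == stored)            # strict equality on doubles
close_match = nearly_equal(product, stored)  # tolerance-based comparison

print("exact:", exact_match, "within tolerance:", close_match)
```

In the MATLAB script the same idea amounts to replacing each `==` test with `abs(a-b) < tol`, as described above.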

0
On

This is a common trap that many beginner programmers fall into. I disagree that you should be ashamed; unknown unknowns will always get you.

Nearly all programming languages use binary floating point numbers as their default representation for non-integer numbers. This is at least in part a self-fulfilling prophecy: CPU vendors provide what languages demand, and languages use what CPU vendors provide.

Unfortunately, many numbers that can be written as a finite decimal fraction cannot be written as a finite-length binary fraction or, by extension, as a binary floating point number (these are a subset of the finite-length binary fractions). A finite-length decimal fraction can represent numbers of the form (with $a$ an integer and $b$, $c$ non-negative integers):

$$x = \frac{a}{2^b5^c}$$

A finite-length binary fraction, by contrast, can only represent numbers of the form:

$$x = \frac{a}{2^b}$$
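The two forms above can be checked mechanically by reducing a number to lowest terms and inspecting the prime factors of its denominator. The following is a small sketch using Python's `fractions.Fraction`; the helper names `strip_factor`, `is_finite_decimal` and `is_finite_binary` are made up for illustration.

```python
from fractions import Fraction

def strip_factor(n, p):
    """Divide n by p until p no longer divides it."""
    while n % p == 0:
        n //= p
    return n

def is_finite_decimal(x):
    """True if the reduced denominator has only 2 and 5 as prime factors."""
    return strip_factor(strip_factor(x.denominator, 2), 5) == 1

def is_finite_binary(x):
    """True if the reduced denominator is a power of two."""
    return strip_factor(x.denominator, 2) == 1

for text in ["0.5", "0.375", "0.002", "0.018"]:
    x = Fraction(text)  # exact rational value of the decimal string
    print(text, "=", x, "| finite decimal:", is_finite_decimal(x),
          "| finite binary:", is_finite_binary(x))
```

Here 0.002 reduces to 1/500 and 0.018 to 9/500. Since 500 contains factors of 5, neither has a finite binary expansion, so both must be rounded when stored as floating point numbers.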

Most programming languages paper over this deficiency by rounding numbers for display. So much of the time nice decimal numbers go in, nice decimal numbers come out. It's easy to be fooled into thinking that you are doing decimal arithmetic.

But you aren't doing decimal arithmetic. You are doing binary arithmetic with a limited number of significant digits. This means that both your initial input of decimal numbers and your arithmetic operations are subject to rounding errors.

Sometimes you get lucky: those rounding errors match up and your comparison says the values are equal. Other times you don't.
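To see which multiples happen to match, here is a short Python sketch that repeats the comparison from the question for each multiplier. It assumes the first time stamp is 0.002 and uses `(2 * k) / 1000` as a stand-in for "the value parsed from the file", since dividing two exactly representable integers gives the double nearest to the true decimal value.

```python
first = 0.002  # first time stamp, already rounded to the nearest double

# For each multiplier k, compare the rounded product with the double
# nearest to the exact decimal 0.002 * k.
results = {}
for k in range(2, 11):
    product = first * k           # rounded once on input, once on multiply
    parsed = (2 * k) / 1000       # nearest double to the exact decimal
    results[k] = (product == parsed)
    print(k, repr(product), repr(parsed), results[k])
```

Multiplying by a power of two only changes the exponent, so those cases always match. The multiplier 9 is the one that fails for this data.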

Taking your example and using the Fraction type in Python to examine the exact values of the floating point numbers, we can build up a picture of what is happening.

0.002 is approximated as $\frac{1152921504606847}{2^{59}} = 0.002 + \frac{3}{125 \cdot 2^{59}}$

Multiplying that by 9 exactly would produce $\frac{10376293541461623}{2^{59}} = 0.002 \cdot 9 + \frac{27}{125 \cdot 2^{59}}$. Unfortunately we don't have enough bits to represent that, so the result is rounded to $\frac{1297036692682703}{2^{56}} = 0.018 + \frac{19}{125 \cdot 2^{56}}$

Meanwhile, converting 0.018 to floating point results in a value of $\frac{5188146770730811}{2^{58}} = 0.018 - \frac{49}{125 \cdot 2^{58}}$
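The values above can be reproduced with a few lines of Python. This is a sketch of the kind of inspection described, not necessarily the exact commands used for this answer; the variable names are illustrative.

```python
from fractions import Fraction

approx_first = Fraction(0.002)   # exact value of the double nearest 0.002
product = Fraction(0.002 * 9)    # exact value after the rounded multiplication
stored = Fraction(0.018)         # exact value of the double nearest 0.018

for name, value in [("0.002", approx_first),
                    ("0.002 * 9", product),
                    ("0.018", stored)]:
    # Denominators of doubles are powers of two, so report the exponent.
    exponent = value.denominator.bit_length() - 1
    print(name, "=", value.numerator, "/ 2 **", exponent)

gap = product - stored  # exact difference between the two "0.018" values
print("gap =", gap)
```

The gap works out to exactly $2^{-58}$, which is one unit in the last place for numbers of this size. That single bit is enough to make `==` return false.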


So now that we understand the problem, what can we do about it? We basically have two options.

  1. Use something other than floating point math. I don't know what, if anything, MATLAB offers in this regard, and whatever is offered is likely to be slower than floating point.
  2. Set an "epsilon" where two values that are "close enough" are considered to be equal. The tricky bit here can be coming up with the correct epsilon: too large and you risk letting things compare equal when they shouldn't, too small and you risk things failing to compare equal when they should. Remember that as your numbers get larger, so do the rounding errors inherent in manipulating them.
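Both options can be sketched in Python. `fractions.Fraction` and `math.isclose` stand in here for whatever MATLAB provides, and the tolerance values are assumptions chosen for illustration only.

```python
import math
from fractions import Fraction

# Option 1: exact rational arithmetic, built from the decimal text in the
# file rather than from already-rounded doubles.
exact_equal = Fraction("0.002") * 9 == Fraction("0.018")

# Option 2a: a fixed absolute epsilon works for values of this size...
small_ok = abs(0.002 * 9 - 0.018) < 1e-6

# ...but fails for large values, where adjacent doubles are far apart.
a = 1e16
b = a + 2.0                      # the next representable double after 1e16
large_abs_ok = abs(a - b) < 1e-6

# Option 2b: a relative tolerance scales the allowed gap with the magnitude.
large_rel_ok = math.isclose(a, b, rel_tol=1e-9)

print(exact_equal, small_ok, large_abs_ok, large_rel_ok)
```

The last two results illustrate the closing remark in option 2: an epsilon that suits time stamps around 0.01 is far too strict for numbers around $10^{16}$, where neighbouring doubles differ by 2.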