Finding original equation from rounded values


So I have an interesting problem. Suppose you have a formula $$y=2.41\times x+3.85$$ and you use it to compute a table of values for $y$, with $x$ ranging from $0$ to $20$. You then round the $y$ values to the nearest $0.5$.

My question is, if you are only given the rounded values for y, and the corresponding values for x, is it possible to recover the original formula used to generate the values?

I figured that if you plot the data and calculate a linear regression you'd still end up with the original formula. However, this is not the case: when I tried this example problem in Excel, the regression was $$y=2.3989x+3.8368$$ which is close but not perfect. So I'm not sure whether there is a mathematical approach to this, perhaps using MATLAB or similar, or whether it's simply not possible.


There are 3 best solutions below

0
On

Without a proof, I am going to guess that no, it is not possible.

Since you are rounding, there is an entire family of lines from which the rounded data could be derived. For instance, if one of your points is (5,1), then the original point could be (4.9,0.9) or (4.89,0.98), etc.

The problem is thus underdetermined.
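To make the "family of lines" concrete, here is a minimal sketch using the question's setup ($x = 0, 1, \dots, 20$, $y$ rounded to the nearest $0.5$). The candidate lines are picked purely for illustration, and the tie-breaking rule (halves round up) is an assumption, although no ties occur for these inputs. Exact fractions are used so that floating-point noise cannot change a rounding decision.

```python
from fractions import Fraction
from math import floor

def round_half(y):
    """Round to the nearest multiple of 0.5 (ties go up; an assumed convention)."""
    return Fraction(floor(2 * y + Fraction(1, 2)), 2)

def table(slope, intercept):
    """Rounded y values for x = 0, 1, ..., 20, given decimal strings."""
    m, c = Fraction(slope), Fraction(intercept)
    return [round_half(m * x + c) for x in range(21)]

original = table("2.41", "3.85")

# Illustrative candidates, not taken from the thread.
same_with_shifted_intercept = table("2.41", "3.86") == original
same_with_shifted_slope = table("2.411", "3.85") == original
same_with_larger_slope = table("2.42", "3.85") == original

print(same_with_shifted_intercept, same_with_shifted_slope, same_with_larger_slope)
# prints: True True False
```

So at least two other lines produce exactly the same rounded table, while a line that is far enough away (slope $2.42$) does not.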

0
On

Short answer, no, unless you're very lucky.

Look at it this way: suppose you have a linear relation of the form $y=ax+b$ and plot $n$ points $(x,y)$ satisfying the relation in the x-y plane. Choose even one such point $(x',y')$ (much less all $n$), and "wobble" it by moving the $y'$ coordinate to the closest real number of the form $y''=m/2$, with $m$ an integer. There are infinitely many values of $y'$ which result in the same $y''$, hence the operation of 'wobbling' is not invertible.
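The non-invertibility can be quantified. Each rounded value $y''$ only tells you that $y''-\tfrac14 \le ax+b < y''+\tfrac14$, so for a fixed slope the data pin the intercept to an interval rather than a point. The sketch below assumes the question's line and $x = 0,\dots,20$, holds the slope at $a = 2.41$ (an assumption made for illustration; in general both $a$ and $b$ are free), and assumes ties round up.

```python
from fractions import Fraction
from math import floor

QUARTER = Fraction(1, 4)  # half of the rounding step 0.5

def round_half(y):
    """Round to the nearest multiple of 0.5 (ties go up; an assumed convention)."""
    return Fraction(floor(2 * y + Fraction(1, 2)), 2)

a = Fraction("2.41")  # slope held fixed for this illustration
xs = range(21)
rounded = [round_half(a * x + Fraction("3.85")) for x in xs]

# Every point contributes the constraint  r - 1/4 <= a*x + b < r + 1/4.
b_low = max(r - QUARTER - a * x for x, r in zip(xs, rounded))
b_high = min(r + QUARTER - a * x for x, r in zip(xs, rounded))

print(float(b_low), float(b_high))
# prints: 3.84 3.87
```

Any intercept with $3.84 \le b < 3.87$ gives the same rounded table at this slope, so the rounded data narrow the line down considerably but cannot single it out.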

0
On

As said in previous answers, this does not seem to be possible.

However, you could arrive at something closer if you try to minimize $$F=\int_0^{20}\left(a+bx-\text{Round}\left[\frac{241 x+385}{100},\frac{1}{2}\right] \right)^2\,dx$$ which would correspond to a linear regression based on an infinite number of data points.

This leads to $$F=\frac{20 \left(174243 a^2+3484860 a b-9740256 a+23232400 b^2-125398200 b+169860174\right)}{174243}$$ and, setting $F'_a=0$ and $F'_b=0$, this reduces to solving $$348486 a+3484860 b-9740256=0$$ $$348486 a+4646480 b-12539820=0$$ whose solutions are $$a=\frac{223594}{58081}\approx 3.84969$$ $$b=\frac{699891}{290405}\approx 2.41005$$ Amazing, isn't it?
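As a quick check of the last step, the two linear equations can be solved exactly with rational arithmetic. This is only a sketch of the final solve (Cramer's rule on the coefficients quoted above); it does not recompute the integral itself.

```python
from fractions import Fraction

# Coefficients copied from the two equations above, written as p*a + q*b = r.
p1, q1, r1 = 348486, 3484860, 9740256
p2, q2, r2 = 348486, 4646480, 12539820

# Cramer's rule for the 2x2 system.
det = p1 * q2 - p2 * q1
a = Fraction(r1 * q2 - r2 * q1, det)
b = Fraction(p1 * r2 - p2 * r1, det)

print(a, float(a))
print(b, float(b))
```

The exact fractions agree with the values of $a$ and $b$ stated above.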

Edit

Rounding to the nearest integer, $$F=\int_0^{20}\left(a+bx-\text{Round}\left[\frac{241 x+385}{100},1\right] \right)^2\,dx$$ would lead to $$a=\frac{223369}{58081}\approx 3.84582$$ $$b=\frac{1400007}{580810}\approx 2.41044$$

Similarly, using $$F=\int_0^{20}\left(a+bx-\text{Round}\left[\frac{241 x+385}{100},2\right] \right)^2\,dx$$ would lead to $$a=\frac{222469}{58081}\approx 3.83032$$ $$b=\frac{1400907}{580810}\approx 2.41199$$

If, instead, I use linear regression with a finite number of equally spaced data points for $0 \leq x \leq 20$, I get the following results: $$\Delta x=1.0000\implies y=2.40455 x+3.90693$$ $$\Delta x=0.1000\implies y=2.40957 x+3.85456$$ $$\Delta x=0.0100\implies y=2.41010 x+3.84904$$ $$\Delta x=0.0010\implies y=2.41004 x+3.84979$$ $$\Delta x=0.0001\implies y=2.41005 x+3.84970$$
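For anyone who wants to reproduce these numbers, here is a minimal sketch of the finite-sample regression in plain Python. The function name `fit_rounded` and the `steps` parameter are my own choices for illustration, ties are assumed to round up, and exact fractions are used so the rounding is never affected by floating-point error. With `steps = 20` the spacing is $\Delta x = 1$.

```python
from fractions import Fraction
from math import floor

def round_half(y):
    """Round to the nearest multiple of 0.5 (ties go up; an assumed convention)."""
    return Fraction(floor(2 * y + Fraction(1, 2)), 2)

def fit_rounded(steps):
    """Least-squares line through rounded samples spaced 20/steps apart on [0, 20]."""
    xs = [Fraction(20 * k, steps) for k in range(steps + 1)]
    ys = [round_half(Fraction(241, 100) * x + Fraction(385, 100)) for x in xs]
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    sxx = sum((x - x_mean) ** 2 for x in xs)
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, y_mean - slope * x_mean

slope, intercept = fit_rounded(20)  # Delta x = 1
print(f"y = {float(slope):.5f} x + {float(intercept):.5f}")
# prints: y = 2.40455 x + 3.90693
```

Calling `fit_rounded(200)`, `fit_rounded(2000)` and so on corresponds to the smaller spacings; the exact arithmetic gets slow for very fine grids, so switching to floats there is a reasonable trade-off.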