Finding a function coefficients from rounded values

88 Views Asked by At

I have a floored quadratic function with unknown coefficients in such form:

$$y=\lfloor a(x+b)^2 \rfloor$$

I also have some (about a thousand) pairs of integer values, for example these:

$\begin{matrix} x & 280 & 281 & 282 & 283 & 284 & 285 & 286 & 287 & 288 & 289 &290 & 291 \\ y & 5026 & 5026 & 5027 & 5027 & 5027 & 5027 & 5028 & 5028 & 5028 & 5028 & 5028 & 5029 \end{matrix}$

How should I approach searching for coefficients $a,b$ ?

I don't have anything near the minimum of $y$, all the values that I have available happen to fall on a rather linear part of the curve.

The original equation was written by human and the correct coefficients almost certainly contain less than 10 significant digits, probably only 3-5. With simpler cases (where $x$'s near $y=0$ were present) it was obvious that I found the intended coefficients by the brevity of those values.

In this case I tried using Excel to put a quadratic trendline through and extract coefficients from the fitted equation but I couldn't manage to find the exact coefficients that would change to next integer at the correct $x$ value.

3

There are 3 best solutions below

3
On

This is by no means a full answer, but these are just my thoughts. Write $f(x) = a(x+b)^2$ to be the original function before you take the floor of its value to give the input, which we will call $g(x) = \lfloor f(x) \rfloor$ .

Perhaps you could try to put it in Excel, but only the points where the value of $y$ changes. I would imagine that all the flat parts is confusing it. For example, you know the real value where $f$ gives $5027$ is between $281$ and $282$. Maybe you could just assign $5027$ to their midpoint since on average it would seem that the change might be in the middle.(?)

Another strategy could be to write $$f(x+1) - f(x) = a(x+b+1)^2 - a(x+b)^2$$ $$ = a (x^2 + 2(b+1)x + (b+1)^2 - x^2 - 2bx - b^2) = a(x + 2b + 1),$$ which grows relatively linearly. And if we had it we could easily find $a$ and $b$. The question is how does $g(x+1) - g(x)$ behave in general and how is it related to $f(x+1) - f(x)$? Because $g$ is not changing much for these very large values and since you say that it seems fairly linear, I would imagine that $a$ is rather small, no matter what $b$ is. I would note that since it seems that $a$ is positive (the number of steps to change from $5027$ is two steps shorter than the number of steps to change from $5028$) and the vertex ($-b$) is less than the area that you are looking at since $g$ is increasing, it seems to me that the best indicator how what $f$ is is at the values of $g$ that are the largest, since there we see the least effect of the rounding down. You might want to try to fit a line to the differences of these values, where you plot a change in the middle of the interval where the jump happened. For instance, for this info, you might include: $(281.5, 5026.5), (285.5, 2027.7), (290.5, 2028.5)$ In fact, for finding $a$ it should not matter if there is a $.5$ or not since the rate of change is not affected by adding $0.5$ to each of the corrdinates on the line. It might matter in the long run though for the quadratic fit for large values of $x$.

If you also look not at neighboring steps but much larger jumps: $f(x + n) - f(x) = a(2nx + 2bn + n^2),$ with an error of $\pm 2$ in the measurement, because of the rounding... but in general the error is likely to be near $1$.

4
On

The method proposed below is not fully justified on the theoretical viewpoint. Nevertheless it seems to give good results in practice.

I suppose that a more thorough study of the statistical theory could improve it.

The data is : $$(x_1\:,\:y_1)\:,\:(x_2\:,\:y_)\:,\:...(x_k\:,\:y_k)\:,\:...(x_n\:,\:y_n)$$ $x_k\,\:y_k$ are integers.

The function to fit is : $$y=\lfloor a(x+b)^2\rfloor$$ $$y\leq a(x+b)^2<y+1$$ $$a(x+b)^2\simeq y+0.5$$ $$Ax+B\simeq \sqrt{y+0.5}\quad \text{with}\quad \begin{cases}A=\sqrt{a}\\B=b\sqrt{a} \end{cases}$$ CALCULUS : $$\begin{bmatrix} A \\ B \end{bmatrix}= \begin{bmatrix} \sum_{k=1}^n x_k^2 & \sum_{k=1}^n x_k \\ \sum_{k=1}^n x_k & n \end{bmatrix}^{-1} \begin{bmatrix} \sum_{k=1}^n x_k \sqrt{y_k+0.5} \\ \sum_{k=1}^n \sqrt{y_k+0.5} \end{bmatrix} $$ $$a=A^2\quad;\quad b=\frac{B}{A}$$ COMPUTED VALUES : $$Y_k=\lfloor a(x+b)^2\rfloor$$ With the data given by the OP, the result is : $$a=0.00000272881371\quad;\quad b=42639.1823$$ The computed $Y_k$ are all equal to the data values $y_k$.

This is probably a lucky case. But this shows anyways that the method isn't bad.

enter image description here

NOTE : The computation has to be carried out with accuracy. The numerical values of computed $a$ and $b$ must have enough digits. If they are rounded, some of the $Y_k$ can be deviated of $\pm$ one unit.

This is shown on Figure 2 where $a$ and $b$ are rounded : One point is deviated, compare to Figure 1.

Even more on Figure 3 where $a$ and $b$ are a bit more badly rounded : Four points are deviated.

enter image description here

2
On

Incredibly too complex but done for the fun of it.

Inspired (once more) by JJacquelin's answer, I worked the minimization of $$SSQ=\sum_{i=1}^{12} \left(a(x_i+b)^2-(y_i+\frac 12) \right)^2$$ The model being linear with respect to $a$, we can express this parameter as a function of $b$ from $SSQ'_a=0$.

Plugging this expression in $SSQ'_b$ and simplifying, we end with the following monstrous polynomial in $b$ $$12127335 b^6-499793526603 b^5-728283224398290 b^4-418674571583641435 b^3-119935913306663810880 b^2-17155664107076603575730 b-980925105997787461048944$$ which, thanks to a CAS, is the product of two factors $$60335b^2+34451352 b+4918659046$$ (no real roots) and $$201 b^4-8398413 b^3-7291542096 b^2-2091027730787 b-199429376345064$$ which has four real roots (expressed with a bunch of radicals). Three of the roots are negative and the decimal representation of the root of interest is $$b=42639.649084838494786403657432884943571843815973595$$ leading to $$a=2.728754367425605034210057290271636640799352020754\times 10^{-6}$$ so close to the numbers reported in JJacquelin's answer and leading, for sure, to the same results.