How to model an arctangent-like function at a high school level, given points on its graph?


I'm trying to come up with a regression function for a student of mine that best fits the attached graph, using a method suitable for high school students who are only familiar with trigonometry and single-variable calculus.

To be precise, they're given $n=30$ points $(x_i,y_i)$ (which I'm not writing out here, but they do have them!) and the attached graph that the function generates. They're asked to model the function.

If this weren't high school, I'd have modeled it as

$$y = f(x) := a \arctan(bx + c) + d,$$

and I'd have minimized the sum of squared errors $\sum_{i=1}^{n} (y_i - f(x_i))^2$ with respect to $a,b,c,d$: form the normal equations and solve them either analytically or numerically.

But for high school, I can't do all that - they've barely seen regression and they don't know partial derivatives, so I'm not completely sure how to proceed here. Any help appreciated!

enter image description here


There are 2 best solutions below


Take the model to be $$y(x)=a \tan ^{-1}(b x+c)+d$$

A few approximations can be read directly off the graph:

  • $$y(0)=a \tan ^{-1}(c)+d \approx 0$$

  • $$y(\infty)=a \frac \pi 2 +d \approx 4000$$

  • Around $x=6$, there is an inflection point; so $c+6b \approx 0$

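Now, let them play with $c$ by trial and error: each choice of $c$ fixes $b \approx -c/6$, and the first two conditions then become two linear equations in $a$ and $d$.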
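The trial-and-error step can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the 30 data points are not given in the question, so `points` below is hypothetical stand-in data generated from the model itself, and the helper names (`params_from_c`, `model`, `sse`) are made up for this sketch.

```python
import math

def params_from_c(c, y_inf=4000.0, x_infl=6.0):
    """Derive (a, b, c, d) from a trial c and the three graph readings."""
    b = -c / x_infl                             # inflection: b*x_infl + c = 0
    a = y_inf / (math.pi / 2 - math.atan(c))    # y(inf) - y(0) = y_inf
    d = -a * math.atan(c)                       # y(0) = 0
    return a, b, c, d

def model(x, a, b, c, d):
    return a * math.atan(b * x + c) + d

def sse(points, params):
    """Sum of squared errors of the model over the data points."""
    return sum((y - model(x, *params)) ** 2 for x, y in points)

# Hypothetical stand-in for the students' 30 points (NOT the real data):
# noise-free samples of the model with c = -3.
true_params = params_from_c(-3.0)
points = [(0.5 * k, model(0.5 * k, *true_params)) for k in range(30)]

# "Play with c": try a few values and keep the one with the smallest error.
candidates = [-1.0, -2.0, -3.0, -4.0, -5.0]
best_c = min(candidates, key=lambda c: sse(points, params_from_c(c)))
print(best_c, params_from_c(best_c))
```

Note that $c$ has to be negative here for $b$ to come out positive, since the inflection point sits at a positive $x$.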


For a simpler approach in the particular case of the figure attached to the question, see the answer from Claude Leibovici.

The answer below can be applied in a more general case. The model

$$Y=A\:\arctan(B\:X+C)+D$$

is equivalent to

$$X=\frac{1}{B}\tan\left(\frac{1}{A}Y-\frac{D}{A}\right)-\frac{C}{B}$$

With the change of notation

$$\begin{cases} a=\frac{1}{B}\\ b=\frac{1}{A}\\ c=-\frac{D}{A}\\ d=-\frac{C}{B}\\ y=X\\ x=Y \end{cases}$$

the equation becomes

$$y=a\tan(bx+c)+d$$
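As a quick sanity check of this change of notation, here is a minimal Python sketch that converts between the two parameter sets and verifies the round trip on one point. The function names and the numerical values are hypothetical, chosen only for illustration.

```python
import math

def to_tan_params(A, B, C, D):
    """(A, B, C, D) of Y = A*arctan(B*X + C) + D  ->  (a, b, c, d)."""
    return 1.0 / B, 1.0 / A, -D / A, -C / B

def to_arctan_params(a, b, c, d):
    """Inverse conversion: (a, b, c, d) of y = a*tan(b*x + c) + d -> (A, B, C, D)."""
    return 1.0 / b, 1.0 / a, -d / a, -c / b

# Hypothetical parameter values, for illustration only.
A, B, C, D = 1300.0, 0.5, -3.0, 1600.0
a, b, c, d = to_tan_params(A, B, C, D)

X = 4.0
Y = A * math.atan(B * X + C) + D       # arctan form
X_back = a * math.tan(b * Y + c) + d   # tan form with x = Y, y = X
print(X, X_back)
```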

The usual way is to use a nonlinear regression method, for which several software packages exist. The computation is iterative, starting from "guessed" values of the parameters. Sometimes the numerical computation fails because of poor initial guesses and/or non-convergent iteration.
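For illustration, one such package is SciPy, whose `scipy.optimize.curve_fit` performs this kind of iterative fit from an initial guess `p0`. The sketch below assumes NumPy and SciPy are available; the data are hypothetical noise-free samples (the real points are not given), and `true_params` and `p0` are made-up values.

```python
import numpy as np
from scipy.optimize import curve_fit

def model(x, a, b, c, d):
    return a * np.arctan(b * x + c) + d

# Hypothetical stand-in data: 30 noise-free samples of the model.
true_params = (1400.0, 0.5, -3.0, 1750.0)
x = np.linspace(0.0, 12.0, 30)
y = model(x, *true_params)

# Initial guess, e.g. read roughly off the graph (assumed values).
p0 = (1200.0, 0.4, -2.5, 1500.0)

# Iterative nonlinear least squares starting from p0.
popt, pcov = curve_fit(model, x, y, p0=p0)
print(popt)
```

With a poor `p0` the same call may converge slowly, fail, or land on the equivalent solution with the signs of $a$, $b$, $c$ all flipped.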

The principle of a non-conventional method (no iteration, no initial guess) is explained in this paper: https://fr.scribd.com/doc/14674814/Regressions-et-equations-integrales , partially translated at: https://scikit-guess.readthedocs.io/en/latest/appendices/references.html. An application to the case of the tangent function is shown below.

With this method one can compute fitted values of the parameters $a,b,c,d$, and from them $A,B,C,D$.

enter image description here

enter image description here

If a well-defined fitting criterion is specified (least mean absolute deviation, least relative deviation, or others), the above method is not sufficient. An iterative process is then required, starting from guessed values. The above computation can be used to obtain a good set of initial values.

CAUTION :

In the above method, numerical integrations are carried out to compute $S_k$ and $T_k$. Such a simplified process cannot be used if the function tends to infinity near a point within the range of the data $(x_k\, ,\,y_k)$. For example, if the points are distributed as in the next figure, the above method is not suitable.

enter image description here

FOR INFORMATION :

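An integral equation of which $\quad y(x)=a\tan(bx+c)+d\quad$ is a solution is : $$y(x)=\frac{b}{a}\int (y(x))^2dx-2\frac{bd}{a}\int y(x)\,dx+\left(\frac{b\,d^2}{a}+ab\right)x+\text{constant}$$ It follows from integrating $y'=ab\left(1+\tan^2(bx+c)\right)=ab+\frac{b}{a}(y-d)^2$. The equation is linear in the coefficients of $\int y^2dx$, $\int y\,dx$, $x$ and the constant, which allows the linearisation of the regression and the construction of the first 4x4 matrix.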
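To show how this integral equation can linearise the fit, here is a rough Python sketch. It is not the exact procedure of the referenced paper. It assumes that $S_k$ and $T_k$ are the cumulative integrals of $y$ and $y^2$, computes them with the trapezoidal rule, takes $b>0$, and recovers $c$ with a simple averaging step that only works when all the data lie on a single branch of the tangent. The function names and test values are hypothetical.

```python
import numpy as np

def cumtrapz0(x, f):
    """Cumulative trapezoidal integral of f over x, starting at 0."""
    steps = 0.5 * (f[1:] + f[:-1]) * np.diff(x)
    return np.concatenate(([0.0], np.cumsum(steps)))

def fit_tan(x, y):
    """Non-iterative estimate of (a, b, c, d) in y = a*tan(b*x + c) + d."""
    S = cumtrapz0(x, y)          # assumed S_k: integral of y
    T = cumtrapz0(x, y ** 2)     # assumed T_k: integral of y^2
    # Linear least squares: y ~ alpha*T + beta*S + gamma*x + delta
    M = np.column_stack([T, S, x, np.ones_like(x)])
    coef = np.linalg.lstsq(M, y, rcond=None)[0]
    alpha, beta, gamma, _ = coef
    d = -beta / (2.0 * alpha)    # beta = -2*b*d/a, alpha = b/a
    ab = gamma - alpha * d ** 2  # gamma = b*d^2/a + a*b
    b = np.sqrt(alpha * ab)      # b^2 = (b/a)*(a*b), assuming b > 0
    a = b / alpha
    # Single-branch assumption: arctan((y - d)/a) = b*x + c
    c = np.mean(np.arctan((y - d) / a) - b * x)
    return a, b, c, d

# Hypothetical noise-free test data on one branch of the tangent.
x = np.linspace(0.0, 4.0, 2001)
y = 2.0 * np.tan(0.5 * x - 1.0) + 1.0
est = fit_tan(x, y)
print(est)
```

The estimates depend on the accuracy of the numerical integration, so sparse or noisy data will give rougher values, which can still serve as starting guesses for an iterative fit.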