Modeling temperature with a trigonometric function: adding a new parameter


I am trying to model the temperature function using the following equation:

$T(d)=c_0+c_1 \cos (\frac{2\pi}{365} d)$

Where $d$ is the day of the year, and $T(d)$ is the temperature on that day. I am using SVD to solve for $c_0$ and $c_1$, but I want to increase the accuracy of my solution.

Is there a way to add a new parameter $c_2$ to increase the accuracy of the solution?

Here is the data I'm using:

\begin{array}{l|l} \hline \text { January } & 62^{\circ} \mathrm{F} \\ \hline \text { February } & 67^{\circ} \mathrm{F} \\ \hline \text { March } & 73^{\circ} \mathrm{F} \\ \hline \text { April } & 79^{\circ} \mathrm{F} \\ \hline \text { May } & 86^{\circ} \mathrm{F} \\ \hline \text { June } & 91^{\circ} \mathrm{F} \\ \hline \text { July } & 94^{\circ} \mathrm{F} \\ \hline \text { August } & 94^{\circ} \mathrm{F} \\ \hline \text { September } & 89^{\circ} \mathrm{F} \\ \hline \text { October } & 82^{\circ} \mathrm{F} \\ \hline \text { November } & 72^{\circ} \mathrm{F} \\ \hline \text { December } & 65^{\circ} \mathrm{F} \end{array}


There are 2 best solutions below

1
On

Regarding whether you can add a parameter: you can do what you like. You could add a million parameters if you wanted. The only limitation is that a good model has only as many parameters as is appropriate to its data, so that it is neither over- nor underfitted. You can essentially always buy yourself a closer fit by including a new parameter in the right way, but keep in mind that a model with too many parameters may be overfitted and needlessly complex, and hence inappropriate.

A reasonable approach to your problem would be to fit a sine (or cosine) wave with $4$ parameters to the data: amplitude, frequency, phase offset, and a constant baseline offset.

$$y(t)=A\sin(2\pi f t + \varphi)+c$$

Each parameter is justified, since essentially no real, measured physical process is going to have amplitude exactly $\pm1$ (hence $A$), frequency exactly $1$ (hence $f$), or begin exactly at $t=0$ (hence $\varphi$). (A baseline of $0$ may be reasonable in some cases, but it is not the case here.) If you assume that your data repeat exactly annually, that is equivalent to fixing $f=\frac{1}{365.24}\,\text{day}^{-1}$, i.e. one cycle per $365.24$ days. (NB: use the tropical [solar] year for this, not the Gregorian [calendar] year or others.)

Your new parameter $c_2$ would then be the phase offset $\varphi$. It would be strange to leave it out unless your process began exactly at $t=0$.
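To illustrate why the phase offset matters, here is a minimal sketch that fits the question's data with and without it, keeping $f$ fixed at $1/365.24$ per day. Two things here are my own assumptions and not part of the question: each monthly value is assigned to the middle of its month in a non-leap year, and the phase is obtained by fitting separate sine and cosine coefficients and converting them afterwards. The names `days`, `rms_residual`, `no_phase` and `with_phase` are just illustrative.

```python
import numpy as np

# Monthly temperatures (deg F) from the question, January..December.
temps = np.array([62, 67, 73, 79, 86, 91, 94, 94, 89, 82, 72, 65], dtype=float)

# ASSUMPTION: each value belongs to the midpoint of its month (non-leap year).
month_lengths = np.array([31, 28, 31, 30, 31, 30, 31, 31, 30, 31, 30, 31])
days = np.cumsum(month_lengths) - month_lengths / 2.0

f = 1.0 / 365.24                  # fixed frequency: one cycle per tropical year
theta = 2.0 * np.pi * f * days


def rms_residual(design):
    """Least-squares fit of temps on the given design matrix; returns (coeffs, rms)."""
    coeffs, *_ = np.linalg.lstsq(design, temps, rcond=None)
    rms = np.sqrt(np.mean((design @ coeffs - temps) ** 2))
    return coeffs, rms


# Model without phase: c + A*cos(theta)
no_phase = np.column_stack([np.ones_like(theta), np.cos(theta)])
# Model with phase: c + a_sin*sin(theta) + a_cos*cos(theta) = c + A*sin(theta + phi)
with_phase = np.column_stack([np.ones_like(theta), np.sin(theta), np.cos(theta)])

coeffs2, rms_no_phase = rms_residual(no_phase)
coeffs3, rms_with_phase = rms_residual(with_phase)

# A*sin(theta + phi) = A*cos(phi)*sin(theta) + A*sin(phi)*cos(theta)
c, a_sin, a_cos = coeffs3
A = np.hypot(a_sin, a_cos)
phi = np.arctan2(a_cos, a_sin)

print(f"RMS residual without phase: {rms_no_phase:.2f} F")
print(f"RMS residual with phase:    {rms_with_phase:.2f} F")
print(f"A = {A:.2f}, phi = {phi:.3f} rad, c = {c:.2f}")
```

Because the model with a phase contains the model without one as a special case, its residual can never be larger; how much smaller it is tells you whether the extra parameter is earning its keep.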

0
On

Using month instead of day and assuming (!!) $360$ days per year and $30$ days per month, if you want the model to be

$$T(m)=c_0+c_1 \cos \left(\frac{ \pi}{6} m+c_2\right)$$

expand the cosine:

$$T(m)=c_0+c_1\Bigg[ \cos \left(\frac{ \pi}{6} m\right)\cos(c_2)-\sin \left(\frac{ \pi}{6} m\right)\sin(c_2)\Bigg]$$

Define $c_3=c_1\cos(c_2)$ and $c_4=-c_1 \sin(c_2)$, so that the model becomes linear in its parameters:

$$T(m)=c_0+c_3\cos \left(\frac{ \pi}{6} m\right)+c_4\sin \left(\frac{ \pi}{6} m\right)$$

Perform the linear regression for $(c_0,c_3,c_4)$, then recover the original parameters from $(c_3,c_4)$: $c_1=\sqrt{c_3^2+c_4^2}$ and $c_2=\operatorname{atan2}(-c_4,\,c_3)$.

With your data, the resulting fit is quite spectacular.
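The regression described above can be sketched in a few lines of NumPy. This is only an illustration under stated assumptions: the month index runs $m=1$ (January) to $m=12$ (December), which the answer does not specify, and `np.linalg.lstsq` (an SVD-based solver) stands in for whatever SVD routine the question uses.

```python
import numpy as np

# Monthly temperatures (deg F) from the question, January..December.
temps = np.array([62, 67, 73, 79, 86, 91, 94, 94, 89, 82, 72, 65], dtype=float)

# ASSUMPTION: January = 1, ..., December = 12.
m = np.arange(1, 13)
angle = np.pi * m / 6.0

# Linear model: T(m) = c0 + c3*cos(pi*m/6) + c4*sin(pi*m/6)
X = np.column_stack([np.ones_like(angle), np.cos(angle), np.sin(angle)])
coeffs, *_ = np.linalg.lstsq(X, temps, rcond=None)
c0, c3, c4 = coeffs

# Recover amplitude and phase from c3 = c1*cos(c2), c4 = -c1*sin(c2).
c1 = np.hypot(c3, c4)
c2 = np.arctan2(-c4, c3)

fitted = c0 + c1 * np.cos(angle + c2)
residuals = temps - fitted
r_squared = 1.0 - np.sum(residuals**2) / np.sum((temps - temps.mean()) ** 2)

print(f"c0 = {c0:.3f}, c1 = {c1:.3f}, c2 = {c2:.4f} rad")
print(f"max |residual| = {np.max(np.abs(residuals)):.2f} F, R^2 = {r_squared:.4f}")
```

With this indexing the fit comes out at roughly $c_0 = 79.5$, $c_1 \approx 15.5$ and $c_2 \approx 2.56$ rad, with every monthly residual within about $2^{\circ}\mathrm{F}$. A different choice of month origin shifts $c_2$ but leaves $c_0$, $c_1$ and the residuals unchanged.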