Linearising a cubic function


I'm not sure if this is the right term, but I want to 'linearise' an equation of the form $y=ax+bx^3$. What I mean is that if I had another function $y=e^x$, then I can plot $\ln(y)$ against $x$ and I would get a straight line. From there, it's easy to determine the y-intercept and slope. How can I do something similar for this cubic, if possible?



Your equation $y=ax+bx^3$ is already linear in $a$ and $b$. So you don't need to linearize it. You just find the values of $a$ and $b$ that minimize the residual error

$$\sum_{i=1}^n(y_i-ax_i-bx_i^3)^2,$$

assuming you have measured $n$ data points $(x_i,y_i)$ with unbiased, uncorrelated errors of equal variance. Linear algebra makes the solution easy to write down. Think of $X_1=(x_1,x_2,\ldots,x_n)^T$ as one vector and $X_2=(x_1^3,x_2^3,\ldots,x_n^3)^T$ as another, and seek the linear combination $aX_1+bX_2\,$ that best approximates the vector $\,Y=(y_1,y_2,\ldots,y_n)^T$. The residual error vector is $e=Y-aX_1-bX_2$, and we want to minimize $\Vert e\Vert^2$. In general we can decompose $Y$ into

$$Y=Y_{\parallel}+Y_{\perp},$$

where $Y_{\parallel}$ is in the 2D subspace spanned by $X_1$ and $X_2$ while $Y_{\perp}$ is in the orthogonal complement space. Then we have

$$\Vert e\Vert^2=\Vert Y_{\parallel}-aX_1-bX_2\Vert^2+\Vert Y_{\perp}\Vert^2,$$

by the Pythagorean theorem, since the two components are orthogonal. To minimize $\Vert e\Vert^2$, we choose $aX_1+bX_2=Y_{\parallel}$, so that $\,e=Y_{\perp}$ is perpendicular to both $X_1$ and $X_2$. This gives the equation

$$0=\begin{pmatrix}X_1^T\\X_2^T\end{pmatrix}e=\begin{pmatrix}X_1^T\\X_2^T\end{pmatrix}\left[Y-\begin{pmatrix}X_1\;\;X_2\end{pmatrix}\begin{pmatrix}a\\b\end{pmatrix}\right]\!,$$

which then determines the coefficients

$$\begin{pmatrix}a\\b\end{pmatrix}=\begin{pmatrix}X_1^TX_1\;\;X_1^T X_2\\X_2^TX_1\;\;X_2^TX_2\end{pmatrix}^{-1}\begin{pmatrix}X_1^TY\\X_2^TY\end{pmatrix}=\begin{pmatrix}\sum_ix_i^2\;\;\sum_i x_i^4\\ \sum_i x_i^4\;\;\sum_i x_i^6\end{pmatrix}^{-1}\begin{pmatrix}\sum_ix_iy_i\\ \sum_ix_i^3y_i\end{pmatrix}.$$
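As a quick numerical sketch (assuming NumPy, with made-up coefficients $a=2$, $b=0.5$ for the illustration), you can solve the normal equations above directly, or let `numpy.linalg.lstsq` do the same minimization for you:

```python
import numpy as np

# Hypothetical noisy data from y = 2x + 0.5x^3 (a=2, b=0.5 are assumptions)
rng = np.random.default_rng(0)
x = np.linspace(-3.0, 3.0, 50)
y = 2.0 * x + 0.5 * x**3 + rng.normal(scale=0.1, size=x.size)

# Design matrix whose columns are X1 = x and X2 = x^3
X = np.column_stack([x, x**3])

# Normal equations: (X^T X) (a, b)^T = X^T Y
a, b = np.linalg.solve(X.T @ X, X.T @ y)

# Equivalent, numerically more stable route via least squares
(a2, b2), *_ = np.linalg.lstsq(X, y, rcond=None)

print(a, b)  # close to the assumed a=2, b=0.5
```

Both routes recover the same coefficients; `lstsq` is preferable in practice because it avoids explicitly forming $X^TX$, which can be ill-conditioned.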

There are cases where you do need to "linearize" your equation before applying the above formalism. In logistic regression, $\,y=1/[1+e^{-a(x-\mu)}]$, for example, you fit $\,\ln\,[y/(1-y)]=ax+b\,$ (note $b=-a\mu$) to find $a$ and $\mu$. In Gaussian regression you fit $\,\ln y=a+bx+cx^2$.
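The logistic case can be sketched as follows (a minimal example with assumed, noiseless parameters $a=1.5$, $\mu=0.4$; the transform $\ln[y/(1-y)]$ turns the fit into an ordinary linear one):

```python
import numpy as np

# Assumed parameters for illustration: y = 1/(1 + exp(-a(x - mu)))
a_true, mu_true = 1.5, 0.4
x = np.linspace(-2.0, 2.0, 40)
y = 1.0 / (1.0 + np.exp(-a_true * (x - mu_true)))

# Linearize: ln(y/(1-y)) = a*x - a*mu, i.e. slope a, intercept b = -a*mu
z = np.log(y / (1.0 - y))
A = np.column_stack([x, np.ones_like(x)])
slope, intercept = np.linalg.lstsq(A, z, rcond=None)[0]

a_hat = slope
mu_hat = -intercept / slope
```

With noiseless data the recovered `a_hat` and `mu_hat` match the assumed values essentially exactly; with real data, note that the log transform distorts the error distribution, so the transformed fit is not identical to a direct nonlinear fit.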