Quadratic vs. linear basis for least squares

For least squares approximation, when is it appropriate to choose a linear basis vs. a quadratic basis?

Linear basis:

$\phi = \begin{bmatrix} 1 & x & y & z \end{bmatrix}$

Quadratic basis:

$\phi = \begin{bmatrix} 1 & x & y & z & xy & xz & yz & x^2 & y^2 & z^2 \end{bmatrix}$

Best answer:

  1. The first ("linear") basis corresponds to a first-order Taylor expansion of some $f(x,y,z)$, $$ f(x, y, z) = \beta_0 + \beta_1x + \beta_2y + \beta_3z + \epsilon, $$ while the second ("quadratic") basis corresponds to a second-order Taylor expansion of $f(x, y, z)$,

$$ f(x, y, z) = \beta_0 + \beta_1x + \beta_2y + \beta_3z + \beta_4 xy + \beta_5 xz + \beta_6 yz + \beta_7 x^2 +\beta_8 y^2 +\beta_9 z^2 + \xi. $$

  2. You don't have to use all the "extra" terms in the "quadratic" basis. You can take any subset of them.

  3. Reasons to prefer one over the other are mainly theoretical or logical. E.g., assume that $x$ is income and $y$ is country, while $f(x, y, z)$ is happiness as a function of income, country and something else. By including $x ^ 2$ you allow for some level of income $x_0$ beyond which happiness decreases with income (a maximum, provided $\beta_7 < 0$). Setting the interaction terms aside, $x_0$ solves $$ \frac{\partial }{\partial x } \mathbb{E}[f|x,y,z] = \beta_1 + 2 \beta_7 x_0 = 0, $$
    so the estimated $x_0$ is $$ \hat{x}_0 = - \frac{\hat{\beta}_1}{2\hat{\beta}_7}. $$ A possible reason to include terms of the form $xy$ ("interactions") is when you suspect that income $x$ contributes differently in different countries. E.g., let us say $y$ is $1$ if the subject is from Canada and $0$ if the subject is from the UK. Then the "contribution" of income to the level of happiness is $\beta_1 + \beta_4$ in Canada, while in the UK it is only $\beta_1$.

  4. You can start from the quadratic basis and perform variable selection to test whether the linear basis suffices for your needs.
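To make the comparison concrete, here is a minimal NumPy sketch that fits both bases from the question to the same data, compares the two nested fits, and computes the turning-point estimate $-\hat{\beta}_1 / (2\hat{\beta}_7)$. Everything specific in it is an assumption made for illustration: the data are synthetic, the helper names (`linear_basis`, `quadratic_basis`, `fit`, `nested_f_statistic`) are made up, and the nested-model F-statistic is just one common way to check whether the extra quadratic terms earn their place, not the only form of variable selection.

```python
import numpy as np


def linear_basis(X):
    """Design matrix with columns 1, x, y, z."""
    x, y, z = X.T
    return np.column_stack([np.ones(len(X)), x, y, z])


def quadratic_basis(X):
    """Design matrix with columns 1, x, y, z, xy, xz, yz, x^2, y^2, z^2
    (same ordering as beta_0 ... beta_9 in the answer)."""
    x, y, z = X.T
    return np.column_stack(
        [np.ones(len(X)), x, y, z, x * y, x * z, y * z, x**2, y**2, z**2]
    )


def fit(Phi, f):
    """Ordinary least squares; returns coefficients and residual sum of squares."""
    beta, *_ = np.linalg.lstsq(Phi, f, rcond=None)
    rss = float(np.sum((f - Phi @ beta) ** 2))
    return beta, rss


def nested_f_statistic(rss_small, rss_big, p_small, p_big, n):
    """F-statistic for H0: the extra (p_big - p_small) coefficients are all zero."""
    return ((rss_small - rss_big) / (p_big - p_small)) / (rss_big / (n - p_big))


# Synthetic data (an assumption for this sketch): the true surface has an
# x^2 term, so the linear basis should fit noticeably worse.
rng = np.random.default_rng(0)
n = 500
X = rng.uniform(-1.0, 1.0, size=(n, 3))
x, y, z = X.T
f = 1.0 + 2.0 * x - 0.5 * y + 0.3 * z - 1.5 * x**2 + rng.normal(scale=0.1, size=n)

beta_lin, rss_lin = fit(linear_basis(X), f)
beta_quad, rss_quad = fit(quadratic_basis(X), f)

# Compare the nested models: 4 coefficients vs. 10 coefficients.
F = nested_f_statistic(rss_lin, rss_quad, p_small=4, p_big=10, n=n)

# Turning point in x, ignoring interaction terms: x0 = -beta_1 / (2 * beta_7).
x0_hat = -beta_quad[1] / (2.0 * beta_quad[7])

print("RSS linear:   ", rss_lin)
print("RSS quadratic:", rss_quad)
print("F-statistic:  ", F)
print("estimated x0: ", x0_hat)
```

A large F-statistic relative to the $F(6,\, n - 10)$ reference distribution suggests that the linear basis is not enough; a small one suggests that the extra terms can be dropped. The same check works for any subset of the quadratic terms, as long as the smaller model's columns are contained in the larger model's.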