Weighted Least Squares for Parabola Coefficients Estimation


I'm having trouble finding where I am making a mistake...

I have to estimate the parameters a and b of the curve modeled by:

$y = a x^2 + bx$

I have to do that from K measures of the curve, each measure is modeled by:

$y_i = a x_i^2 + bx_i + \epsilon_i$

Where $\epsilon_i$ is a Gaussian random variable that follows $N(0,\sigma_i^2)$

After collecting a group of K measures, I start the estimation. In order to do that, I am using weighted least squares for estimating my parameters:

$\Theta = \left[\begin{array}{c} a\\ b \end{array}\right]$

So, the closed solution formula tells me that:

$\hat\Theta = (\Phi^TR^{-1}\Phi)^{-1}\Phi^TR^{-1}y$

Where:

  • $y$ is a $[K \times 1]$ vector with the $K$ measures of the curve.

  • $R$ is a $[K \times K]$ matrix constructed from $R = E[\epsilon \epsilon^T] = \operatorname{diag}(\sigma_{\epsilon_1}^{2}, \ldots, \sigma_{\epsilon_K}^{2})$

  • $\Phi$ is a $[K \times 2]$ matrix equal to $\left[\begin{array}{cc} x_1^{2} & x_1\\ \vdots & \vdots\\ x_K^{2} & x_K \end{array}\right]$
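For concreteness, here is a small NumPy sketch of this construction and the closed-form estimate. All data values (true parameters, sample points, noise levels) are made up for illustration:

```python
import numpy as np

# Hypothetical example data (not from the original post)
a_true, b_true = 2.0, -1.0
K = 20

rng = np.random.default_rng(0)
x = np.linspace(0.5, 5.0, K)             # K distinct sample points
sigma = rng.uniform(0.1, 0.5, size=K)    # per-measurement noise std devs
y = a_true * x**2 + b_true * x + sigma * rng.standard_normal(K)

# Design matrix Phi (K x 2): row i is [x_i^2, x_i]
Phi = np.column_stack([x**2, x])

# R = diag(sigma_1^2, ..., sigma_K^2); its inverse is also diagonal
R_inv = np.diag(1.0 / sigma**2)

# Closed-form WLS: theta_hat = (Phi^T R^-1 Phi)^-1 Phi^T R^-1 y,
# computed with solve() rather than an explicit matrix inverse
theta_hat = np.linalg.solve(Phi.T @ R_inv @ Phi, Phi.T @ R_inv @ y)
print(theta_hat)  # close to [a_true, b_true]
```

Note that `np.linalg.solve` is preferred over forming $(\Phi^TR^{-1}\Phi)^{-1}$ explicitly, for both speed and numerical stability.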

The problem is that I always find

$(\Phi^TR^{-1}\Phi)$

as a singular matrix, and therefore I am unable to invert it and obtain the final estimation.

What am I doing wrong? Have I made a mistake in the construction of the problem?

Thank you!

Best answer:

The regression model written in matrix notation is given by

$$y=\Phi\Theta+\epsilon.$$

We want to find the least-squares solution $\Theta=\hat{\Theta}$ that maximizes the likelihood of the multivariate Gaussian distributed errors, i.e.

$$L\propto\exp\left(-\frac{1}{2}\epsilon^TR^{-1}\epsilon\right).$$

If we view $R^{-1}$ as the metric of an inner product (since it's positive-definite), we need to minimize $\Vert\epsilon\Vert_{R^{-1}}^2\equiv\epsilon^TR^{-1}\epsilon$ by choosing our $\Theta=\hat{\Theta}$ such that the vector $y-\Phi\hat{\Theta}$ is orthogonal to $\Phi$ under the same metric $R^{-1}$. Therefore,

$$\Phi^TR^{-1}(y-\Phi\hat{\Theta})=0,$$

which leads to the least-squares formula

$$\hat{\Theta}=(\Phi^TR^{-1}\Phi)^{-1}\Phi^TR^{-1}y.$$
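The orthogonality condition is easy to check numerically once the estimate is computed. A minimal sketch, using made-up data values:

```python
import numpy as np

# Hypothetical data for the check (values chosen for illustration)
rng = np.random.default_rng(1)
x = np.linspace(1.0, 4.0, 10)
sigma = np.full(10, 0.3)                  # constant noise level
y = 1.5 * x**2 + 0.5 * x + sigma * rng.standard_normal(10)

Phi = np.column_stack([x**2, x])
R_inv = np.diag(1.0 / sigma**2)

theta_hat = np.linalg.solve(Phi.T @ R_inv @ Phi, Phi.T @ R_inv @ y)

# Orthogonality: Phi^T R^-1 (y - Phi theta_hat) should vanish
residual_check = Phi.T @ R_inv @ (y - Phi @ theta_hat)
print(np.allclose(residual_check, 0.0, atol=1e-6))  # True
```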

So your formula is correct. It can even handle correlated errors if you put off-diagonal entries in $R$; the matrix $R$ should be positive-definite. Note that $\Phi^TR^{-1}\Phi$ is exactly singular whenever $\Phi$ does not have full column rank — for your model this happens when the measurements contain fewer than two distinct nonzero values of $x$ — so first check that your $x_i$ are not all the same. If instead you find numerically that $\Phi^TR^{-1}\Phi$ is merely close to singular, use ridge regression

$$\hat{\Theta}_\lambda=(\Phi^TR^{-1}\Phi+\lambda I)^{-1}\Phi^TR^{-1}y,$$

where $\lambda>0$ is the ridge parameter.
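The sketch below shows the degenerate case and the ridge fix. The scenario (all measurements taken at the same $x$, so $\Phi$ has rank 1) is hypothetical, chosen to make $\Phi^TR^{-1}\Phi$ exactly singular:

```python
import numpy as np

# Hypothetical degenerate case: every measurement at the same x,
# so the two columns of Phi are proportional and Phi has rank 1.
K = 5
x = np.full(K, 2.0)
sigma = np.full(K, 0.2)
rng = np.random.default_rng(2)
y = 3.0 * x**2 + 1.0 * x + sigma * rng.standard_normal(K)

Phi = np.column_stack([x**2, x])
R_inv = np.diag(1.0 / sigma**2)

A = Phi.T @ R_inv @ Phi
print(np.linalg.matrix_rank(A))  # 1: singular, plain WLS fails

# Ridge-regularized estimate with a small lambda; this is invertible,
# but with a single x it still cannot separate a from b -- it only
# picks one well-behaved solution among the many that fit the data.
lam = 1e-3
theta_ridge = np.linalg.solve(A + lam * np.eye(2), Phi.T @ R_inv @ y)
print(theta_ridge)
```

The ridge parameter trades bias for stability: larger $\lambda$ shrinks $\hat{\Theta}_\lambda$ toward zero, so in practice it is kept as small as the conditioning allows.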