Is there an iterative way to evaluate least squares estimation?


Suppose you have a set of data $\{y_i, u_i\}_{i=1}^m$, where $y_i \in \mathbb{R}$ and $u_i \in \mathbb{R}^n$.

The claim is that

$$y_i = u_i^\top \theta + \varepsilon_i$$ where $\theta \in \mathbb{R}^n$ is a vector of unknown parameters and $\varepsilon_i$ is zero-mean noise (e.g. Gaussian noise).

If I build the following:

$$Y = \left[\begin{array}{c}y_1\\y_2\\\vdots\\y_m\end{array}\right], U = \left[\begin{array}{c}u_1^\top\\u_2^\top\\\vdots\\u_m^\top\end{array}\right], $$

then I can find a least squares estimation of $\theta$:

$$\hat\theta = (U^\top U)^{-1}U^\top Y.$$
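For a small problem this closed-form estimate can be computed directly; a minimal NumPy sketch with synthetic data (the sizes and noise level are made up for illustration):

```python
import numpy as np

# Synthetic data: y_i = u_i^T theta + eps_i with zero-mean Gaussian noise
rng = np.random.default_rng(0)
m, n = 200, 5
theta = rng.standard_normal(n)
U = rng.standard_normal((m, n))
Y = U @ theta + 0.1 * rng.standard_normal(m)

# Closed-form estimate (U^T U)^{-1} U^T Y via the normal equations;
# np.linalg.lstsq(U, Y) is the numerically safer way to get the same result.
theta_hat = np.linalg.solve(U.T @ U, U.T @ Y)
```

This works only while $U^\top U$ (an $n \times n$ matrix) and $U$ itself fit in memory, which is exactly the assumption that fails below.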

Now suppose that $n$ and $m$ are very large. Then solving the problem above numerically becomes very hard because of memory consumption.

My question is: is there an iterative method that finds $\hat\theta$ while saving memory?

I hope that there exists a scheme like the following:

$$\begin{align} \hat\theta_1 & = f(y_1, u_1),\\ \hat\theta_2 & = g(\hat\theta_1, y_2, u_2), \\ \hat\theta_3 & = g(\hat\theta_2, y_3, u_3), \\ & \;\;\vdots\\ \hat\theta_m & = g(\hat\theta_{m-1}, y_m, u_m) = (U^\top U)^{-1}U^\top Y = \hat\theta. \end{align}$$

Best answer:

Yes, you can apply gradient descent techniques to minimize the squared loss:

$$L(\theta) = \sum_{i=1}^m (y_i - u_i^\top\theta)^2$$

Compute the gradient $\nabla_\theta L$ and apply the gradient descent method of your choice.

The basic scheme is given by the recursion

$$\theta_{k+1} = \theta_{k} - \sigma \nabla L(\theta_k)$$

for some stepsize $\sigma$ that must be tuned. However, plain gradient descent converges slowly, and second-order methods should be preferred: Newton's method, or L-BFGS if you have a large number of variables and memory is a bottleneck, which seems to be the case here.
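A minimal sketch of the basic scheme in NumPy (the function name and the Frobenius-norm stepsize bound are my own choices, not part of the answer); the gradient is accumulated one row at a time, so only one $u_i$ needs to be held in memory per step:

```python
import numpy as np

def lsq_gradient_descent(Y, U, sigma=None, n_iter=500):
    """Plain gradient descent on L(theta) = sum_i (y_i - u_i^T theta)^2.

    The gradient is accumulated row by row, so only one row of U is
    touched at a time (here U is an in-memory array for simplicity,
    but the inner loop could just as well stream rows from disk).
    """
    m, n = U.shape
    theta = np.zeros(n)
    if sigma is None:
        # A safe stepsize is 1 / (2 * lambda_max(U^T U)); the squared
        # Frobenius norm of U is a cheap upper bound on lambda_max.
        sigma = 1.0 / (2.0 * np.sum(U**2))
    for _ in range(n_iter):
        grad = np.zeros(n)
        for i in range(m):
            # d/dtheta (y_i - u_i^T theta)^2 = -2 (y_i - u_i^T theta) u_i
            grad += -2.0 * (Y[i] - U[i] @ theta) * U[i]
        theta -= sigma * grad
    return theta
```

Note that each full iteration still passes over all $m$ samples; the memory saving comes from never forming $U^\top U$ or holding all of $U$ at once.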

You could also apply stochastic gradient descent or one of its variants (AdaGrad, AdaDelta); there is an extensive literature on the subject.
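Stochastic gradient descent is the variant that comes closest to the recursion hoped for in the question: each update uses only the previous estimate and a single pair $(y_i, u_i)$. A sketch, with a constant stepsize chosen by hand (with noisy data a decaying stepsize would be needed for convergence):

```python
import numpy as np

def lsq_sgd(Y, U, sigma=0.02, n_epochs=100, seed=0):
    """Stochastic gradient descent: one (y_i, u_i) pair per update.

    Matches the shape theta_k = g(theta_{k-1}, y_k, u_k): each step
    needs O(n) memory. The constant stepsize here is only a sketch;
    with noisy data a decaying stepsize (e.g. sigma / sqrt(t)) is
    required for the iterates to converge.
    """
    rng = np.random.default_rng(seed)
    m, n = U.shape
    theta = np.zeros(n)
    for _ in range(n_epochs):
        for i in rng.permutation(m):
            # gradient of the single-sample loss (y_i - u_i^T theta)^2
            theta += sigma * 2.0 * (Y[i] - U[i] @ theta) * U[i]
    return theta
```

Unlike the batch solution, SGD only approaches $\hat\theta$ in the limit rather than reproducing it exactly after $m$ steps.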

Another answer:

I recommend checking out the paper "LSMR: An iterative algorithm for sparse least squares problems" by Fong and Saunders.

This algorithm is available in SciPy as `scipy.sparse.linalg.lsmr`.
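A minimal usage sketch with synthetic data (sizes made up for illustration). LSMR touches the matrix only through the products $Uv$ and $U^\top w$, so instead of an in-memory array one can pass a `LinearOperator` whose callbacks stream rows from disk:

```python
import numpy as np
from scipy.sparse.linalg import LinearOperator, lsmr

rng = np.random.default_rng(0)
m, n = 1000, 10
theta = rng.standard_normal(n)
U = rng.standard_normal((m, n))
Y = U @ theta + 0.01 * rng.standard_normal(m)

# LSMR only needs the products U @ v and U.T @ w; the two callbacks
# below use an in-memory U for simplicity, but could compute the
# products row by row without ever materializing the matrix.
A = LinearOperator((m, n), matvec=lambda v: U @ v, rmatvec=lambda w: U.T @ w)
theta_hat = lsmr(A, Y)[0]  # first element of the return tuple is x
```

For sparse $U$, passing a `scipy.sparse` matrix directly works as well.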