Mathematical explanation of how auto-correlation affects least squares estimators


In a lecture we were investigating graphical methods for checking the assumptions of OLS regression. In particular, I was interested in the 'independence of residuals' assumption. After some searching online, I found that autocorrelation can lead to underestimating or overestimating the least squares estimators. I can't picture this mathematically, and was wondering if someone could shed some light (mathematically) on how this occurs.

Best answer:

For a simple case, assume $X_i \overset{iid}{\sim}N(\mu,1)$, and we are using least squares to estimate $\mu$.

Then $\hat{\mu} = \arg\min_{\mu}\sum_{i=1}^n\frac{(X_i - \mu)^2}{2}$. In this situation all of the observations are equally weighted.

Taking the derivative with respect to $\mu$ and setting equal to 0 gives us:

\begin{equation} \sum_{i=1}^n (-X_i+\mu) = 0 \Rightarrow -\bar{X}+\mu = 0 \Rightarrow \mu = \bar{X} \end{equation}
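As a quick sanity check (a sketch using simulated data; the choice of $\mu = 5$ and the sample size are arbitrary assumptions), numerically minimizing the equal-weight objective recovers the sample mean:

```python
import numpy as np
from scipy.optimize import minimize_scalar

# Simulated stand-in for X_i ~ N(mu, 1); mu = 5 is an arbitrary choice.
rng = np.random.default_rng(0)
x = rng.normal(loc=5.0, scale=1.0, size=1000)

# Numerically minimize the equal-weight objective sum_i (x_i - mu)^2 / 2 ...
res = minimize_scalar(lambda mu: np.sum((x - mu) ** 2) / 2)

# ... and compare against the closed-form solution, the sample mean.
print(res.x, x.mean())  # the two agree to numerical tolerance
```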

Consider instead a situation where $\sigma^2_0 = 1$, but $X_i \sim N(\mu,a^{i-1}\sigma^2_0)$, with $a>0$ a known constant. In this setting the residuals are not identically distributed: their variances differ across observations, so the error structure assumed by OLS is mis-specified. The least squares estimator that accounts for these variances (equivalently, the maximum likelihood estimator) is given by:

\begin{equation} \hat{\mu} = \arg\min_{\mu}\sum_{i=1}^n\frac{(X_i - \mu)^2}{2a^{i-1}} \end{equation}

For $a<1$ the weights $1/a^{i-1}$ grow with $i$, magnifying the contribution of $(X_i-\mu)^2$ to the minimization problem for later observations; for $a>1$ they shrink, diminishing the contribution of later observations.

Consider $X_1 = 1$ and $X_2 = 3$, so that $\bar{X} = 2$. When $a \neq 1$, setting the derivative of the weighted objective with respect to $\mu$ equal to $0$ gives:

\begin{equation} \begin{split} -1 +\mu + \frac{-3+\mu}{a} & = 0\\ \Leftrightarrow \mu & = \frac{a+3}{a+1}\\ \end{split} \end{equation}

This approaches $3$ as $a \rightarrow 0$ and approaches $1$ as $a \rightarrow \infty$, and it equals the ordinary least squares estimate $\bar{X} = 2$ only when $a=1$.
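The closed form $(a+3)/(a+1)$ can be checked numerically (a small sketch; `mu_hat` is a helper name introduced here, not from the original answer):

```python
from scipy.optimize import minimize_scalar

def mu_hat(a, x1=1.0, x2=3.0):
    """Minimizer of (x1 - mu)^2/2 + (x2 - mu)^2/(2a), the two-point example."""
    return minimize_scalar(lambda mu: (x1 - mu) ** 2 / 2 + (x2 - mu) ** 2 / (2 * a)).x

for a in (0.01, 1.0, 100.0):
    # Numeric minimizer vs the closed form (a + 3) / (a + 1):
    # small a pulls the estimate toward X_2 = 3, large a toward X_1 = 1.
    print(a, mu_hat(a), (a + 3) / (a + 1))
```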

One way to deal with this is to weight the terms in the least squares summation to account for the unequal variances; this is known as weighted least squares.
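As a rough sketch of that idea (simulated data; the values of $n$, $\mu$, and $a$ are arbitrary assumptions), inverse-variance weighting has a simple closed form for this mean-estimation problem:

```python
import numpy as np

# Simulated illustration of inverse-variance weighting; n, mu_true, and
# a = 1.05 are all arbitrary choices for the demo.
rng = np.random.default_rng(1)
n, mu_true, a = 200, 5.0, 1.05
variances = a ** np.arange(n)               # Var(X_i) = a^(i-1), sigma_0^2 = 1
x = rng.normal(mu_true, np.sqrt(variances))

mu_ols = x.mean()                           # equal weights: ignores the variances
w = 1.0 / variances                         # inverse-variance weights
mu_wls = np.sum(w * x) / np.sum(w)          # closed-form weighted least squares
print(mu_ols, mu_wls)
```

Both estimates are unbiased here, but the weighted one down-weights the noisy late observations and so has much smaller variance.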