Does $x^{\top}\left(X^{\top} X\right)^{-1} x = (x-\bar X)^{\top}\left(\sum_{i=1}^n(x_i-\bar X)(x_i-\bar X )^{\top}\right)^{-1}(x-\bar X)$ hold?


Notation: $x$ is the feature vector of a new observation, $X$ is the design matrix built from the training points $x_i$, $i = 1, \dots, n$, and $\bar X = \frac{1}{n}\sum_{i=1}^{n} x_i$.

The reason I find this believable is that, in the case of simple straight-line regression, we have

$$\operatorname{Var}\left(\widehat{\beta}_0+\widehat{\beta}_1 x_0\right)=\sigma^2\left(\frac{1}{n}+\frac{\left(x_0-\bar{x}\right)^2}{\sum_{i=1}^n\left(x_i-\bar{x}\right)^2}\right)$$
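For reference, this formula can be derived directly, assuming the design matrix contains an intercept column, so that the new point enters as $(1, x_0)^{\top}$. With $S_{xx} = \sum_{i=1}^n (x_i-\bar x)^2$,

$$X^{\top}X = \begin{pmatrix} n & n\bar{x} \\ n\bar{x} & \sum_i x_i^2 \end{pmatrix}, \qquad \det\left(X^{\top}X\right) = n\sum_i x_i^2 - n^2\bar{x}^2 = n\,S_{xx},$$

and therefore

$$(1,\; x_0)\left(X^{\top}X\right)^{-1}\begin{pmatrix}1\\ x_0\end{pmatrix} = \frac{\sum_i x_i^2 - 2n\bar{x}x_0 + n x_0^2}{n\,S_{xx}} = \frac{S_{xx} + n(x_0-\bar{x})^2}{n\,S_{xx}} = \frac{1}{n} + \frac{(x_0-\bar{x})^2}{S_{xx}},$$

which, multiplied by $\sigma^2$, is exactly the variance above.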

I first ran some simulations in R and found that the two values are close but not exactly equal. I'm not sure whether the difference is caused by numerical error in the matrix inversion in R.
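The simulation can be reproduced with a short NumPy sketch (Python here rather than R; the variable names and the choice of a design matrix without an intercept column are my own assumptions):

```python
import numpy as np

# Numerical comparison of the two quadratic forms (no intercept column).
rng = np.random.default_rng(0)
n, p = 50, 3
X = rng.normal(size=(n, p))          # training design matrix
x = rng.normal(size=p)               # new observation's feature vector
xbar = X.mean(axis=0)                # sample mean of the rows

lhs = x @ np.linalg.inv(X.T @ X) @ x
Xc = X - xbar                        # row-centered design matrix
S = Xc.T @ Xc                        # = sum_i (x_i - xbar)(x_i - xbar)^T
rhs = (x - xbar) @ np.linalg.inv(S) @ (x - xbar)

print(lhs, rhs, lhs - rhs)
```

For a cleaner numerical comparison, `np.linalg.solve(X.T @ X, x)` is preferable to forming the explicit inverse, since it avoids some of the round-off error that `inv` can introduce.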

While trying to verify the general statement, I wanted to expand the right-hand side and show that it is (or is not) equal to the left-hand side, but I got stuck at the inverse.
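One way past the inverse (a sketch, assuming $X$ has no intercept column, so $X^{\top}X = \sum_i x_i x_i^{\top}$): writing $S = \sum_{i=1}^n (x_i-\bar X)(x_i-\bar X)^{\top}$ and expanding the squares gives $X^{\top}X = S + n\,\bar X\bar X^{\top}$, a rank-one update of $S$, so the Sherman–Morrison formula yields

$$\left(X^{\top}X\right)^{-1} = \left(S + n\,\bar X\bar X^{\top}\right)^{-1} = S^{-1} - \frac{n\, S^{-1}\bar X\bar X^{\top} S^{-1}}{1 + n\,\bar X^{\top} S^{-1}\bar X}.$$

Substituting this into the left-hand side expresses $x^{\top}\left(X^{\top}X\right)^{-1}x$ entirely in terms of $S^{-1}$, which can then be compared term by term with the right-hand side.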

Another approach to proving or disproving it:

Since the question ultimately asks whether $\operatorname{Var}(x^{\top}\hat\beta) = \operatorname{Var}((x-\bar X)^{\top}\hat\theta)$, where $\hat \beta = (X^{\top}X)^{-1}X^{\top}Y$ and $\hat \theta = ((X-\bar X)^{\top}(X-\bar X))^{-1}(X-\bar X)^{\top}Y$, maybe I can simplify the problem.

Denoting $\bar X$ by a constant $c$, I wanted to see whether $\operatorname{Var}(XZ) = \operatorname{Var}((X-c)(Z-c))$. After some algebra, this does not hold.
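For the scalar analogy, a quick Monte Carlo check supports the algebra (my own toy setup: independent standard normals and an arbitrary constant $c$; analytically, $\operatorname{Var}(XZ) = 1$ while $\operatorname{Var}((X-c)(Z-c)) = 1 + 2c^2$):

```python
import numpy as np

# Monte Carlo check: is Var(X*Z) equal to Var((X-c)*(Z-c))?
rng = np.random.default_rng(1)
X = rng.normal(size=200_000)    # independent standard normal samples
Z = rng.normal(size=200_000)
c = 2.0                         # an arbitrary shift constant

v1 = np.var(X * Z)              # sample variance of the raw product
v2 = np.var((X - c) * (Z - c))  # sample variance of the shifted product
print(v1, v2)                   # theory: 1 vs 1 + 2*c**2 = 9
```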

I also came across discussions relating leverage and the Mahalanobis distance:

Why uncentered hat matrix can be used to measure the distance from the center of data?

Prove the relation between Mahalanobis distance and Leverage?

However, those questions are about leverage, i.e. their $x_i$ come from the training dataset rather than being a new observation.