I am working on a few linear algebra problems, and I am stuck. I was hoping to get some directions on this site. Before I state my questions, here's the necessary context:
Given a matrix $X$ consisting of three explanatory variables (column vectors) $\vec{x}_1$, $\vec{x}_2$, and $\vec{x}_3$, and a response vector $\vec{y}$, I have implemented two versions of linear regression in FreeMat: i) centered variables, and ii) non-centered variables. The scripts are as follows:
There are two "homemade" functions here: sentrer(y), which returns the centered vector $\vec{y}_s$, and sentrermat(X), which returns the centered matrix $X_s$.
Centered variables:
function [yhats, b] = LSQ0(X,y)
    ys = sentrer(y);             % centered response
    Xs = sentrermat(X);          % centered explanatory variables
    b = inv(Xs.'*Xs)*Xs.'*ys;    % normal equations
    yhats = Xs*b;                % fitted (centered) response
Here yhats is the estimated (centered) response and b is the coefficient vector.
Non-centered variables:
function [yhat, b0, b] = LSQ1(X,y)
    [nX,pX] = size(X);
    v = ones(nX,1);                     % intercept column
    Xaug = [v,X];                       % augmented design matrix
    coef = inv(Xaug.'*Xaug)*Xaug.'*y;   % normal equations
    b = coef(2:length(coef),:);         % slope coefficients b1..b3
    b0 = coef(1,:);                     % intercept
    yhat = Xaug*coef;                   % fitted response
To get the proper output, I collect all four coefficients ($b_0$, $b_1$, $b_2$, and $b_3$) in coef, and then extract whatever I need: the intercept $b_0$ and the coefficient vector $\vec{b}$.
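For reference, here is my own NumPy translation of both scripts (sentrer/sentrermat replaced by mean subtraction; the data is just random numbers for testing, not from the actual problem). It reproduces the two observations my questions below are about:

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.standard_normal((20, 3))   # three explanatory variables
y = rng.standard_normal(20)        # response

# LSQ0: centered variables (sentrer / sentrermat = subtract column means)
Xs = X - X.mean(axis=0)
ys = y - y.mean()
b_centered = np.linalg.solve(Xs.T @ Xs, Xs.T @ ys)  # normal equations
yhats = Xs @ b_centered

# LSQ1: non-centered variables with an intercept column
Xaug = np.column_stack([np.ones(len(y)), X])
coef = np.linalg.solve(Xaug.T @ Xaug, Xaug.T @ y)
b0, b = coef[0], coef[1:]
yhat = Xaug @ coef

print(np.allclose(b_centered, b))          # True: same slopes (question 1)
print(np.allclose(ys - yhats, y - yhat))   # True: same residuals (question 2)
```

(I use solve instead of inv here only because it is the idiomatic NumPy way to solve the normal equations; the result is the same.)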
Now to the questions:
1) Apparently there is no need to center the response in the code for LSQ0 (centered variables); the coefficients come out the same either way. I am not able to show mathematically why centering the response is not crucial here. I would appreciate it very much if someone would point me in the right direction.
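My own attempt so far, assuming each column of $X_s$ sums to zero so that $X_s^\top \vec{1} = \vec{0}$:
$$X_s^\top \vec{y}_s = X_s^\top(\vec{y} - \bar{y}\vec{1}) = X_s^\top\vec{y} - \bar{y}\,X_s^\top\vec{1} = X_s^\top\vec{y},$$
which would give $b = (X_s^\top X_s)^{-1}X_s^\top\vec{y}_s = (X_s^\top X_s)^{-1}X_s^\top\vec{y}$, i.e. the same coefficients whether or not $\vec{y}$ is centered. Is this the right track?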
2) The residual errors for the centered and non-centered models are defined as $\vec{e}_s = \vec{y}_s - \hat{\vec{y}}_s$ and $\vec{e} = \vec{y} - \hat{\vec{y}}$, respectively. Apparently these two residual vectors are identical, and I would like to show mathematically why. Intuitively it makes some sense: the centered model only subtracts the corresponding mean from each variable, so the "relationships" between the variables are untouched, and the two definitions above should coincide. That argument is rather vague, though, and a more robust explanation would be nice. Any pointers here would be appreciated.
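Here is how far I get on this one, using the claim from question 1 that the slope vector $\vec{b}$ is the same in both fits, and assuming the first normal equation forces $b_0 = \bar{y} - \bar{x}^\top\vec{b}$ (where $\bar{x}$ is the vector of column means of $X$):
$$\hat{\vec{y}} = b_0\vec{1} + X\vec{b} = \bar{y}\vec{1} + (X - \vec{1}\bar{x}^\top)\vec{b} = \bar{y}\vec{1} + X_s\vec{b} = \bar{y}\vec{1} + \hat{\vec{y}}_s,$$
so that $\vec{e} = \vec{y} - \hat{\vec{y}} = (\vec{y} - \bar{y}\vec{1}) - \hat{\vec{y}}_s = \vec{y}_s - \hat{\vec{y}}_s = \vec{e}_s$. Does this hold up?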
3) The total variance $t^2$ can be decomposed into the following relation $$t^2 = e^2 + p^2$$
where $t^2 = \langle\vec{y}_s,\vec{y}_s\rangle$ is the total variance, $e^2 = \langle\vec{e},\vec{e}\rangle$ the unexplained variance, and $p^2 = \langle\hat{\vec{y}}_s,\hat{\vec{y}}_s\rangle$ the explained variance. I am to show that this relation always holds, and there is mention of a fundamental geometrical identity from linear algebra that could be used.
I know that the inner product $\langle\vec{e},\hat{\vec{y}}_s\rangle = 0$, since the residual is orthogonal to the fitted values, but I am not sure how to proceed from there, or how that inner product is even relevant.
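Using that orthogonality and writing $\vec{y}_s = \hat{\vec{y}}_s + \vec{e}$, my attempt is to expand
$$t^2 = \langle\hat{\vec{y}}_s + \vec{e},\,\hat{\vec{y}}_s + \vec{e}\rangle = \langle\hat{\vec{y}}_s,\hat{\vec{y}}_s\rangle + 2\langle\hat{\vec{y}}_s,\vec{e}\rangle + \langle\vec{e},\vec{e}\rangle = p^2 + 0 + e^2,$$
which looks like the Pythagorean theorem. Is that the "fundamental geometrical identity" being hinted at?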
4) The multiple correlation coefficient is
$$ r^2_{X,y} = \frac{p^2}{t^2} $$
and the squared correlation between $\vec{y}$ and $\hat{\vec{y}}$ is $r^2_{\vec{y},\hat{\vec{y}}}$.
These two correlation coefficients are apparently identical, but I am not able to work out why.
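To at least convince myself numerically, a quick NumPy sketch (my own translation of the centered script; the data is made up):

```python
import numpy as np

rng = np.random.default_rng(1)
X = rng.standard_normal((30, 3))
y = X @ np.array([1.0, -2.0, 0.5]) + rng.standard_normal(30)

# centered least squares fit
Xs = X - X.mean(axis=0)
ys = y - y.mean()
b = np.linalg.solve(Xs.T @ Xs, Xs.T @ ys)
yhats = Xs @ b

t2 = ys @ ys                  # total (sum of squares of centered y)
p2 = yhats @ yhats            # explained
r2_Xy = p2 / t2               # multiple correlation coefficient

yhat = yhats + y.mean()       # non-centered fitted values
r = np.corrcoef(y, yhat)[0, 1]
print(np.isclose(r2_Xy, r**2))  # True
```

So they agree numerically; I just cannot see the algebraic reason.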
I realize this is a lot of questions, and that some are most likely interconnected. I have spent quite some time on this, and I have gone blind on how to proceed. Any pointers would be much appreciated. Do not feel obligated to answer every question; I would not expect anyone to do that.