How to estimate the error of fitting in a simple least squares problem?


Suppose we have estimated the model parameters $m$ of the equation $y=G*m$ from data $y$ as $m=(G'*G)^{-1} * G' * y$. From the measurement errors in $y$ we construct an error covariance matrix $C_y$. The error covariance matrix for $m$ is then estimated using the equation $C_m = (G'*C_y^{-1}*G)^{-1}$. Now suppose instead that the covariance matrix for $m$ is known: how do we estimate the covariance matrix for the data? That is, given a known $C_m$, how do we estimate $C_y$?

I know that the equation to estimate $C_y$ from $C_m$ is $C_y = G*C_m*G'$. Starting from a known $C_m$, I can first calculate $C_y$ using this equation and then get back my original $C_m$ using the equation for $C_m$. But starting from a known $C_y$, if I calculate $C_m$ first and then try to recover $C_y$ from it, I do not get the original $C_y$ back when $G$ is rectangular. For square $G$ it works. Why is this so?

Is my test flawed, or should $C_y$ be estimated some other way?
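For reference, the round trip described above can be written out as a short sketch. It assumes that $G$ is $n \times p$ with full column rank, that $C_y$ is invertible, and that the dimensionally consistent forms $C_m = (G' C_y^{-1} G)^{-1}$ and $C_y = G C_m G'$ are the ones intended. These assumptions are not stated explicitly in the question, and $\hat{C}_y$ is a name introduced here for the reconstructed matrix.

```latex
\begin{align*}
\hat{C}_y &= G\,C_m\,G' = G\,(G'\,C_y^{-1}\,G)^{-1}\,G', \\
\text{square, invertible } G:\quad
\hat{C}_y &= G\,G^{-1}\,C_y\,(G')^{-1}\,G' = C_y, \\
\text{rectangular } G\ (n > p):\quad
\operatorname{rank}(\hat{C}_y) &\le p < n .
\end{align*}
```

Under these assumptions, with $C_y = I$ the reconstructed matrix reduces to $G\,(G'G)^{-1}\,G'$, the orthogonal projector onto the column space of $G$, so it can equal the identity only when $n = p$.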

Numerical Example: Consider the MATLAB code for a simple curve fitting problem given below. The equation is $y = 1.5*x - 2*x^2 -0.1 +.73*x^3$. Consider the following two cases.

  1. Case 1:

     x = (1:4)';
     G = [x, x.^2, ones(length(x),1), x.^3];   % 4x4 design matrix (square)
     m = [1.5, -2, -0.1, .73]';
    
     % We start with an identity matrix as C_y, i.e.,
     C_y = eye(length(x));
     C_m = inv(G'*inv(C_y)*G);
     C_y_estimate = G*C_m*G';
     % In this example C_y_estimate is the identity matrix.
    
  2. Case 2:

    x = (1:10)'; % This is the only change from Case 1.
    G = [x, x.^2, ones(length(x),1), x.^3];   % 10x4 design matrix (rectangular)
    m = [1.5, -2, -0.1, .73]';
    
    % We start with an identity matrix as C_y, i.e.,
    C_y = eye(length(x));
    C_m = inv(G'*inv(C_y)*G)
    C_y_estimate = G*C_m*G'
    % In this example C_y_estimate is NOT an identity matrix.
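
One way to probe the rectangular case further is to inspect the rank of the reconstructed matrix and check whether it behaves like a projector. The snippet below is only a sketch: it assumes $G$ is $n \times p$ with $n = 10$ and $p = 4$, it uses the dimensionally consistent forms `inv(G'*inv(C_y)*G)` and `G*C_m*G'`, and the variable names `r`, `idem_err`, and `recon_err` are introduced here for illustration.

```matlab
% Sketch: inspect the reconstructed data covariance for the rectangular case.
% Assumption: G is n-by-p (n = 10, p = 4) with full column rank.
x = (1:10)';
G = [x, x.^2, ones(length(x),1), x.^3];
C_y = eye(length(x));

C_m = inv(G' * inv(C_y) * G);   % p-by-p parameter covariance
C_y_estimate = G * C_m * G';    % n-by-n, but built from only p columns

% Numerical rank: cannot exceed the number of columns of G
r = rank(C_y_estimate);

% Idempotence check: a projector P satisfies P*P = P (up to round-off)
idem_err = norm(C_y_estimate*C_y_estimate - C_y_estimate);

% Distance between the reconstruction and the original C_y
recon_err = norm(C_y_estimate - C_y);

fprintf('rank = %d, idempotence error = %g, reconstruction error = %g\n', ...
        r, idem_err, recon_err);
```

If the rank reported is smaller than `length(x)`, the reconstruction cannot equal the full-rank identity matrix, whatever the numerical precision. Setting `x = (1:4)'` gives the square case for comparison.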