This is probably fairly obvious, but I'm seeking clarification on something I came across regarding ordinary least squares fitting in 3d.
If by minimizing the sum $\sum_i |ax_i+by_i+c-z_i|^2$ we construct the plane $f(x,y)=ax+by+c$ so that the sum of squared distances of our points from this plane is the smallest, what do we get by instead looking for planes $g(x,z)$ and $h(y,z)$?
Just by guessing, I'd think they should represent the same plane, but if I make up some points and try to fit them (using $A^\dagger Ax=A^\dagger b$), it's clear that they don't. Curiously, when the points are almost coplanar, I get planes that are very similar to each other, but not identical.
I hope I'm making myself sufficiently understood. The question is: if the plane $f(x,y)$ is the least squares fit, how should I interpret the planes $g(x,z)$ and $h(y,z)$? What is being minimized? Thanks for any help.
The model $z=f(x,y)=ax+by+c$ describes a plane that crosses the $z$-axis (at height $c$), so it can never be parallel to that axis. The quantity $$ |f(x_i,y_i) - z_i|^2 $$ is the squared distance between the plane and the $i^\text{th}$ point measured in the $z$-direction, and the fit minimizes the sum of these quantities over all points.
Similarly, for the model $y=g(x,z)$, $$ |g(x_i,z_i) - y_i|^2 $$ is the squared distance between the plane and the $i^\text{th}$ point measured in the $y$-direction.
For the model $x=h(y,z)$, $$ |h(y_i,z_i) - x_i|^2 $$ is the squared distance between the plane and the $i^\text{th}$ point measured in the $x$-direction.
As you have discovered, varying the direction along which the "error" between the data and the model is measured changes the total "error", and so alters the model fit.
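A small numerical experiment can make this concrete. The sketch below is only an illustration, not anything taken from the question: the helper name `fit_plane`, the test plane $z = 0.5x - 0.3y + 2$, and the noise level are all assumptions. It fits each coordinate in turn as an affine function of the other two with `numpy.linalg.lstsq`, then rewrites every fit as a unit normal $n$ and offset $d$ with $n \cdot p + d = 0$ so the three planes can be compared directly.

```python
import numpy as np

def fit_plane(points, dep):
    """Least-squares fit of coordinate `dep` (0=x, 1=y, 2=z) as an affine
    function of the other two coordinates.

    Returns (n, d) with n a unit normal and n . p + d = 0 on the fitted plane.
    """
    u, v = [k for k in range(3) if k != dep]
    # Design matrix for the model  p[dep] = a * p[u] + b * p[v] + c
    A = np.column_stack([points[:, u], points[:, v], np.ones(len(points))])
    (a, b, c), *_ = np.linalg.lstsq(A, points[:, dep], rcond=None)
    # Rewrite  a*p[u] + b*p[v] - p[dep] + c = 0  in normal form.
    n = np.zeros(3)
    n[u], n[v], n[dep] = a, b, -1.0
    scale = np.linalg.norm(n)
    return n / scale, c / scale

# Hypothetical data: points on z = 0.5x - 0.3y + 2, with and without noise in z.
rng = np.random.default_rng(0)
xy = rng.uniform(-1.0, 1.0, size=(200, 2))
z_exact = 0.5 * xy[:, 0] - 0.3 * xy[:, 1] + 2.0
flat = np.column_stack([xy, z_exact])
noisy = np.column_stack([xy, z_exact + rng.normal(0.0, 0.5, size=200)])

flat_fits = [fit_plane(flat, dep) for dep in range(3)]    # h, g, f on exact data
noisy_fits = [fit_plane(noisy, dep) for dep in range(3)]  # h, g, f on noisy data

for name, (n, d) in zip(["x = h(y,z)", "y = g(x,z)", "z = f(x,y)"], noisy_fits):
    print(name, "normal:", np.round(n, 3), "offset:", round(float(d), 3))
```

On the exactly coplanar data all three fits recover the same plane (the normals agree up to sign), because every residual can be driven to zero whichever direction it is measured in. On the noisy data the three normals point in noticeably different directions, since each fit minimizes the error along a different axis.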