Least squares fit with a trick


Fitting a circle by least squares is easy once you know the trick: substituting $c = r^2 - a^2 - b^2$ turns the problem into a linear system of equations.
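For reference, here is that circle trick in code form — a minimal sketch (my illustration, with hypothetical sample points), using a small hand-rolled solver so it is self-contained:

```python
# The circle equation (x-a)^2 + (y-b)^2 = r^2 rewrites as
#   x^2 + y^2 = 2a*x + 2b*y + c,   with c = r^2 - a^2 - b^2,
# which is linear in (a, b, c), so ordinary least squares applies.

def solve3(A, rhs):
    """Solve a 3x3 linear system by Gaussian elimination with partial pivoting."""
    n = 3
    M = [row[:] + [v] for row, v in zip(A, rhs)]
    for col in range(n):
        piv = max(range(col, n), key=lambda i: abs(M[i][col]))
        M[col], M[piv] = M[piv], M[col]
        for i in range(col + 1, n):
            f = M[i][col] / M[col][col]
            for k in range(col, n + 1):
                M[i][k] -= f * M[col][k]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        x[i] = (M[i][n] - sum(M[i][k] * x[k] for k in range(i + 1, n))) / M[i][i]
    return x

def fit_circle(pts):
    # Design-matrix rows (2x, 2y, 1), target x^2 + y^2; form the normal equations.
    rows = [(2.0 * x, 2.0 * y, 1.0) for x, y in pts]
    t = [x * x + y * y for x, y in pts]
    A = [[sum(r[i] * r[j] for r in rows) for j in range(3)] for i in range(3)]
    rhs = [sum(r[i] * ti for r, ti in zip(rows, t)) for i in range(3)]
    a, b, c = solve3(A, rhs)
    r = (c + a * a + b * b) ** 0.5   # undo the substitution c = r^2 - a^2 - b^2
    return a, b, r

pts = [(4, 2), (1, 5), (-2, 2), (1, -1)]   # exact circle: center (1, 2), radius 3
a, b, r = fit_circle(pts)
```

Since the sample points lie exactly on a circle, the least-squares fit recovers it exactly.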

My problem is as follows:

$$z = \alpha x + \beta y + \alpha \beta$$

Given a set of points $(x_i, y_i, z_i)$, how can I fit (in the least-squares sense) the best $\alpha$ and $\beta$?

Thanks!


BEST ANSWER

We'll try to minimize $\|c_1x+c_2y+c_1c_2 z-t\|^2$ (I use the notation of the code below to make comparison easier). Note that if we had a free variable $c_3$ instead of $c_1c_2$, there would be no problem (at least for non-degenerate data). So it is tempting to consider the constrained optimization $$ f(C)=\|c_1x+c_2y+c_3 z-t\|^2=(QC,C)-2(S,C)+\|t\|^2\to\min $$ $$ g(C)=2(c_1c_2-c_3)=(RC,C)-2(T,C)=0\,. $$ The Lagrange multiplier theorem for quadratic forms is extremely nice: every local conditional minimizer is a global minimizer of $f+qg$ for some $q$.
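For concreteness, here are the matrices and vectors these forms correspond to (they match the code below), with $(\cdot,\cdot)$ the Euclidean dot product of the data vectors:

$$
Q=\begin{pmatrix}(x,x)&(x,y)&(x,z)\\(x,y)&(y,y)&(y,z)\\(x,z)&(y,z)&(z,z)\end{pmatrix},\qquad
R=\begin{pmatrix}0&1&0\\1&0&0\\0&0&0\end{pmatrix},
$$
$$
S=\begin{pmatrix}(x,t)\\(y,t)\\(z,t)\end{pmatrix},\qquad
T=\begin{pmatrix}0\\0\\1\end{pmatrix},
$$

so that $(RC,C)=2c_1c_2$ and $(T,C)=c_3$, which gives $g(C)=2(c_1c_2-c_3)$.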

However, the global minimization is easy. First, we need to ensure that $Q+qR$ is non-negative definite (otherwise no global minimum exists). In our case, $\det(Q+qR)$ is a quadratic polynomial in $q$, so finding the initial admissible interval $[q-2dq,q+2dq]$ is not a problem. Next, the minimizer $C(q)$ is unique and given by $$ C(q)=(Q+qR)^{-1}(S+qT)\,, $$ provided that the matrix $Q+qR$ is non-degenerate. The minimum itself equals $\|t\|^2-\left((Q+qR)^{-1}(S+qT),(S+qT)\right)=\|t\|^2-F(q)$. Note that this last expression is always a lower bound for the minimum in the original problem, so the true $q$ is easy to recognize: it should minimize $F(q)$ (the true minimal value must be attained). The function $F(q)$ is convex, so minimizing it is a piece of cake (I just used a regular bisection-style search below). Once we know $q$, we know $C=C(q)$.

It looks neat and clean except for the "provided that". It turns out that the Lagrange global minimization problem is degenerate, and the algorithm fails, exactly when the original problem has multiple solutions. The ugly part of the code handles that case by moving along the possibly degenerate line to reach the correct ellipsoid surface (I preferred the ellipsoid to the paraboloid because of stability issues; if there is no degeneracy, there is no real movement either).

// Asymptote code
import math;

// Minimize |c1*x + c2*y + c3*z - t|^2 subject to c3 = c1*c2;
// returns {c1, c2, c3, c1*c2 - c3} (the last entry should be ~0).
real[] FF(real[] x,real[]y,real[]z,real[]t)
{

real 
xx=dot(x,x),xy=dot(x,y),xz=dot(x,z),
yy=dot(y,y),yz=dot(y,z),zz=dot(z,z),
xt=dot(x,t),yt=dot(y,t),zt=dot(z,t),
tt=dot(t,t);



// Q, S: quadratic and linear parts of f; R, T: those of the constraint g
real[][] 
Q={{xx,xy,xz},{xy,yy,yz},{xz,yz,zz}},
R={{0,1,0},{1,0,0},{0,0,0}},
S={{xt},{yt},{zt}},
T={{0},{0},{1}};

// det(Q+qR) is a concave quadratic in q; start q at its vertex, with 2*dq
// just under the distance to its roots, so [q-2dq,q+2dq] stays admissible
real ddd=xz*yz-xy*zz;
real q=ddd/zz,dq=0.4999999*sqrt(ddd^2+zz*determinant(Q))/zz;


// F(q) = ((Q+qR)^{-1}(S+qT), S+qT); the minimum of f+qg equals |t|^2 - F(q)
real F(real q)
{
return (transpose(S+q*T)*inverse(Q+q*R)*(S+q*T))[0][0];
}

// derivative-free descent on the convex F: probe q+-dq, then halve the step
for (int k=0;k<25;++k)
{
if(F(q+dq)<F(q)) q+=dq; else if(F(q-dq)<F(q)) q-=dq;
dq*=0.5;
}


real[][] ans=inverse(Q+q*R)*(S+q*T);
real a=ans[0][0], b=ans[1][0], c=ans[2][0]; 

// cross product of rows 1 and 3 of Q+qR: spans its kernel in the degenerate case
real[][] dirr={{(xy+q)*zz-xz*yz},{xz^2-xx*zz},{xx*yz-(xy+q)*xz}}; 

// solve a quadratic in t for the shift along dirr that lands on the correct
// ellipsoid surface; pick the root of smaller magnitude
real C=F(q)+(transpose(ans)*Q*ans)[0][0]-2*(a*xt+b*yt+c*zt),
A=(transpose(dirr)*Q*dirr)[0][0], 
B=(transpose(S)*dirr)[0][0]-(transpose(ans)*Q*dirr)[0][0];
B/=A; C=sqrt(max(B^2-C/A,0));
real t1=B-C, t2=B+C, t=(t1^2>t2^2?t2:t1); 

ans+=t*dirr;
a=ans[0][0]; b=ans[1][0]; c=ans[2][0];

return new real[] {a,b,c,a*b-c};
}


// toy test data: this instance has two symmetric minimizers
real[] x={1,0,0},y={0,1,0},z={0,0,1},t={0,0,5};

real[] FF=FF(x,y,z,t);
write(FF);

real a=FF[0],b=FF[1];
real[] err=a*x+b*y+a*b*z-t;
write(dot(err,err));

pause(); 
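As a sanity check (my addition, not part of the original answer): for the toy data at the end, the objective collapses to a two-variable function that a brute-force grid search can examine.

```python
# For x=(1,0,0), y=(0,1,0), z=(0,0,1), t=(0,0,5), the objective
# ||c1*x + c2*y + c1*c2*z - t||^2 collapses to c1^2 + c2^2 + (c1*c2 - 5)^2.
def obj(c1, c2):
    return c1 * c1 + c2 * c2 + (c1 * c2 - 5) ** 2

# Coarse grid search over [-4, 4] x [-4, 4] with step 0.01.
best = min(
    ((obj(i / 100, j / 100), i / 100, j / 100)
     for i in range(-400, 401) for j in range(-400, 401)),
    key=lambda triple: triple[0],
)
# The minimum value 9 is attained at both (2, 2) and (-2, -2): exactly the
# multiple-solution degeneracy that the "ugly part" of the code must handle.
```

This confirms the example data is a degenerate instance with two symmetric minimizers, so the extra shift along dirr is actually exercised.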
ANOTHER ANSWER

At the outset you want to minimise the expression

$$S(\alpha,\beta)=\sum \delta_i^2=\sum|z_i-(\alpha x_i+\beta y_i+\alpha\beta)|^2=\sum(z_i-\alpha x_i-\beta y_i-\alpha\beta)^2.$$

Differentiate with respect to $\alpha$ and $\beta$ and set the derivatives to zero to find the critical points:

$$\frac{\partial S}{\partial \alpha}=\sum2(z_i-\alpha x_i-\beta y_i-\alpha\beta)(-x_i-\beta)\overset{!}{=}0,$$

and

$$\frac{\partial S}{\partial \beta}=\sum2(z_i-\alpha x_i-\beta y_i-\alpha\beta)(-y_i-\alpha)\overset{!}{=}0.$$

This yields the normal equations:

$$\sum x_iz_i+\beta\sum z_i=\alpha\sum x_i^2+\beta \sum x_i y_i+2\alpha \beta\sum x_i+\beta^2 \sum y_i+n\alpha \beta^2$$

and

$$\sum y_iz_i+\alpha\sum z_i=\alpha\sum x_iy_i+\beta\sum y_i^2+2\alpha \beta \sum y_i+\alpha^2\sum x_i+n \alpha^2\beta,$$

where $n$ is the number of data points.

Now, whether these equations have a (unique) solution is another matter entirely. Why not use an independent constant $c$ rather than $\alpha\beta$?
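A minimal numerical sketch of solving these coupled equations (my addition, not from the answer): alternating least squares. With $\beta$ fixed, the model $z_i = \alpha(x_i+\beta) + \beta y_i$ is linear in $\alpha$, and symmetrically with $\alpha$ fixed, so one can alternate the two one-dimensional solves. The test values $\alpha=1.5$, $\beta=-0.5$ and the sample points are hypothetical.

```python
def fit_alternating(xs, ys, zs, iters=500):
    alpha, beta = 0.0, 0.0
    for _ in range(iters):
        # beta fixed: residual z_i - beta*y_i = alpha*(x_i + beta), linear in alpha
        u = [x + beta for x in xs]
        alpha = (sum((z - beta * y) * ui for z, y, ui in zip(zs, ys, u))
                 / sum(ui * ui for ui in u))
        # alpha fixed: residual z_i - alpha*x_i = beta*(y_i + alpha), linear in beta
        v = [y + alpha for y in ys]
        beta = (sum((z - alpha * x) * vi for z, x, vi in zip(zs, xs, v))
                / sum(vi * vi for vi in v))
    return alpha, beta

# Exact data generated with alpha = 1.5, beta = -0.5 (hypothetical test values)
xs = [1, 2, 3, 4, 5]
ys = [2, 1, 0, -1, 3]
zs = [1.5 * x - 0.5 * y + 1.5 * (-0.5) for x, y in zip(xs, ys)]
alpha, beta = fit_alternating(xs, ys, zs)
```

Each half-step is exactly one of the two derivative conditions above with the other variable frozen, and never increases $S$, so the iteration settles at a critical point of the normal equations.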