How do I solve the weighted normal equations?

3.2k Views Asked by At

I am trying to solve the normal equations for a 3D LSE of a general quadric: $$ z = ax^2 + bx + cxy + dy^2 + ey + f$$

Write as a vector equation: $$ \vec{z}= \bf{X}\vec{\beta}$$ where the 'ith row of X is: $$ Xi = [x^2, x, xy, y^2, y, 1]$$ and $$ \vec{\beta} =[a, b, c, d, e, f]^T$$

In addition I want to weight the points with the following function: $$ w(x,y)=(1-(x^2+y^2)^{1.5}/r^3)^3 $$ if $$ (x^2+y^2) < r^2 $$ 0 otherwise (for some user specified distance r).

According to wikipedia: the normal equations with weights should be: $$ ({\bf{X}}^T \bf{W} \bf{X}) \vec{\beta}= ({\bf{X}}^T \bf{W}) \vec{z} $$ where W is a diagonal matrix with the n w values along the diagonal.

How do I solve this equation in a fast and robust manner?

The wikipage e.g. derives the SVD solution for the non weighted case, but how is this for the weighted case?

1

There are 1 best solutions below

5
On BEST ANSWER

Let me begin by saying that the normal equations are an inefficient method of solving a least squares problem.

Mathematically, the problem which you are trying to solve using the normal equations for $\vec{\beta}$ with no weighting is equivalent to $$ \vec{\beta}=\operatorname{arg min}\limits_b\;\left\|X\vec{b}-\vec{z}\right\|_2^2. $$ Expanding this norm under the usual inner product $\langle x,y\rangle=x^Ty$ gives $$ \left\|X\vec{b}-\vec{z}\right\|_2^2=\vec{b}^TX^TX\vec{b}-2\vec{b}^TX^T\vec{z}+\vec{z}^T\vec{z}. $$ This is a quadratic form in $\vec{b}$, which can be solved by differentiating this expression wrt $\vec{b}$ and solving the resulting system of linear equations: $$ X^TX\vec{b}=X^T\vec{z}, $$ which is the derivation of the normal equations.

If the inner product were a non-standard inner product, such as $\langle x,y\rangle_W=x^TWy$, then following the same derivation, you obtain the weighted normal equations: $$ X^TWX\vec{b}=X^TW\vec{z}. $$

My reason for this description is that the description of the effect of the non-standard inner product. Following back through the steps, it should become plain that this is equivalent to the following problem, using the standard inner product: $$ \vec{\beta}=\operatorname{arg min}\limits_b\;\left\|W^{1/2}\left(X\vec{b}-\vec{z}\right)\right\|_2^2. $$

Now, the QR decomposition can be used to efficiently solve the problem $\min\limits_x\|A\vec{x}-\vec{d}\|$, by factoring $A=QR$, where $Q$ is an orthonormal matrix ($Q^TQ=I$) and $R$ is upper triangular. The linear system becomes $QR\vec{x}=\vec{d},$ which upon left multiplying by $Q^T$ gives $$R\vec{x}=Q^T\vec{d},$$ since $Q^TQ=I$. The solution of this linear system $\vec{x}$ is guaranteed to be the solution which minimises $\|A\vec{x}-\vec{d}\|$.

The relationship of this to your problem is $A=W^{1/2}X$ and $\vec{d}=W^{1/2}\vec{z}$. This is a much more efficient solution method than the normal equations. It is also numerically superior to the normal equations.