Why is Deming regression not defined for some sequences? What characterizes them?

In a nutshell, the purpose of Deming regression is to find a line that minimizes the sum of squared errors with respect to a given set of points. For a definition, please refer to Wikipedia. (I'm using $\delta=1$.)

One notices that $\hat{\beta}_0$ is not defined (i.e. no line is found) if $s_{xy}=0$. What surprises me is the kind of sequences for which $s_{xy}=0$.

A simple example: if you have a sequence of $n$ points $p_1=(x_1,y_1), p_2=(x_2,y_2),\dots$ with $y_i=y_j$ for all $i,j$, then $\bar{y}=y_1$, which implies $(y_i-\bar{y})=0$ for all $i$, so $s_{xy}=0$. This should have been a very easy sequence, right?

Another example with three points is $(1,4),(2,1),(3,4)$: $\bar{x}=2$, $\bar{y}=3$, therefore $s_{xy}=\frac{1}{2}\left((-1)\cdot 1+0\cdot(-2)+1\cdot 1\right)=0$.
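To double-check the arithmetic, here is a minimal Python sketch that computes $s_{xy}$ with the divisor $n-1$ used above. The helper name `sample_cov` is mine, and the constant $y$ value 5 in the first call is an arbitrary choice.

```python
def sample_cov(points):
    """Sample covariance s_xy of the x and y coordinates, with divisor n - 1."""
    n = len(points)
    x_bar = sum(x for x, _ in points) / n
    y_bar = sum(y for _, y in points) / n
    return sum((x - x_bar) * (y - y_bar) for x, y in points) / (n - 1)

# Constant y (the "easy" sequence) and the three-point example from above.
print(sample_cov([(1, 5), (2, 5), (3, 5)]))
print(sample_cov([(1, 4), (2, 1), (3, 4)]))
```

Both calls should come out as zero, matching the hand calculation.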

1. Why these sequences? What does it mean if $s_{xy}=0$?

I'm using a variant of the form on Wikipedia, because I need the regression line in parametric form $L=\vec{d}t+\vec{c}$. For this I use $\vec{c}=(\bar{x},\bar{y})$ and $\vec{d}=\left(2s_{xy},\ s_{yy}-s_{xx}+\sqrt{(s_{xx}-s_{yy})^2+4s_{xy}^2}\right)$. Thus, if only the divisor in the original definition, $2s_{xy}$, is 0, I get a vector along the y axis, which is fine. However, sometimes the numerator is zero as well, for example for $(1,2),(2,1),(3,2)$ or $(8,1),(2,2),(8,3)$.
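For concreteness, the parametric form can be sketched in Python as follows. The function names are hypothetical, and I'm assuming the divisor $n-1$ for $s_{xx}$, $s_{yy}$ and $s_{xy}$ (any common divisor only rescales $\vec{d}$).

```python
import math

def second_moments(points):
    """Return (x_bar, y_bar, s_xx, s_yy, s_xy), using the divisor n - 1."""
    n = len(points)
    x_bar = sum(x for x, _ in points) / n
    y_bar = sum(y for _, y in points) / n
    s_xx = sum((x - x_bar) ** 2 for x, _ in points) / (n - 1)
    s_yy = sum((y - y_bar) ** 2 for _, y in points) / (n - 1)
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in points) / (n - 1)
    return x_bar, y_bar, s_xx, s_yy, s_xy

def parametric_line(points):
    """Return (c, d) for the line L = d*t + c, with delta = 1."""
    x_bar, y_bar, s_xx, s_yy, s_xy = second_moments(points)
    c = (x_bar, y_bar)
    d = (2 * s_xy,
         s_yy - s_xx + math.sqrt((s_xx - s_yy) ** 2 + 4 * s_xy ** 2))
    return c, d

for pts in ([(0, 1), (0, 2), (0, 3)],   # only 2*s_xy vanishes: d points along y
            [(1, 2), (2, 1), (3, 2)],   # both components vanish (up to rounding)
            [(8, 1), (2, 2), (8, 3)]):  # both components vanish
    print(pts, parametric_line(pts))
```

In the last two cases $\vec{d}$ is (numerically) the zero vector, so the parametric form does not describe a line.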

2. What characterizes these points?

Update:

Re 2: If $s_{xy}=0$, what remains of the numerator of $\hat{\beta}_1$ is $s_{yy}-s_{xx}+\sqrt{(s_{xx}-s_{yy})^2}=s_{yy}-s_{xx}+|s_{xx}-s_{yy}|$, which is zero iff $s_{xx}\geq s_{yy}$. That means, if $s_{xy}=0$ for some sequence but the numerator is not zero, one can make it zero by mirroring the points across the first diagonal. Consider question 2 answered.
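A quick numerical check of the mirroring argument, as a sketch only. The helper names `slope_numerator` and `mirror` are mine, and the divisor $n-1$ is assumed.

```python
import math

def slope_numerator(points):
    """s_yy - s_xx + sqrt((s_xx - s_yy)^2 + 4*s_xy^2), with divisor n - 1."""
    n = len(points)
    x_bar = sum(x for x, _ in points) / n
    y_bar = sum(y for _, y in points) / n
    s_xx = sum((x - x_bar) ** 2 for x, _ in points) / (n - 1)
    s_yy = sum((y - y_bar) ** 2 for _, y in points) / (n - 1)
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in points) / (n - 1)
    return s_yy - s_xx + math.sqrt((s_xx - s_yy) ** 2 + 4 * s_xy ** 2)

def mirror(points):
    """Reflect every point across the first diagonal y = x (swap coordinates)."""
    return [(y, x) for x, y in points]

pts = [(8, 1), (2, 2), (8, 3)]       # s_xy = 0 and s_xx > s_yy
print(slope_numerator(pts))          # zero: s_xx >= s_yy
print(slope_numerator(mirror(pts)))  # positive: the roles of s_xx and s_yy swap
```

Mirroring swaps $s_{xx}$ and $s_{yy}$ and leaves $s_{xy}$ unchanged, which is why the numerator switches between zero and non-zero whenever $s_{xx}\neq s_{yy}$.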

Re 1: One group of point sequences for which $s_{xy}=0$ consists of those that are symmetric about the vertical line through $\bar{x}$ or the horizontal line through $\bar{y}$. However, these are not all. A counterexample is $(1,1.5),(2,2),(3,0),(4,1),(5,2)$.

Update 2:

The sequences for which $s_{xy}=0$ seem to fall into two groups:

  1. sequences for which the solution would be parallel to one of the axes (such as $(1,0),(2,0),(3,0)$ or $(0,1),(0,2),(0,3)$). These sequences have solutions if you rotate all points.
  2. sequences for which no unique solution exists (such as $(1,1),(1,-1),(-1,1),(-1,-1)$). These don't work even after rotation.
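The difference between the two groups can be probed numerically. The sketch below rotates the points about the origin and checks whether the direction vector $\vec{d}=\left(2s_{xy},\ s_{yy}-s_{xx}+\sqrt{(s_{xx}-s_{yy})^2+4s_{xy}^2}\right)$ is still the zero vector. The helper names, the tolerance `tol`, and the rotation angles are arbitrary choices of mine, and this is only an experiment, not a proof.

```python
import math

def direction(points):
    """Direction vector d of the parametric form (delta = 1, divisor n - 1)."""
    n = len(points)
    x_bar = sum(x for x, _ in points) / n
    y_bar = sum(y for _, y in points) / n
    s_xx = sum((x - x_bar) ** 2 for x, _ in points) / (n - 1)
    s_yy = sum((y - y_bar) ** 2 for _, y in points) / (n - 1)
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in points) / (n - 1)
    return (2 * s_xy,
            s_yy - s_xx + math.sqrt((s_xx - s_yy) ** 2 + 4 * s_xy ** 2))

def rotate(points, angle):
    """Rotate all points counter-clockwise about the origin by `angle` radians."""
    c, s = math.cos(angle), math.sin(angle)
    return [(c * x - s * y, s * x + c * y) for x, y in points]

def is_degenerate(points, tol=1e-9):
    """True if d is the zero vector up to the (arbitrary) tolerance tol."""
    dx, dy = direction(points)
    return math.hypot(dx, dy) < tol

line = [(1, 0), (2, 0), (3, 0)]               # group 1
square = [(1, 1), (1, -1), (-1, 1), (-1, -1)]  # group 2

print(is_degenerate(line), is_degenerate(rotate(line, math.pi / 4)))
print([is_degenerate(rotate(square, a)) for a in (0.0, 0.3, 1.0)])
```

For the square, $s_{xx}=s_{yy}$ and $s_{xy}=0$ before the rotation, and the experiment suggests that this stays true (up to rounding) for every angle, whereas the horizontal line stops being degenerate as soon as it is rotated off the axis.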

What's now missing is a proof...