Line with minimum average distance to a set of points in $\mathbb{R}^2$


I'm trying to solve the following problem (purely out of curiosity):

Let $P_1=(x_1,y_1),\dots,P_n=(x_n,y_n)$ be $n\geq 1$ known distinct points in $\mathbb{R}^2$. Find $m$ and $b$ (not necessarily unique) such that the average of the perpendicular distances from the line $y=mx+b$ to the points is minimal.

My first attempt was to write down the average of the distances from the line to the points, and I obtained

$$\overline d(m,b)=\sum_{i=1}^n\frac{|mx_i-y_i+b|}{n\sqrt{m^2+1}}$$ The idea was to treat this as an ordinary two-variable function and apply the standard optimization method. But I got stuck when I tried to compute the partial derivatives of this expression. The main problem with this approach is that, to get rid of the absolute values, we need to know the position of each point relative to the line, and this is something I want to avoid assuming.
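For numerical experiments, $\overline d$ translates directly into code. A minimal sketch (the points and the line below are arbitrary illustrative values):

```python
import numpy as np

def avg_perp_distance(m, b, xs, ys):
    """Average perpendicular distance from the line y = m*x + b
    to the points (xs[i], ys[i])."""
    xs, ys = np.asarray(xs, float), np.asarray(ys, float)
    return np.mean(np.abs(m * xs - ys + b)) / np.sqrt(m**2 + 1)

# Two points symmetric about the line y = x: each lies at distance 1/sqrt(2)
print(avg_perp_distance(1.0, 0.0, [0.0, 1.0], [1.0, 0.0]))  # ≈ 0.70710678
```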

If $n=1$ it is obvious that any line through $P_1$ will do. For $n=2$, the line through $P_1$ and $P_2$ has average distance $0$, so it is optimal. From there on it is a lot more complicated, as expected.

Any ideas on how to approach this problem?


There are 3 solutions below.


As you wrote, from a formal point of view the problem seems to be very difficult (to say the least!).

One of the key problems is finding initial estimates for $m$ and $b$. What I would try as a first step is to minimize, with respect to $m$ and $b$, $$ f(m,b)=\sum_{i=1}^n\frac{(mx_i-y_i+b)^2}{m^2+1}$$ Computing the partial derivatives and setting them equal to $0$, we can, from $f_b(m,b)=0$, express $b$ as a linear function of $m$ (namely $b=\bar y-m\bar x$), and then $f_m(m,b)=0$ reduces to a quadratic equation in $m$. The root to keep is the one with the same sign as the $m$ from the ordinary linear regression.
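This first step can be done in closed form. A sketch: writing $S_{xx}$, $S_{yy}$, $S_{xy}$ for the centered second moments of the data, $f_b=0$ gives $b=\bar y-m\bar x$, and substituting into $f_m=0$ leaves the quadratic $S_{xy}m^2+(S_{xx}-S_{yy})m-S_{xy}=0$ (the root choice below assumes $S_{xy}>0$, i.e. a positive regression slope):

```python
import numpy as np

def init_line(xs, ys):
    """Minimize f(m, b) = sum((m*x_i - y_i + b)^2) / (m^2 + 1).

    f_b = 0 gives b = ybar - m*xbar; substituting into f_m = 0 leaves
    the quadratic  Sxy*m^2 + (Sxx - Syy)*m - Sxy = 0  in m.
    Assumes Sxy != 0; the '+' root is the minimizer when Sxy > 0."""
    xs, ys = np.asarray(xs, float), np.asarray(ys, float)
    dx, dy = xs - xs.mean(), ys - ys.mean()
    sxx, syy, sxy = dx @ dx, dy @ dy, dx @ dy
    m = ((syy - sxx) + np.hypot(syy - sxx, 2 * sxy)) / (2 * sxy)
    return m, ys.mean() - m * xs.mean()

# Data from the worked example below
m, b = init_line([1, 2, 3, 4, 5], [4, 8, 11, 10, 13])
print(m, b)  # ≈ 2.27882 2.36353
```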

From here, hoping that you have a robust optimization algorithm, we can hope to get the minimum of $$\overline d(m,b)=\sum_{i=1}^n\frac{|mx_i-y_i+b|}{\sqrt{m^2+1}}$$ (dropping the constant factor $\frac 1n$, which does not change the minimizer).

I tried using the following data points $$\left( \begin{array}{cc} x & y \\ 1 & 4 \\ 2 & 8 \\ 3 & 11 \\ 4 & 10 \\ 5 & 13 \end{array} \right)$$ The preliminary step leads to $$m=\frac{1}{25} \left(23+\sqrt{1154}\right)\approx 2.27882 \qquad b=\frac{1}{25} \left(161-3 \sqrt{1154}\right)\approx 2.36353$$ which gives $ \overline d(m,b)\approx 2.31363$.

The second step leads to the final result $m=2.25$, $b=1.75$, for which $ \overline d(m,b)\approx 2.03069$.
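The second step can be sketched with `scipy.optimize.minimize` (an assumed choice of solver; any robust derivative-free method would do, since the objective is not differentiable everywhere):

```python
import numpy as np
from scipy.optimize import minimize

xs = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
ys = np.array([4.0, 8.0, 11.0, 10.0, 13.0])

def dbar(p):
    """Sum of perpendicular distances from the points to y = m*x + b."""
    m, b = p
    return np.sum(np.abs(m * xs - ys + b)) / np.sqrt(m**2 + 1)

# Start from the least-squares estimate of the preliminary step;
# Nelder-Mead avoids derivatives of the absolute values.
res = minimize(dbar, x0=[2.27882, 2.36353], method="Nelder-Mead")
print(res.x, res.fun)  # ≈ [2.25, 1.75], 2.0307
```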


Maybe the following helps:

Consider the corresponding one-dimensional problem: you are given a finite set $\{t_1,\ldots,t_n\}$ of points in ${\mathbb R}$ and want the point $\tau$ minimizing $\sum_{i=1}^n |t_i-\tau|$. If $t_1<t_2<\cdots<t_n$, then $\tau$ is the middle entry of this list when $n$ is odd (or any value between the two central entries when $n$ is even), regardless of the actual values of the $t_i$.
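This fact is easy to check numerically; a minimal sketch with `numpy` (illustrative data):

```python
import numpy as np

def best_tau(ts):
    """Minimizer of g(tau) = sum(|t_i - tau|): a median of the t_i.
    For even n, numpy returns the midpoint of the two central values,
    which is one of the (many) optimal choices."""
    return float(np.median(ts))

ts = [1.0, 2.0, 7.0, 50.0, 9.0]
tau = best_tau(ts)
print(tau)  # → 7.0, the middle entry of the sorted list, whatever the outlier 50 is
```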

In the two-dimensional setting you can argue as follows: assume that a unit vector $u=(\cos\phi,\sin\phi)$ is given, and you want the best line orthogonal to $u$. You then compute all scalar products $t_i:=(x_i,y_i)\cdot u=x_i \cos\phi + y_i \sin\phi$ and determine the $\tau$ minimizing $\sum_{i=1}^n |t_i-\tau|$, as above. The optimal line orthogonal to $u$ then has the equation $x\cos\phi +y\sin\phi=\tau$, and the distance sum for this $u$, resp. $\phi$, is $f(\phi)=\sum_{i=1}^n|t_i-\tau|$.

In this way your problem is reduced to studying the one-variable function $\phi\mapsto f(\phi)$ on $0\leq\phi<\pi$.
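A sketch of this reduction, using a simple grid search over $\phi$ (the grid resolution is an arbitrary choice; the data are the example from the first answer, where the optimum was $y=2.25x+1.75$ with distance sum $\approx 2.03069$):

```python
import numpy as np

xs = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
ys = np.array([4.0, 8.0, 11.0, 10.0, 13.0])

def f(phi):
    """Distance sum for the best line orthogonal to u = (cos phi, sin phi)."""
    t = xs * np.cos(phi) + ys * np.sin(phi)  # scalar products t_i = P_i . u
    return np.sum(np.abs(t - np.median(t)))  # the median minimizes sum |t_i - tau|

# Coarse grid search over [0, pi); f is piecewise smooth, so a finer
# grid or local refinement would pin the minimum down further.
phi = min(np.linspace(0.0, np.pi, 20001), key=f)
tau = np.median(xs * np.cos(phi) + ys * np.sin(phi))
print(phi, tau, f(phi))  # optimal line: x*cos(phi) + y*sin(phi) = tau
```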


The difficulty of your problem is that you are minimizing a sum of absolute values. If the problem were "just" to minimize a sum of squared values, it would be much easier, with a closed-form expression.

The problem in your form requires, however, an algorithm. The good news is that the reformulation below is convex, so such an algorithm can find the exact solution. I can think of two approaches to solve your problem:

  • Plug your problem as it is into a solver such as CVX to obtain the exact solution.
  • Rewrite your problem as a linear program, as stated below, and solve that instead. This approach is attractive since such problems can be solved efficiently. The linear programming formulation is $$ \min_{m,b,t_i,\; i=1:n} \sum_{i=1}^{n} t_i, \quad {\rm subject~ to}~~ \\t_i\geq 0~ \forall~i \\ -t_i \leq mx_i-y_i+b \leq t_i ~ \forall~i $$
    Here, the absolute value has been "replaced" by introducing the variables $t_i$. Note that this minimizes the unnormalized sum $\sum_i|mx_i-y_i+b|$; the factor $1/\sqrt{m^2+1}$ is not linear in $m$ and must be handled separately (for instance by an outer one-dimensional search over $m$).
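A sketch of the LP approach with `scipy.optimize.linprog` (an assumed solver choice; CVX would work too), on the example data from the first answer. Note that the LP objective is $\sum_i|mx_i-y_i+b|$, without the $1/\sqrt{m^2+1}$ normalization:

```python
import numpy as np
from scipy.optimize import linprog

xs = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
ys = np.array([4.0, 8.0, 11.0, 10.0, 13.0])
n = len(xs)

# Variables z = (m, b, t_1, ..., t_n); minimize sum(t_i).
c = np.r_[0.0, 0.0, np.ones(n)]
#  m*x_i + b - t_i <= y_i   and   -(m*x_i + b) - t_i <= -y_i
A = np.block([[ xs[:, None],  np.ones((n, 1)), -np.eye(n)],
              [-xs[:, None], -np.ones((n, 1)), -np.eye(n)]])
ub = np.r_[ys, -ys]
res = linprog(c, A_ub=A, b_ub=ub,
              bounds=[(None, None), (None, None)] + [(0, None)] * n)
m, b = res.x[:2]
print(m, b, res.fun)  # res.fun = minimal sum of |m*x_i - y_i + b|  (5.0 here)
```

The optimal $(m,b)$ need not be unique (several lines can achieve the same objective value), so different solvers may return different optimal lines.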