Least Absolute Deviation (LAD) Line Fitting / Regression


I want to implement robust line fitting over a set of $n$ points $(x_i,y_i)$ by means of the Least Absolute Deviation method, which minimizes the sum

$$\sum_{i=1}^n |y_i-a-bx_i|.$$

As described for instance in Numerical Recipes, you can find the optimal slope by finding a root of the function

$$s(b):=\sum_{i=1}^n x_i\text{ sign}(y_i-bx_i-\text{med}_k(y_k-bx_k))$$

where $b$ is the unknown. To find the change of sign, a dichotomic search is recommended (appropriately so). However, the initial search interval as well as the termination accuracy are based on purely statistical arguments (using a $\chi^2$ test).

I was wondering whether there is an analytical way to derive safe bounds for the slope (i.e. values of $b$ such that $s(b)$ is guaranteed positive or negative), as well as a termination criterion (a minimum of $|s(b)|$). (In fact, I don't even know whether the function is monotonic.)


Update:

I have gained some insight into the problem.

If you sort the points on $x$ and set $b$ larger than the slopes of all segments between pairs of points, the $y_k-bx_k$ appear in decreasing order, and the median is at index $n/2$. Then $s(b)$ is the difference between the sum of the $n/2$ largest $x_i$ and the sum of the $n/2$ smallest $x_i$, hence certainly positive. Larger values of $b$ change nothing.

Similarly, setting $b$ smaller than the smallest slope ensures that $s(b)$ is negative, and this answers my first question. This works fine when the $x_i$ are regularly spaced, or nearly so.

However, when some of the $x_i$ are equal or very close, the slopes can take very large values and a dichotomic process based on the full $[b_{min},b_{max}]$ interval might be inefficient. The ordering of the $y_k-bx_k$ only changes when $b$ crosses one of the $n(n-1)/2$ pairwise slopes. Hence, the dichotomic process can be based on the (sorted) slopes rather than on the continuous values of $b$, as in the sketch below.
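For illustration, here is a minimal MATLAB sketch of this slope-based dichotomic search. It assumes $x$ and $y$ are column vectors (and R2016b+ implicit expansion); the names LadSlopeBisection and SlopeSignSum are my own, not standard:

```matlab
function b = LadSlopeBisection(x, y)
    % Collect all pairwise slopes: the ordering of y - b*x (and hence
    % the value of s(b)) can only change when b crosses one of them.
    dX = x - x.';                      % dX(i, j) = x(i) - x(j)
    dY = y - y.';
    mask = triu(true(numel(x)), 1) & (dX ~= 0);
    slopes = sort(dY(mask) ./ dX(mask));
    % Bisection on the index of the sorted slopes: s(b) is negative
    % below the crossing and positive above it.
    lo = 1; hi = numel(slopes);
    while hi - lo > 1
        mid = floor((lo + hi) / 2);
        if SlopeSignSum(x, y, slopes(mid)) > 0
            hi = mid;
        else
            lo = mid;
        end
    end
    b = slopes(hi);
end

function s = SlopeSignSum(x, y, b)
    % s(b) = sum_i x_i * sign(y_i - b*x_i - median_k(y_k - b*x_k))
    r = y - b * x;
    s = sum(x .* sign(r - median(r)));
end
```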

Answer:

Let's formulate the Sum of Absolute Residuals / Least Absolute Deviation (LAD) / $ {L}_{1} $ Regression / Robust Regression problem:

$$ \arg \min_{x} {\left\| A x - b \right\|}_{1} $$

There are a few options to solve this.

Iteratively Reweighted Least Squares (IRLS)

The main trick in the IRLS approach is that:

$$ {\left\| \boldsymbol{x} \right\|}_{p}^{p} = \sum_{i} {\left| {x}_{i} \right|}^{p - 2} {\left| {x}_{i} \right|}^{2} = \sum_{i} {w}_{i} {\left| {x}_{i} \right|}^{2} $$

On the other hand, it is known from Weighted Least Squares that:

$$ \sum_{i = 1}^{m} {w}_{i} {\left| {A}_{i, :} x - {b}_{i} \right|}^{2} = {\left\| {W}^{\frac{1}{2}} \left( A x - b \right) \right\|}_{2}^{2} $$

The solution is given by:

$$ \arg \min_{x} {\left\| {W}^{\frac{1}{2}} \left( A x - b \right) \right\|}_{2}^{2} = {\left( {A}^{T} W A \right)}^{-1} {A}^{T} W b $$

In the case above, where $ p = 1 $, one can set $ {w}_{i} = {\left| {A}_{i, :} x - {b}_{i} \right|}^{-1} $.
Since the weights depend on $ x $, one can set up an iterative procedure:

$$ {x}^{k + 1} = {\left( {A}^{T} {W}^{k} A \right)}^{-1} {A}^{T} {W}^{k} b $$

Where $ {W}_{ii}^{k} = {w}_{i}^{k} = {\left| {A}_{i, :} {x}^{k} - {b}_{i} \right|}^{p - 2} = {\left| {A}_{i, :} {x}^{k} - {b}_{i} \right|}^{-1} $.
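For concreteness, a minimal MATLAB sketch of this iteration follows; the function name IrlsL1, the least squares initialization, and the clamping constant epsVal (which guards against dividing by a zero residual) are my own choices:

```matlab
function x = IrlsL1(A, b, numIterations, epsVal)
    x = A \ b;                                   % Least squares initialization
    for k = 1:numIterations
        r = A * x - b;
        w = 1 ./ max(abs(r), epsVal);            % w_i = |A_i x - b_i|^(-1), clamped
        x = (A' * (w .* A)) \ (A' * (w .* b));   % Weighted normal equations
    end
end
```

The clamp plays the role of the numerical safeguard mentioned in the remark at the end.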

Subgradient Method

A subgradient is given by:

$$ \frac{\partial {\left\| A x - b \right\|}_{1}}{\partial x} = {A}^{T} \operatorname{sgn} \left( A x - b \right) $$

So the solution is obtained by iterating:

$$ {x}^{k + 1} = {x}^{k} - {\alpha}_{k} \left[ {A}^{T} \operatorname{sgn} \left( A {x}^{k} - b \right) \right] $$

where $ {\alpha}_{k} > 0 $ is the step size (a diminishing step such as $ {\alpha}_{k} = \frac{\alpha}{k} $ is a common choice for convergence).
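A minimal MATLAB sketch of this iteration, with a least squares warm start and the diminishing step $ \frac{\alpha}{k} $ as assumed choices:

```matlab
function x = SubGradL1(A, b, alpha, numIterations)
    x = A \ b;                            % Warm start from least squares
    for k = 1:numIterations
        g = A' * sign(A * x - b);         % A subgradient of ||A*x - b||_1
        x = x - (alpha / k) * g;          % Diminishing step size
    end
end
```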

Linear Programming Conversion

This approach converts the problem into a Linear Programming problem.

By introducing variables $ {t}_{i} \geq \left| {A}_{i, :} x - {b}_{i} \right| $ (which the minimization drives to equality), the optimization problem becomes:

$$\begin{align*} \text{minimize} \quad & \boldsymbol{1}^{T} \boldsymbol{t} \\ \text{subject to} \quad & -\boldsymbol{t} \preceq A x -b \preceq \boldsymbol{t} \\ & \boldsymbol{0} \preceq \boldsymbol{t} \end{align*}$$

Which is a Linear Programming form of the problem.
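A sketch of this LP form using MATLAB's linprog (Optimization Toolbox), stacking the variables as $ z = [x; t] $:

```matlab
% Solve min 1'*t  s.t.  -t <= A*x - b <= t,  0 <= t,  via z = [x; t].
[m, n] = size(A);
f        = [zeros(n, 1); ones(m, 1)];   % Objective: sum of t
AIneq    = [ A, -eye(m);                %  A*x - t <=  b
            -A, -eye(m)];               % -A*x - t <= -b
bIneq    = [b; -b];
lowerBnd = [-inf(n, 1); zeros(m, 1)];   % x free, t >= 0
z = linprog(f, AIneq, bIneq, [], [], lowerBnd);
x = z(1:n);
```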

MATLAB Code

I implemented and validated the approaches by comparing them to the CVX solution. The MATLAB code is available at my Mathematics StackExchange Question 2603548 Repository.

Remark: Pay attention that the MATLAB code is a vanilla implementation. Real-world implementations should take numerical issues into account (like the inverse of a small number).