The Sub Gradient of One Dimensional Least Absolute Difference (LAD)

242 Views Asked by At

I am trying to solve the following one-dimensional least absolute difference (LAD) optimization problem

1D LAD

and I am using bisection method to find the best beta (a scalar). I have the following code:

import numpy as np
x = np.random.randn(10)
y = np.sign(np.random.randn(10))  + np.random.randn(10)

def find_best(x, y):
    best_obj = 1000000
    for beta in np.linspace(-100,100,1000000):
        if np.abs(beta*x - y).sum() < best_obj:
            best_obj = np.abs(beta*x - y).sum()
            best_beta = beta
    return best_obj, best_beta

def solve_bisect(x, y):
    it = 0
    u = 100
    l = -100
    while True:
        it +=1

        if it > 40:
            # maxIter reached
            return obj, beta, subgrad

        # select the mid-point
        beta = (l + u)/2

        # subgrad calculation. \partial |x*beta - y| = sign(x*beta - y) * x. np.abs(x * beta -y) > 1e-5 is to avoid numerical issue
        subgrad = (np.sign(x * beta - y) * (np.abs(x * beta - y) > 1e-5) * x).sum()
        obj = np.sum(np.abs(x * beta - y))
        print 'obj = %f, subgrad = %f, current beta = %f' % (obj, subgrad, beta)

        # bisect. check subgrad to decide which part of the space to cut out
        if np.abs(subgrad) <1e-3:
            return obj, beta, subgrad
        elif subgrad > 0:
            u = beta + 1e-3
        else:
            l = beta - 1e-3
brute_sol = find_best(x,y)
bisect_sol = solve_bisect(x,y)
print 'brute_sol: obj = %f, beta = %f' % (brute_sol[0], brute_sol[1])                                                                                                                                         
print 'bisect_sol: obj = %f, beta = %f, subgrad = %f' % (bisect_sol[0], bisect_sol[1], bisect_sol[2])

As you can see I also have a brute-force implementation that searches over the space to get the oracle answer (up to some numerical error). Every run can find the optimal best and minumum objective value. However, the subgrad is not 0 (not even close). For instance, one of my run I got the following:

brute_sol: obj = 10.974381, beta = -0.440700
bisect_sol: obj = 10.974374, beta = -0.440709, subgrad = 0.840753

Objective value and best are correct, but subgrad is not close to 0 at all. So the question:

  1. why the subgrad is not close to 0 ? Isn't the optimal condition being 0 is in the subdifferential if and only if it's optimal?
  2. What stopping criterion should we use instead?
1

There are 1 best solutions below

0
On
  1. I think it might be because the subdifferential of the absolute value $f(x)=|x|$ is a slightly offset Heaviside function $\partial_p f(x) = 2H_0(x)-1$. So it might be numerically difficult to get exactly zero. (See also here).

  2. You should look for a large jump and sign change in the subgradient instead, I think.