I debated whether to put this in SO or Mathematics but the problem I suspect is more a mathematical question rather than a programming one.
I have a simple objective function:
$$f(x) = |x_0| + |x_1| $$
I have defined the respective partial derivative at $x_i=0$ to be zero, as in the following code:
def f(x):
return abs(x[0]) + abs(x[1])
def jac(x):
# consider x[0]: if +'ve deriv is +'ve, if -'ve deriv is -'ve, zero otherwise...
d_0 = -1 * (x[0] < 0) + 1 * (x[0] > 0)
# consider x[1]: same deal..
d_1 = -1 * (x[1] < 0) + 1 * (x[1] > 0)
return np.array([d_0, d_1])
I have implemented the BFGS algorithm to solve this problem:
from scipy.optimize import minimize
soln = minimize(fun=f, x0=np.array([5, 3]), method='bfgs', jac=jac)
The result I obtain is:
status: 2
success: False
njev: 38
nfev: 46
hess_inv: array([[ 1.22222222, 1.88888889],
[ 1.88888889, 7.13406795]])
fun: 1.3333333333333333
x: array([ 1.22222222, -0.11111111])
message: 'Desired error not necessarily achieved due to precision loss.'
jac: array([ 1, -1])
The reason it fails is due to the line search not returning a stepsize $\alpha$ through Wolfe condition search.
But I believe that a stepsize should exist that satisfies the Wolfe conditions.
Please correct me if I am wrong but my understanding is that $\alpha$ satisfies the Wolfe conditions if the following are true:
i) $f(x_k+\alpha_k p_k) \leq f(x_k) + c_1 \alpha_k \nabla f(x_k) \cdot p_k$
ii) $-\nabla f(x_k+\alpha_k p_k) \cdot p_k \leq -c_2 \nabla f(x_k) \cdot p_k$
At the point of failure of the scipy algorithm the following is calculated:
$x_k = [1.222, -0.111]$
$f(x_k) = 1.333$
$p_k = [0.666, 5.245]$
$\nabla f(x_k)=[1,-1]$
$\nabla f(x_k) \cdot p_k = -4.579$
So if I use $\alpha =0.022$ (values close to which the algorithm does examine) I calculate:
$x_{k+1} = [1.237, 0.004]$
$f(x_{k+1}) = 1.241$
$\nabla f(x_{k+1}) = [1,1]$
$\nabla f(x_{k+1}) \cdot p_k = 5.911$
Thus the Wolfe conditions are satisfied:
i) $1.241 \leq 1.333 + 0.0001 * 0.022 * -4.579$
ii) $-5.911 \leq - 0.9000 * -4.579$
So what is my misunderstanding or what is minpack2.dcsrch doing during the Wolfe search that I don't understand.