I've been researching line search algorithms recently and there's one detail that is not entirely clear to me.
Is there any (practical) difference between Wolfe and Wolfe-Powell stopping conditions?
The mathematical formulation and the constants differ slightly.
The strong Wolfe conditions as described in Nocedal & Wright are basically:
Sufficient decrease (Armijo rule):
$f({\mathbf {x}}_{k}+\alpha _{k}{\mathbf {p}}_{k})\leq f({\mathbf {x}}_{k})+c_{1}\alpha _{k}{\mathbf {p}}_{k}^{{{\mathrm T}}}\nabla f({\mathbf {x}}_{k})$ with $c_{1}=10^{{-4}}$
Curvature
${{\big |}{\mathbf {p}}_{k}^{{{\mathrm T}}}\nabla f({\mathbf {x}}_{k}+\alpha _{k}{\mathbf {p}}_{k}){\big |}\leq c_{2}{\big |}{\mathbf {p}}_{k}^{{{\mathrm T}}}\nabla f({\mathbf {x}}_{k}){\big |}}$ with $c_{2}=0.1$ for nonlinear CG methods
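To make the two conditions above concrete, here is a minimal sketch of how they could be checked for a given step length. The function name `strong_wolfe`, the quadratic test function, and the starting point are illustrative assumptions, not something taken from Nocedal & Wright.

```python
import numpy as np

def strong_wolfe(f, grad, x, p, alpha, c1=1e-4, c2=0.1):
    """Check the strong Wolfe conditions for step length alpha (hypothetical helper)."""
    slope0 = p @ grad(x)                  # p_k^T grad f(x_k), directional derivative at alpha = 0
    x_new = x + alpha * p
    # Sufficient decrease (Armijo rule)
    sufficient_decrease = f(x_new) <= f(x) + c1 * alpha * slope0
    # Curvature condition, absolute values on both sides
    curvature = abs(p @ grad(x_new)) <= c2 * abs(slope0)
    return bool(sufficient_decrease and curvature)

# Illustrative test problem (an assumption): f(x) = 0.5 * ||x||^2
f = lambda x: 0.5 * (x @ x)
grad = lambda x: x

x0 = np.array([1.0, 2.0])
p0 = -grad(x0)                            # steepest descent direction

print(strong_wolfe(f, grad, x0, p0, 1.0))   # True: alpha = 1 is the exact minimizer along p0
print(strong_wolfe(f, grad, x0, p0, 0.1))   # False: the slope is still too steep
```

With $c_{2}=0.1$ the curvature test is fairly strict, so on this example a short step such as $\alpha=0.1$ passes the sufficient decrease test but is rejected by the curvature test.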
In other texts, such as this one (§2.3), the strong Wolfe-Powell conditions are defined as:
Sufficient decrease (Armijo rule):
$f({\mathbf {x}}_{k}+\alpha _{k}{\mathbf {p}}_{k})\leq f({\mathbf {x}}_{k})+\sigma\alpha _{k}{\mathbf {p}}_{k}^{{{\mathrm T}}}\nabla f({\mathbf {x}}_{k})$ with $\sigma\in(0,0.5)$
Curvature
${{\big |}{\mathbf {p}}_{k}^{{{\mathrm T}}}\nabla f({\mathbf {x}}_{k}+\alpha _{k}{\mathbf {p}}_{k}){\big |}\leq -\rho {\mathbf {p}}_{k}^{{{\mathrm T}}}\nabla f({\mathbf {x}}_{k})}$ with $\rho\in [\sigma, 1) $
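For comparison, a minimal sketch of this second formulation, written literally with $-\rho\,{\mathbf p}_k^{\mathrm T}\nabla f({\mathbf x}_k)$ on the right-hand side. The name `strong_wolfe_powell`, the default values of `sigma` and `rho`, and the quadratic test function are illustrative assumptions, not taken from the cited text.

```python
import numpy as np

def strong_wolfe_powell(f, grad, x, p, alpha, sigma=1e-4, rho=0.1):
    """Check the strong Wolfe-Powell conditions for step length alpha (hypothetical helper)."""
    slope0 = p @ grad(x)                  # p_k^T grad f(x_k)
    x_new = x + alpha * p
    # Sufficient decrease (Armijo rule) with constant sigma
    sufficient_decrease = f(x_new) <= f(x) + sigma * alpha * slope0
    # Curvature condition: right-hand side is -rho * slope0, with no absolute value
    curvature = abs(p @ grad(x_new)) <= -rho * slope0
    return bool(sufficient_decrease and curvature)

# Illustrative test problem (an assumption): f(x) = 0.5 * ||x||^2
f = lambda x: 0.5 * (x @ x)
grad = lambda x: x

x0 = np.array([1.0, 2.0])
p0 = -grad(x0)                            # descent direction, so p0 @ grad(x0) < 0

slope0 = p0 @ grad(x0)
# For a descent direction the two right-hand sides coincide numerically
print(np.isclose(-0.1 * slope0, 0.1 * abs(slope0)))   # True
print(strong_wolfe_powell(f, grad, x0, p0, 1.0))      # True
```

The sketch shows that whenever ${\mathbf p}_k$ is a descent direction, i.e. ${\mathbf p}_k^{\mathrm T}\nabla f({\mathbf x}_k)<0$, the right-hand side $-\rho\,{\mathbf p}_k^{\mathrm T}\nabla f({\mathbf x}_k)$ equals $\rho\,|{\mathbf p}_k^{\mathrm T}\nabla f({\mathbf x}_k)|$, so the two curvature tests differ only in the name and the admissible range of the constant.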
I've read that the Wolfe-Powell conditions are supposedly preferable for conjugate gradient methods that compute the direction ${\mathbf {p}}_{k}$ with the Polak-Ribière formula.
Is this actually the case? I couldn't find anything on this specific topic. Any insight is much appreciated!
2026-03-29 21:34:20