Is Frank-Wolfe a descent algorithm?

A colleague was explaining to me that the Frank-Wolfe algorithm is a descent algorithm (i.e., its objective value decreases monotonically at each iteration). However, when I simulated it, my curve does not decrease monotonically, although it does converge. It's possible I'm just a bad coder, but can someone point me to a proof that shows Frank-Wolfe is a descent method?
Edit: I found an error in my own proof. Please ignore this.

Never mind, I managed to prove it. Since I had some difficulty finding the proof, I'll just post the answer here.
The Frank-Wolfe algorithm solves
$\min f(x)$ subject to $x \in D$
using the iteration
$x^+ = (1-\gamma) x + \gamma s^*$
for some $\gamma \in (0,1)$, and $s^* = \arg\min_{s \in D} \langle s, \nabla f(x) \rangle$.
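For concreteness, here is a minimal sketch of that iteration in Python (the names `grad_f` and `lmo` are just placeholders I made up for the gradient and the linear minimization oracle over $D$; the step size uses the common rule $\gamma_k = 2/(k+2)$, though any $\gamma \in (0,1)$ fits the setup above):

```python
import numpy as np

def frank_wolfe(grad_f, lmo, x0, n_iters=100):
    """Frank-Wolfe sketch with the open-loop step size gamma_k = 2/(k+2).

    grad_f: callable returning the gradient of f at x
    lmo:    linear minimization oracle, returns argmin_{s in D} <s, g>
    """
    x = np.asarray(x0, dtype=float)
    for k in range(n_iters):
        s = lmo(grad_f(x))            # solve the linear subproblem over D
        gamma = 2.0 / (k + 2.0)       # diminishing step size
        x = (1.0 - gamma) * x + gamma * s
    return x
```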
From convexity we have
$f(x^+) \leq (1-\gamma) f(x) + \gamma f(s^*)$.
Since $s^*$ is the solution of a convex optimization problem, we have
$\nabla f(s)^T(x-s)\geq 0, \forall x\in D$.
[Edit: the optimality condition should be $\nabla f(x)^T(x-s)\geq 0, \forall x\in D$, not what's written. So this proof is incorrect.]
Finally, by the first-order condition for convexity, we have $f(s^*) - f(x) \leq \nabla f(s^*)^T(s^*-x)$.
Mix and match: taking $x$ as the test point in the inequality above gives $f(s^*) - f(x) \leq \nabla f(s^*)^T(s^*-x) \leq 0$, so $f(x^+) \leq (1-\gamma) f(x) + \gamma f(s^*) \leq f(x)$, i.e. $f(x^+) - f(x) \leq 0$.
I think it depends on how you choose $\gamma$. If you choose it using line search, then it is (see this proof); otherwise, I think in general it is not. Consider the case $f(x) = x^2$ with $D = [-1,1]$. Starting from $x_0=1/2$ and using the rule $\gamma_k = \frac{2}{k+2}$ (from the Wikipedia page you linked), we have:
$$ \begin{array}{rcccc} k: & 0 & 1 & 2 & \dots \\ x_k: & 1/2 & -1 & 1/3 & \dots \\ f(x_k): & \mathbf{1/4} & \mathbf{1} & \mathbf{1/9} & \dots \\ s\,\nabla f(x_k): & s & -2 \, s & 2/3 \, s & \dots \\ s_k: & -1 & 1 & -1 & \dots \\ \gamma_k: & 1 & 2/3 & 1/2 & \dots \\ \end{array} $$
As you can see, the value of the cost function starts at $1/4$, then goes up to $1$ before decreasing to $1/9$.
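A quick script (just a hand-rolled check of the table above, nothing more) reproduces the non-monotone behavior:

```python
# f(x) = x^2 on D = [-1, 1], x0 = 1/2, gamma_k = 2/(k+2)
x = 0.5
for k in range(4):
    grad = 2.0 * x                    # f'(x) = 2x
    s = -1.0 if grad > 0 else 1.0     # argmin_{s in [-1,1]} s * grad
    gamma = 2.0 / (k + 2.0)
    print(f"k={k}: x_k={x:+.4f}  f(x_k)={x**2:.4f}  s_k={s:+.0f}  gamma_k={gamma:.3f}")
    x = (1.0 - gamma) * x + gamma * s
# prints f(x_k) = 0.2500, 1.0000, 0.1111, ... : up, then back down
```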
Your proof seems to suggest that the algorithm is a descent algorithm for an arbitrary choice of $\gamma$. I think there is an error in the step "Since $s^*$ is the solution of a convex optimization problem, we have $\nabla f(s)^T(x-s)\geq 0, \forall x\in D$."
In fact, this would hold if the function being minimized were $f(s)$, whereas $s^*$ is actually the solution of the tangent problem "$\operatorname{minimize} \nabla f(x_k)^T s$".
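Writing out the optimality condition for that tangent subproblem makes this explicit: since $s^*$ minimizes the linear function $s \mapsto \nabla f(x_k)^T s$ over $D$,
$$\nabla f(x_k)^T (x - s^*) \geq 0 \quad \forall x \in D,$$
which is exactly the corrected condition noted in the edit to the other answer; it involves $\nabla f(x_k)$, not $\nabla f(s^*)$.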