What is an inverse of a gradient? (doubts about mirror descent algorithm)

1.6k Views Asked by At

I'm trying to compute mirror descent algorithm (MDA) now, and I have a little complication in computing its update rule.

I would like to minimize something in an $l^1$-ball and for this, I took $\Phi (x) = \frac{1}{2(p-1)} ||x||_{p}^2$.

By some algebraic procedure, I've obtained each component of gradient $\nabla \Phi(x)_i = \frac{\partial \Phi}{\partial x_i} = \frac{1}{p-1} ||x||_{p}^{2-p} |x_i|^{p-2} x_i$.

My problem is the following:

Consider the following update rule of mirror descent algorithm:

$x_{t+1} = (\nabla \Phi)^{-1} (\nabla \Phi (x^t) - a_t \nabla f (x_t)$

Given $\nabla \Phi$ calculated previously, how can I obtain $\nabla \Phi^{-1}$ of mentioned update rule? I know that $\nabla \Phi$ takes and element of $\mathbb{R}^d$ and returns an vector again. So, what would be a inverse operator $(\nabla \Phi)^{-1}$?

I think that it should have some relation with some operator that does $\nabla \Phi ^{-1} (\nabla \Phi (x))= x$.

In summary, I would like to know what exactly is $\nabla \Phi ^{-1}$ which appears in mirror descent algorithm's update rule.

1

There are 1 best solutions below

3
On BEST ANSWER

The inverse operator for the gradient is the gradient of the convex conjugate ($\nabla \Phi^\star \equiv (\nabla \Phi)^{-1}$).

The convex conjugate is given by

$$\Phi^\star(y) = \sup_{x}\, (\langle x,y\rangle - \Phi(x)) $$

if you can come up with an explicit form of the convex conjugate in your case then taking the gradient has $\nabla \Phi^\star \equiv (\nabla \Phi)^{-1}$. (I'm shoving some technicalities under the rug, cf. Rockafellar, Convex Analysis for all the nice theory).

Indeed, let's say the sup is achieved at $x^\star$ so that $\Phi^\star(y)=\langle x^\star,y\rangle - \Phi(x^\star)$ then the first order optimality condition gives

$$ y = \nabla \Phi(x^\star) $$

showing that $(\nabla \Phi)^{-1}(y) = x^\star=\nabla \Phi^\star(y)$.

So another way to frame the problem is that you need to find the argument:

$$ (\nabla\Phi)^{-1}(y) = \arg\sup_x \,(\langle x,y\rangle - \Phi(x)) .$$

I'm not sure if in your case you can find an explicit form for this (though I haven't tried hard). However in a numerical setting you could solve this approximately which would lead to a two-step mechanism not unlike primal-dual solvers.