How to get the following concave surrogate function for this function of a vector?


I have read in a paper that for the following function $$f(\textbf{x})=\frac{1}{a\|\textbf{x}\|^2+b}$$ we can use the following concave surrogate function $$g(\textbf{x}|\textbf{x}_0)=-\frac{a\|\textbf{x}\|^2}{b^2}+2a\left(\frac{1}{b^2}-\frac{1}{(a\|\textbf{x}_0\|^2+b)^2}\right)\textbf{x}^T\textbf{x}_0+\frac{1}{(a\|\textbf{x}_0\|^2+b)}+\frac{2a\|\textbf{x}_0\|^2}{(a\|\textbf{x}_0\|^2+b)^2}-\frac{a\|\textbf{x}_0\|^2}{b^2}$$ where $a$ and $b$ are both positive constants. I would be very thankful if somebody could provide a method for obtaining this concave surrogate function. Any help in this regard will be much appreciated. Thanks in advance.

Best Answer

A good reference for this is Kenneth Lange's book, MM Optimization Algorithms, SIAM, 2016.

The basic idea here is that we want a surrogate function $g(x|x_{0})$ with the properties:

  1. $f(x_{0})=g(x_{0} |x_{0})$

  2. $\nabla f(x_{0})= \nabla g(x_{0}|x_{0})$

  3. $g(x|x_{0}) \leq f(x)$ for all $x$

  4. $g(x|x_{0})$ is separable (to make maximizing $g$ easy!)

Proof.

  1. It's easy to see that the candidate function given in the problem statement has $g(x_{0}|x_{0})=f(x_{0})$.

  2. Since

$\nabla f(x)=\frac{-2a}{(a\|x\|^{2}+b)^{2}}x$,

we have

$\nabla f(x_{0})=\frac{-2a}{(a\|x_{0}\|^{2}+b)^{2}}x_{0}$

and it is easy to check that $\nabla g(x_{0}|x_{0})=\nabla f(x_{0})$.
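The gradient identities above can be sanity-checked numerically. The following is a sketch with arbitrary constants $a=2$, $b=3$ (my choice, not from the paper), comparing the analytic gradient of $f$ against central finite differences and confirming that $\nabla g(x_{0}|x_{0})=\nabla f(x_{0})$:

```python
import numpy as np

a, b = 2.0, 3.0  # arbitrary positive constants (assumption for illustration)
n = 4

def f(x):
    return 1.0 / (a * np.dot(x, x) + b)

def grad_f(x):
    return -2.0 * a / (a * np.dot(x, x) + b) ** 2 * x

rng = np.random.default_rng(0)
x0 = rng.standard_normal(n)

# Central finite differences confirm the analytic gradient of f
eps = 1e-6
fd = np.array([(f(x0 + eps * e) - f(x0 - eps * e)) / (2 * eps)
               for e in np.eye(n)])
assert np.allclose(fd, grad_f(x0), atol=1e-6)

# grad g(x|x0) = -(2a/b^2) x + c with c chosen so gradients match at x0
c = grad_f(x0) + 2.0 * a / b**2 * x0
grad_g_at_x0 = -2.0 * a / b**2 * x0 + c
assert np.allclose(grad_g_at_x0, grad_f(x0))
```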

  3. Since $g(x|x_{0})$ matches both $f(x_{0})$ and $\nabla f(x_{0})$, we can expand in a quadratic Taylor series around $x_{0}$. The constant and linear terms will match, and the quadratic term in $g(x|x_{0})$ will be smaller than the quadratic term in the Taylor series for $f(x)$ if

$\nabla^{2} g(x|x_{0}) \preceq \nabla^{2} f(x)$ for all $x$.

It's easy to show that

$\nabla^{2} f(x)=\frac{-2a}{(a\| x \|^{2}+b)^{2}}I + \frac{8a^{2}}{(a\|x \|^{2}+b)^{3}}xx^{T}$

while

$\nabla^{2} g(x|x_{0})=\frac{-2a}{b^{2}}I$.

Since $xx^{T}$ is positive semidefinite and $(a\|x\|^{2}+b)^{2} \geq b^{2}$, every eigenvalue of $\nabla^{2} f(x)$ is at least $-2a/b^{2}$. Thus $\nabla^{2} g(x|x_{0}) \preceq \nabla^{2} f(x)$ for all $x$.
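This Loewner ordering of the two Hessians can be spot-checked by sampling. A minimal sketch (constants $a=2$, $b=3$ are my arbitrary choice), building both Hessians from the formulas above and verifying that their difference is positive semidefinite:

```python
import numpy as np

a, b = 2.0, 3.0  # arbitrary positive constants (assumption for illustration)
n = 4
rng = np.random.default_rng(1)

# Hessian of g is constant in x: -(2a/b^2) I
hess_g = -2.0 * a / b**2 * np.eye(n)

min_eig = np.inf
for _ in range(100):
    x = 3.0 * rng.standard_normal(n)
    q = a * np.dot(x, x) + b
    hess_f = (-2.0 * a / q**2 * np.eye(n)
              + 8.0 * a**2 / q**3 * np.outer(x, x))
    # hess_f - hess_g should be positive semidefinite at every x
    min_eig = min(min_eig, np.linalg.eigvalsh(hess_f - hess_g).min())

assert min_eig >= -1e-10
```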

We can write a Taylor series with remainder term for $f(x)-g(x|x_{0})$ centered at $x_{0}$ as

$f(x)-g(x|x_{0})= (f(x_{0})-g(x_{0}|x_{0})) + (\nabla f(x_{0}) - \nabla g(x_{0}|x_{0}))^{T}(x-x_{0})+ \frac{1}{2} (x-x_{0})^{T} \left( \nabla^{2}f(\xi)-\nabla^{2}g(\xi |x_{0}) \right) (x-x_{0})$

for some $\xi$ on the line segment between $x$ and $x_{0}$.

By properties 1 and 2, the first two terms vanish:

$f(x)-g(x|x_{0})=0 + 0 + \frac{1}{2} (x-x_{0})^{T} \left( \nabla^{2}f(\xi)-\nabla^{2}g(\xi |x_{0}) \right) (x-x_{0})$.

Since $\nabla^{2} f(\xi)- \nabla^{2} g(\xi | x_{0})$ is a positive semidefinite matrix,

$\frac{1}{2} (x-x_{0})^{T} \left( \nabla^{2}f(\xi)-\nabla^{2}g(\xi |x_{0}) \right) (x-x_{0}) \geq 0$

and

$f(x)-g(x|x_{0}) \geq 0$ for all $x$.
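The tangency and minorization properties just established can also be verified directly against the closed-form $g$ quoted in the question. A sketch, again with arbitrary constants $a=2$, $b=3$:

```python
import numpy as np

a, b = 2.0, 3.0  # arbitrary positive constants (assumption for illustration)
n = 4
rng = np.random.default_rng(2)

def f(x):
    return 1.0 / (a * np.dot(x, x) + b)

def g(x, x0):
    # Closed-form surrogate quoted in the question
    q0 = a * np.dot(x0, x0) + b
    return (-a * np.dot(x, x) / b**2
            + 2 * a * (1 / b**2 - 1 / q0**2) * np.dot(x, x0)
            + 1 / q0
            + 2 * a * np.dot(x0, x0) / q0**2
            - a * np.dot(x0, x0) / b**2)

x0 = rng.standard_normal(n)
assert np.isclose(g(x0, x0), f(x0))        # property 1: g touches f at x0
for _ in range(1000):
    x = 5.0 * rng.standard_normal(n)
    assert g(x, x0) <= f(x) + 1e-12        # property 3: g minorizes f
```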

  4. $g(x|x_{0})$ is clearly separable.

This general approach to finding a separable concave underestimating surrogate is used frequently in EM and MM methods.
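To illustrate the MM idea on this example: each iteration maximizes the concave quadratic surrogate in closed form by solving $\nabla g(x|x_{k}) = -(2a/b^{2})x + c = 0$. A sketch with arbitrary constants (note that for this particular $f$ the global maximizer is trivially $x=0$, so the iteration should contract toward the origin):

```python
import numpy as np

a, b = 2.0, 3.0  # arbitrary positive constants (assumption for illustration)

def grad_f(x):
    return -2.0 * a / (a * np.dot(x, x) + b) ** 2 * x

# MM ascent: x_{k+1} = argmax_x g(x | x_k).  Since g is a concave quadratic
# with Hessian -(2a/b^2) I, its maximizer solves -(2a/b^2) x + c = 0.
x = np.array([1.0, -2.0, 0.5])
for _ in range(200):
    c = grad_f(x) + 2.0 * a / b**2 * x   # linear coefficient of g(.|x)
    x = b**2 / (2.0 * a) * c             # closed-form surrogate maximizer

assert np.linalg.norm(x) < 1e-6  # converged to the maximizer x = 0
```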

P.S. I've been asked how you could derive this from scratch. The general process would be:

  1. Find a lower bound $p$ on the eigenvalues of $\nabla^{2}f(x)$. Then we can set the quadratic term in $g(x|x_{0})$ to

$\frac{p}{2} \| x\|^{2}$.

This ensures that

$\nabla^{2} g(x|x_{0})=pI \preceq \nabla^{2}f(x)$.

  2. Let the linear term in $g(x|x_{0})$ be $c^{T}x$. Then

$\nabla g(x|x_{0})=px+c$

Solve for $c$ by setting $\nabla g(x_{0}|x_{0})=\nabla f(x_{0})$ to get

$px_{0} + c=\nabla f(x_{0})$

$c=\nabla f(x_{0})-px_{0}$.

  3. Let the constant term in $g(x|x_{0})$ be $d$. Solve for $d$ by setting

$g(x_{0}|x_{0})=\frac{p}{2}\|x_{0}\|^{2}+c^{T}x_{0}+d=f(x_{0})$.

$d=f(x_{0})- \frac{p}{2}\|x_{0}\|^{2} -c^{T}x_{0}$.
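Putting the three steps together, one can check numerically that this recipe reproduces the surrogate quoted in the question. A sketch with arbitrary constants $a=2$, $b=3$ and a random expansion point:

```python
import numpy as np

a, b = 2.0, 3.0  # arbitrary positive constants (assumption for illustration)
n = 4
rng = np.random.default_rng(3)
x0 = rng.standard_normal(n)
q0 = a * np.dot(x0, x0) + b

def f(x):
    return 1.0 / (a * np.dot(x, x) + b)

def grad_f(x):
    return -2.0 * a / (a * np.dot(x, x) + b) ** 2 * x

# Step 1: p = -2a/b^2 bounds the eigenvalues of the Hessian of f from below.
p = -2.0 * a / b**2
# Step 2: linear coefficient from matching gradients at x0.
c = grad_f(x0) - p * x0
# Step 3: constant term from matching function values at x0.
d = f(x0) - 0.5 * p * np.dot(x0, x0) - np.dot(c, x0)

def g_recipe(x):
    return 0.5 * p * np.dot(x, x) + np.dot(c, x) + d

def g_paper(x):
    # Closed form quoted in the question
    return (-a * np.dot(x, x) / b**2
            + 2 * a * (1 / b**2 - 1 / q0**2) * np.dot(x, x0)
            + 1 / q0
            + 2 * a * np.dot(x0, x0) / q0**2
            - a * np.dot(x0, x0) / b**2)

for _ in range(20):
    x = rng.standard_normal(n)
    assert np.isclose(g_recipe(x), g_paper(x))  # the two forms coincide
```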