Constrained Optimization / derivative on curves equal zero / convergence

55 Views Asked by At

Consider the minimization Problem

$$\min_{v\in\mathbb{R}^d\ :\ ||\boldsymbol{v}||=1} f(v)$$ where $f$ is smooth.

Let's define the sequence $\boldsymbol{v}_n$ by

$$\alpha_n=\operatorname{argmin}_{\alpha\in[0,1]}f(\sqrt{\alpha}\boldsymbol{v}_n+\sqrt{1-\alpha}\boldsymbol{d}_n)$$

$$\boldsymbol{v}_{n+1}=\sqrt{\alpha_n}\boldsymbol{v}_n+\sqrt{1-\alpha_n}\boldsymbol{d}_n$$

where $\boldsymbol{d}_n^T \boldsymbol{v}=0$ and $||\boldsymbol{d}_n||=1$, and $\boldsymbol{d}_n$ is a descent direction.

Suppose $\alpha \xrightarrow{}_{n\to \infty} 1$, $\boldsymbol{v}_n \to \boldsymbol{v}^*$. Can I conclude that $\boldsymbol{v}^*$ is stationary?

1

There are 1 best solutions below

5
On BEST ANSWER

If $d$ is an arbitrary descent direction, no (consider that you can choose $d$ arbitrarily close to orthogonal to $\nabla f$, so that each step is tiny and makes no progress to decrease $f$). But if $d$ is the gradient your approach works fine (you’re essentially performing a line search using the exponential map on the constraint manifold $S^{n-1}.$)

EDIT: Here is a program for how to prove the above formally, supposing $d_n$ is a unit vector with $d_n \cdot \frac{\nabla f}{\|\nabla f\|} > k$ for some bound $k$, where $\nabla f$ is the projection of the gradient of $f$ onto the constraint manifold (i.e. the unit sphere).

Assume, for contradiction, that $v^*$ is not a critical point. Define $g(t) = f\left[\exp_{v_n}(td_n)\right]$ and Taylor-expand it about $v^*$, and use uniform estimates on the magnitudes of the second and higher derivatives of $f$ to derive an expression like $$g(t) < f(v^*) + \nabla f(v^*) \cdot (v_n - v^*) + t\nabla f(v^*) \cdot d_n - M|v_n-v^* + td_n|^2.$$ Use $v_n\to v^*$ to derive a lower bound on $t$ away from zero (depending on $\|\nabla f(v^*)\|$ and $k$) for which it is guaranteed that $g(t) < f(v^*)$. Turn the bound on $t$ into a bound on $\alpha$ away from 1, contradicting $\alpha_n\to 1$.

The details seem very messy, but that's the essence of the argument. Note that being constrained to a sphere is largely a red herring here, since you can do all of the calculus on a local chart. Note also that you probably need regularity assumptions on $f$ beyond smoothness.