Convergence of regularized optimization problems to a constrained optimization problem


I have seen many times the following fact, which seems to be folklore knowledge. Assume that you have an optimization problem of the form $$\min_{x \in \mathcal{X}} F(x) + \lambda G(x)$$ with $F$ and $G$ some strictly convex non-negative functionals such that $\min F = 0$ (the min is reached) and $\lambda > 0$. There exists a unique minimizer to this problem, denoted by $x_\lambda^*$. Then, $x_\lambda^*$ converges as $\lambda \rightarrow 0$ to the unique solution $x^*$ of $$\min_{x \in \mathcal{X}, \ F(x) = 0} G(x).$$ Is this result true, and has it been proven in some general context (e.g. an infinite-dimensional space $\mathcal{X}$) under some reasonable conditions on $F$ and $G$?

A typical example is for $\mathcal{X}= \mathbb{R}^n$, $F(x) = \| A x - y \|^2_2$ with $y \in \mathbb{R}^m$ and $A \in \mathbb{R}^{m \times n}$ and $G(x)= \| x\|^2_2$. This corresponds to the Tikhonov-regularized inverse problem $$\min_{x \in \mathbb{R}^n} \| y - A x \|^2_2 + \lambda \| x \|_2^2.$$
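As a quick numerical illustration of this example (my own sketch, not part of the question): in an underdetermined consistent system, $F$ is convex but not strictly convex, yet the Tikhonov minimizer $x_\lambda = (A^\top A + \lambda I)^{-1} A^\top y$ still converges, as $\lambda \to 0$, to the minimum-norm solution $A^+ y$, i.e. the minimizer of $\|x\|_2^2$ over $\{x : Ax = y\}$:

```python
import numpy as np

# Assumed setup (illustration only): m = 3 < n = 5, so {x : Ax = y} is a
# nontrivial affine set and G(x) = ||x||^2 selects the minimum-norm point.
rng = np.random.default_rng(0)
A = rng.standard_normal((3, 5))
y = rng.standard_normal(3)

x_star = np.linalg.pinv(A) @ y   # minimum-norm solution of Ax = y

for lam in [1e-1, 1e-3, 1e-6]:
    # Closed-form minimizer of ||Ax - y||^2 + lam * ||x||^2
    x_lam = np.linalg.solve(A.T @ A + lam * np.eye(5), A.T @ y)
    print(f"lambda = {lam:g}, ||x_lam - x*|| = {np.linalg.norm(x_lam - x_star):.2e}")
```

The printed distances shrink with $\lambda$, consistent with the conjectured convergence.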

I am curious about a general theory for such problems.

Edit: The answer to my question is that this does not hold in general, in particular for non-closed domains $\mathcal{X}$ (see the simple example in the comments and the detailed one in the answer to this post). However, the result does hold over compact domains under reasonable assumptions (again, see the answer).


Best answer

In addition to my comments, here is a useful counterexample in which $x_{\lambda}^*$ is always defined but $\lim_{\lambda\rightarrow 0^+}x_{\lambda}^* \neq x^*$.

Define the convex set $\mathcal{X}\subseteq\mathbb{R}^2$ by $$\mathcal{X}= \{(0,1/2)\} \cup (0,1]\times [1/2,1]$$ The set $\mathcal{X}$ is the rectangle $[0,1]\times[1/2,1]$ with its left edge removed, except that the single corner point $(0,1/2)$ is kept; it is convex but not closed.

Define the strictly convex functions $F:\mathcal{X}\rightarrow\mathbb{R}$ and $G:\mathcal{X}\rightarrow\mathbb{R}$ by: \begin{align} F(x,y) &= x^2+\frac{x^2}{y} \quad \forall (x,y)\in \mathcal{X}\\ G(x,y) &=(y-1)^2+ (x-1)^2 \quad \forall (x,y)\in \mathcal{X} \end{align}

Observe that $F$ is nonnegative and $F(0,1/2)=0$, so $(0,1/2)$ is the unique minimizer of $F$: indeed $F(x,y)=0$ forces $x=0$, and $(0,1/2)$ is the only element of the domain $\mathcal{X}$ with $x$-coordinate $0$.

For $\lambda \in (0, 1]$, the unique minimizer of $F(x,y)+\lambda G(x,y)$ over all $(x,y) \in \mathcal{X}$ is $$(x_{\lambda}^*,y_{\lambda}^*)=\left( \frac{2\lambda}{4+2\lambda}, 1\right)$$ (for fixed $x>0$ the objective is strictly decreasing in $y$, so $y=1$ is optimal; minimizing $2x^2+\lambda(x-1)^2$ over $x$ then gives the formula, and the resulting value $2\lambda/(2+\lambda)$ is smaller than the value $5\lambda/4$ attained at the isolated point $(0,1/2)$). Then $$ \boxed{\lim_{\lambda\rightarrow 0^+}(x_{\lambda}^*,y_{\lambda}^*) = (0, 1) \neq (0,1/2)}$$
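This can also be checked numerically. The following sketch (my own; the grid resolution is an arbitrary choice) minimizes $F+\lambda G$ by brute force over a fine grid on the rectangle part of $\mathcal{X}$ and compares against the isolated point $(0,1/2)$:

```python
import numpy as np

def F(x, y):
    return x**2 + x**2 / y

def G(x, y):
    return (y - 1)**2 + (x - 1)**2

# Grid over the rectangle part (0, 1] x [1/2, 1] of X.
xs = np.linspace(1e-4, 1.0, 2001)
ys = np.linspace(0.5, 1.0, 1001)
X, Y = np.meshgrid(xs, ys)

for lam in [0.5, 0.05, 0.005]:
    vals = F(X, Y) + lam * G(X, Y)
    i = np.unravel_index(np.argmin(vals), vals.shape)
    best = (X[i], Y[i])
    # The isolated point (0, 1/2) has objective value lam * G(0, 1/2) = 5*lam/4.
    if lam * 5 / 4 < vals[i]:
        best = (0.0, 0.5)
    print(f"lam = {lam:g}, grid minimizer = {best}, "
          f"predicted x* = {2 * lam / (4 + 2 * lam):.5f}")
```

The grid minimizer matches the closed-form $(2\lambda/(4+2\lambda), 1)$ and drifts toward $(0,1)$, not $(0,1/2)$, as $\lambda$ shrinks.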


Strict convexity of $G$ is easy to verify. To show $F$ is strictly convex, let $(x,y)$ and $(a,b)$ be distinct elements of $\mathcal{X}$ and let $p \in (0,1)$. We have two cases:

  1. If $x=a$ then $x\neq 0$ and $y\neq b$ (since $(x,y)$ and $(a,b)$ are distinct, and there is only one point in $\mathcal{X}$ with $x$-coordinate $0$) and so

\begin{align} F(p(x,y)+(1-p)(a,b))&=F(x,py+(1-p)b) \\ &= x^2+\frac{x^2}{py+(1-p)b} \\ &<x^2+x^2\left(p\frac{1}{y}+(1-p)\frac{1}{b}\right)\\ &=pF(x,y)+(1-p)F(x,b)\\ &=pF(x,y)+(1-p)F(a,b) \end{align}

  2. If $x\neq a$ then we can define the convex function $h:\mathcal{X}\rightarrow\mathbb{R}$ by $h(x,y)=x^2/y$ and \begin{align} F(p(x,y)+(1-p)(a,b))&=(px+(1-p)a)^2 + h(p(x,y)+(1-p)(a,b))\\ &\leq (px+(1-p)a)^2 + ph(x,y) + (1-p)h(a,b)\\ &< px^2+(1-p)a^2 + ph(x,y) + (1-p)h(a,b)\\ &= pF(x,y)+(1-p)F(a,b) \end{align} $\Box$
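As a sanity check on the strict convexity of $F$ (my own sketch; the sampling region, pair count, and the distance cutoff guarding against float noise are all assumptions), one can test the strict inequality on random pairs of points in the rectangle part of $\mathcal{X}$:

```python
import numpy as np

rng = np.random.default_rng(1)

def F(p):
    x, y = p
    return x**2 + x**2 / y

def sample():
    # Random point in the rectangle part (0, 1] x [1/2, 1] of X
    return np.array([rng.uniform(1e-6, 1.0), rng.uniform(0.5, 1.0)])

violations = 0
for _ in range(10000):
    a, b = sample(), sample()
    if np.linalg.norm(a - b) < 1e-2:
        continue  # skip near-identical pairs to avoid floating-point noise
    t = rng.uniform(0.01, 0.99)
    # Strict convexity: F(t*a + (1-t)*b) < t*F(a) + (1-t)*F(b)
    if not F(t * a + (1 - t) * b) < t * F(a) + (1 - t) * F(b):
        violations += 1
print("violations:", violations)
```

No violations show up, consistent with the proof above.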

On the positive side, fix $m$ as a positive integer and suppose:

  • $\mathcal{X}\subseteq\mathbb{R}^m$ is a compact set.

  • $F:\mathcal{X}\rightarrow\mathbb{R}$ is any continuous function that has a unique minimizer $x^*\in \mathcal{X}$.

  • $G:\mathcal{X}\rightarrow\mathbb{R}$ is any nonnegative and continuous function.

Claim: Let $x^*\in \mathcal{X}$ be the unique minimizer of $F$. Let $\{s_n\}_{n=1}^{\infty}$ be any sequence of nonnegative numbers that satisfy $\lim_{n\rightarrow\infty} s_n=0$. For each positive integer $n$ let $x_n^*\in \mathcal{X}$ be any minimizer of $F(x) + s_nG(x)$. Then $$ \lim_{n\rightarrow\infty} x_n^*=x^*$$

Proof: Fix $n$ as a positive integer. By definition of $x_n^*$ being a minimizer of $F+s_nG$ we have $$ F(x_n^*)+s_nG(x_n^*)\leq F(x^*)+s_nG(x^*)$$ Since $s_nG(x_n^*)\geq 0$ we have $$ F(x_n^*)\leq F(x^*) + s_nG(x^*) $$ Since $F(x^*)\leq F(x_n^*)$ we have $$ F(x^*)\leq F(x_n^*)\leq F(x^*) + s_nG(x^*) \quad \forall n \in \{1, 2, 3, ...\}$$ Taking $n\rightarrow\infty$ yields $$ \lim_{n\rightarrow\infty} F(x_n^*)=F(x^*) \quad (Eq. *)$$ Suppose $x_n^*$ does not converge to $x^*$ (we will derive a contradiction). Then there is some $\epsilon>0$ for which $||x^*-x_n^*||\geq \epsilon$ infinitely often. So there is an infinite subsequence $\{x_{n[k]}^*\}_{k=1}^{\infty}$ such that $x_{n[k]}^*$ lies in the compact set $\mathcal{X}\cap \{x:||x^*-x||\geq \epsilon\}$ for all $k$. It follows by the Bolzano–Weierstrass theorem that there exists a further subsequence $\{x_{n[k_i]}^*\}_{i=1}^{\infty}$ that converges to a point $y\in \mathcal{X}\cap \{x:||x^*-x||\geq \epsilon\}$. By continuity of $F$ and (Eq. *) we obtain $F(y)=F(x^*)$. Thus, $y$ is also a minimizer of $F$. Since $y$ is at least a distance $\epsilon$ away from $x^*$, we have $y\neq x^*$, which contradicts uniqueness of the minimizer of $F$. $\Box$

For the claim, the functions $F$ and $G$ are not required to be convex. However, the function $F$ satisfies the assumptions of the claim in the special case when $F$ is continuous and strictly convex, since any such function defined over a compact convex domain must have a unique minimizer.
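The claim is easy to see in action; in the following sketch (my own, with $F$, $G$, and the grid chosen purely for illustration) $\mathcal{X}=[0,1]$ is compact, $F$ has the unique minimizer $x^*=1/2$, and $G$ is nonnegative and continuous but deliberately non-convex:

```python
import numpy as np

# Compact domain X = [0, 1]; F has unique minimizer x* = 0.5.
F = lambda x: (x - 0.5)**2
G = lambda x: np.abs(np.sin(7 * x))   # nonnegative, continuous, not convex

xs = np.linspace(0.0, 1.0, 100001)    # fine grid standing in for X
for s in [1.0, 0.1, 0.01, 0.001]:
    # Grid minimizer of F + s_n * G for the current s_n
    x_n = xs[np.argmin(F(xs) + s * G(xs))]
    print(f"s_n = {s:g}, x_n ~= {x_n:.4f}")
```

The grid minimizers approach $x^*=1/2$ as $s_n \to 0$, as the claim predicts, even though $G$ is not convex.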