A minimization problem in a function-fitting setup


Let $\Omega$ be a compact, convex set in $\mathbb{R}^d$ with smooth boundary.

Given data $(x_i,d_i)$ with $x_i \in \Omega$, $d_i \in \mathbb{R}$, $i = 1,2,\dots,N$, where $N>d$ and $\sum\limits_{i=1}^N d_i = 0$. Assume also that there are always $d$ linearly independent vectors among $\{x_i\}$.

Let $A = \int_{\Omega}dx$ denote the volume of $\Omega$.

I want to find a continuous function $f:\Omega \to \mathbb{R}$ such that:

  1. $\int_{\Omega}f(x)dx = 0$ and
  2. $C(f)$ is minimum, where $$C(f) = \frac{A^{1/d}}{N}\bigg(\sum\limits_{i=1}^N |f(x_i)-d_i|^d\bigg)^{1/d} +\|f\|_{L^d}+ A^{1/d} \||\nabla f|\|_{L^d}$$

Does a solution exist? Is it unique, in the sense that any two solutions are equal almost everywhere?


Accepted answer

The quick answer to your question (for $d > 1$) is no, there is no continuous solution. This can be rectified, however, by using a different $L^p$ norm.

Starting with the case of the $L^d$ norm as asked in the question, where $d$ is also the dimension of the Euclidean space, consider the following. For any $x_0\in\Omega$ and real numbers $r_0,A > 0$, define $f\colon\Omega\to\mathbb R$ by $$ f(x) = \begin{cases} 0,&{\rm if\ }r\ge r_0,\\ 1,&{\rm if\ }r\le e^{-A}r_0,\\ A^{-1}\log(r_0/r),&{\rm if\ }e^{-A}r_0 < r < r_0, \end{cases} $$ where $r=\lVert x-x_0\rVert_2$.

The norm $\lVert f\rVert_{L^d}$ is bounded by $cr_0$ for a constant $c$, so it can be made small by taking $r_0$ small. Also, $\nabla f$ has magnitude $1/(Ar)$ for $e^{-A}r_0 < r < r_0$, so (for some other constant $c$), \begin{align} \lVert\nabla f\rVert_{L^d}^d&=c\int_{e^{-A}r_0}^{r_0}(Ar)^{-d}\,r^{d-1}\,dr\\ &=cA^{-d}\left[\log r\right]_{e^{-A}r_0}^{r_0}=cA^{-(d-1)}. \end{align}

So, by taking $A$ large and $r_0$ small, we can make $\lVert f\rVert_{L^d}$ and $\lVert\nabla f\rVert_{L^d}$ as small as we like, while $f(x_0)=1$. By taking linear combinations of such functions (with $x_0$ replaced by the $x_i$ in the question), $C(f)$ can be made arbitrarily small. Clearly, the limit $C(f)=0$ is not attained by a continuous function.
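
A quick numerical sanity check of this construction may help (a minimal sketch of my own, not part of the answer: the choice $d=2$, the sample values of $A$ and $r_0$, and the use of `scipy.integrate.quad` are all assumptions). It evaluates $\lVert f\rVert_{L^d}$ and $\lVert\nabla f\rVert_{L^d}$ for the radial profile above and shows both shrinking as $A$ grows and $r_0$ shrinks, while $f(x_0)=1$ throughout.

```python
import numpy as np
from math import gamma, pi
from scipy.integrate import quad

def bump_norms(d, A, r0):
    """L^d norms of the radial cutoff function f and of |grad f|.

    f = 1 for r <= e^{-A} r0, f = log(r0/r)/A on the annulus, f = 0 for r >= r0,
    where r is the distance to the centre x0; the integrals are done radially."""
    omega = 2 * pi ** (d / 2) / gamma(d / 2)   # surface area of the unit sphere
    r_in = np.exp(-A) * r0

    def f_profile(r):
        if r <= r_in:
            return 1.0
        if r < r0:
            return np.log(r0 / r) / A
        return 0.0

    # ||f||_{L^d}^d = omega * int_0^{r0} |f(r)|^d r^{d-1} dr
    f_int = quad(lambda r: f_profile(r) ** d * r ** (d - 1), 0.0, r0,
                 points=[r_in], limit=200)[0]
    # ||grad f||_{L^d}^d = omega * int_{r_in}^{r0} (A r)^{-d} r^{d-1} dr
    g_int = quad(lambda r: (A * r) ** (-d) * r ** (d - 1), r_in, r0, limit=200)[0]
    return (omega * f_int) ** (1 / d), (omega * g_int) ** (1 / d)

# d = 2: both norms shrink while f(x0) = 1 stays fixed.
# Closed form for comparison: ||grad f|| = (omega * A**(-(d-1))) ** (1/d).
for A, r0 in [(1.0, 0.5), (4.0, 0.2), (10.0, 0.05)]:
    fn, gn = bump_norms(2, A, r0)
    print(f"A={A:5.1f}  r0={r0:5.2f}  ||f||={fn:.4f}  ||grad f||={gn:.4f}")
```

The decay of the gradient norm is only of order $A^{-(d-1)/d}$, consistent with the computation above, so large values of $A$ are needed to make it small.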

Instead, consider using the following for $C(f)$, $$ C(f)=a\left(\sum_{i=1}^N\lvert f(x_i)-d_i\rvert^p\right)^{1/p}+b\lVert f\rVert_{L^p}+c\lVert\nabla f\rVert_{L^p} \tag1 $$ for any positive constants $a,b,c$ and $p > d$. The Sobolev embedding theorem and Morrey's inequality bound the $C^{0,\alpha}$-norm of $f$ by a positive multiple of $C(f)$, with $\alpha=1-\frac dp$. This guarantees that the limit of any sequence of continuous $f_n\colon\Omega\to\mathbb R$ with $C(f_n)$ bounded is, at the very least, $\alpha$-Hölder continuous.

Any sequence $f_n$ with $C(f_n)$ converging to $\inf\{C(f)\colon \int f=0\}$ satisfies $$ a\left(\sum_{i=1}^N\lvert f_n(x_i)-f(x_i)\rvert^p\right)^{1/p}+b\lVert f_n-f\rVert_{L^p}+c\lVert\nabla(f_n-f)\rVert_{L^p}\to0 \tag2 $$ where $f\colon\Omega\to\mathbb R$ is the unique continuous function minimising $C(f)$. The convergence of $f_n$, and the existence and uniqueness of the limit $f$, follow from the uniform convexity of the $L^p$ norm for $1 < p < \infty$.
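
For intuition, here is a small numerical sketch of minimising a discretised version of $(1)$. It is entirely my own illustration, not part of the answer: the grid on $[0,1]^2$, the forward-difference gradient, the sample data, and the use of `scipy.optimize.minimize` with an equality constraint standing in for $\int_\Omega f\,dx = 0$ are all assumptions. With $d=2$ and $p=4>d$, the discrete problem has a well-behaved minimiser.

```python
import numpy as np
from scipy.optimize import minimize

# Discretised version of the modified functional (1) on [0,1]^2 (d = 2, p = 4 > d).
n, p = 15, 4
h = 1.0 / (n - 1)
a_w, b_w, c_w = 1.0, 1.0, 1.0                 # the positive constants a, b, c

# hypothetical data points (grid indices) with values summing to zero
pts = [(3, 3), (7, 11), (12, 5)]
vals = np.array([1.0, -0.5, -0.5])

def C(f_flat):
    f = f_flat.reshape(n, n)
    fit = sum(abs(f[i, j] - v) ** p for (i, j), v in zip(pts, vals)) ** (1 / p)
    lp = (np.sum(np.abs(f) ** p) * h * h) ** (1 / p)
    fx = np.diff(f, axis=0)[:, :-1] / h       # forward differences for grad f
    fy = np.diff(f, axis=1)[:-1, :] / h
    lp_grad = (np.sum((fx ** 2 + fy ** 2) ** (p / 2)) * h * h) ** (1 / p)
    return a_w * fit + b_w * lp + c_w * lp_grad

# discrete analogue of the constraint \int_\Omega f dx = 0
zero_mean = {"type": "eq", "fun": lambda f_flat: f_flat.sum() * h * h}

rng = np.random.default_rng(0)
x0 = 0.01 * rng.standard_normal(n * n)        # start away from the kink of |.|^p at 0
res = minimize(C, x0, method="SLSQP", constraints=[zero_mean],
               options={"maxiter": 300, "ftol": 1e-9})
f_opt = res.x.reshape(n, n)
print("C at the numerical minimiser:", C(res.x))
print("fitted values at the data points:", [round(f_opt[i, j], 3) for i, j in pts])
```

This is only a finite-dimensional stand-in for the variational problem, but it reflects the structure of $(1)$: a data-fit term, an $L^p$ penalty, and a gradient penalty, minimised subject to zero mean.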

Second answer

There will not be a continuous optimum, at least for $d = 1$. Assume that we have a solution $g$; we will see how to improve upon it. Write $C = C_1 + C_2 + C_3$ for the three terms of the functional.

Let $i$ be such that $\operatorname{sign}(g(x_i)) \neq \operatorname{sign}(g(x_{i+1}))$, and write $(a,b) = (x_i,x_{i+1})$. For concreteness, we assume that $g(a) > 0$ and that $\int_a^b g\,dx \geq 0$ (the other cases can be handled similarly).

What does $g$ look like on this interval? Typically, it will decrease from $g(a)$ to $g(b)$, in which case the contribution of $(a,b)$ to $C_3(g)$ is $g(a) - g(b)$. However, it could be that $\bar{g} := \max_{[a,b]}g(x) > g(a)$, in order to make $\int_a^b g\,dx$ sufficiently large (to help make $\int_\Omega g\,dx = 0$). In this case the contribution to $C_3(g)$ is $(\bar{g} - g(a)) + (\bar{g} - g(b))$. (Here and below, $C_2$ and $C_3$ refer only to the contributions from the interval $(a,b)$, since $f$ and $g$ will agree elsewhere.) Note that we will never have $\min_{[a,b]}g(x) < g(b)$, since otherwise $g$ could be improved.

We let $f = g$ everywhere except on $(a,b)$, so $C_1(f) = C_1(g)$. We let $k > 0$ be a large number and construct $f$ on $(a,b)$ as follows: i) grow linearly at rate $k$ from $g(a)$ until reaching $\bar{g}$, ii) stay constant on an interval of length $h$, iii) decrease linearly at rate $k$ to $0$, iv) stay at $0$, v) decrease linearly at rate $k$ to $g(b)$ at the end of the interval. Given $k$, we can always choose $h$ so that $\int_a^b f\,dx = \int_a^b g\,dx$. (In the case $\bar{g} \leq g(a)$, step i) becomes void, and in the case $\int_a^b g\,dx = 0$, step ii) becomes void.)

The construction obviously gives $C_3(f) = C_3(g)$. But as $k$ increases, $C_2(f) = \int_a^b |f|\,dx$ decreases towards $\int_a^b f\,dx = \int_a^b g\,dx$, which is strictly smaller than $C_2(g) = \int_a^b |g|\,dx$ because $g$ is negative near $b$. So for $k$ large enough, $C(f) < C(g)$, which contradicts the optimality of $g$.
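
A small numerical check of this construction (my own sketch: the interval is taken to be $(0,1)$, and the values of $g(a)$, $g(b)$, $\bar g$ and the target area $S=\int_a^b g\,dx$ are made up). It builds $f$ for increasing $k$ and confirms that the total-variation term stays constant while $\int_a^b |f|\,dx$ decreases towards $S$.

```python
import numpy as np

# Check of the d = 1 construction on (a, b) = (0, 1), with made-up values
# ga = g(a) > 0 > gb = g(b), a maximum gbar >= ga, and target area S >= 0.
a, b = 0.0, 1.0
ga, gb, gbar, S = 0.5, -0.4, 0.8, 0.2

def build_f(k):
    """Breakpoints of the piecewise-linear f: rise at rate k from ga to gbar,
    plateau of length h_len, drop at rate k to 0, stay at 0, drop at rate k to gb."""
    t1, t3, t5 = (gbar - ga) / k, gbar / k, -gb / k       # lengths of the sloped pieces
    sloped_area = (ga + gbar) / 2 * t1 + gbar / 2 * t3 + gb / 2 * t5
    h_len = (S - sloped_area) / gbar                      # plateau length so that the area is S
    xs = np.array([a, a + t1, a + t1 + h_len, a + t1 + h_len + t3, b - t5, b])
    ys = np.array([ga, gbar, gbar, 0.0, 0.0, gb])
    return xs, ys

grid = np.linspace(a, b, 200_001)
dx = grid[1] - grid[0]
for k in [5, 50, 500, 5000]:
    xs, ys = build_f(k)
    f = np.interp(grid, xs, ys)
    area = np.sum(f) * dx                    # \int_a^b f dx, should stay equal to S
    C2 = np.sum(np.abs(f)) * dx              # \int_a^b |f| dx
    C3 = np.sum(np.abs(np.diff(f)))          # total variation = \int_a^b |f'| dx
    print(f"k={k:5d}  area={area:.4f}  C2={C2:.4f} -> S={S}  C3={C3:.4f} (constant)")
```

With these sample numbers, $C_3$ stays at $(\bar g - g(a)) + \bar g + |g(b)| = 1.5$ for every $k$, while $C_2$ decreases from about $0.23$ towards $S = 0.2$.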

The "problem" is that $C(f)$ does not penalise higher derivatives, which the construction above exploits. From a statistical learning perspective, this is usually solved by restricting the minimization of $C$-like functionals to some class of functions (the simplest examples would be ridge or lasso regressions).