Can the calculus of variations be used to optimize subject to the constraint that the desired function must pass through specific intermediate points?


I am wondering about this. As one may know, the "calculus of variations" is a method for solving optimization problems in which the object to be optimized is a function, and the quantity being optimized is some functional (a map that assigns a number to each function). The most basic type of problem is one in which the function in question is suitably smooth and of the form

$$f : [a, b] \rightarrow \mathbb{R}$$

and we seek the minimization of a functional $F$ typically given by an integral:

$$F[f] = \int_{a}^{b} L\left(f(t), f'(t), f''(t), \cdots, f^{(n)}(t)\right) dt$$

subject to the constraint that the endpoint values $f(a) = f_a$ and $f(b) = f_b$ are given. Through the methods of the calculus, one can convert such a problem into a differential-equations problem: the Euler-Lagrange equations.

However, what happens if we don't necessarily want the most minimal solution, but instead want to "peg" the function at one or more intermediate points in the domain interval? That is, we want to add one or more additional constraints

$$\begin{align}f(t_1) &= f_1\\ f(t_2) &= f_2\\ &\cdots\\ f(t_N) &= f_N\end{align}$$

where $a < t_1 < t_2 < \cdots < t_N < b$. Given this, can we modify the usual Euler-Lagrange method to solve it, and if so, how? As a simple example (and related to what I'm trying to do with this), suppose we take the domain interval $[a, b] := [0, 2]$ and want to minimize

$$F[f] := \int_{0}^{2} [f''(t)]^2\ dt$$

subject to the three constraints

$$f(0) = 1,\ f(1) = 2,\ f(2) = 4.$$

Note that the minimizer here will not be the absolute minimum of the functional, nor even the minimum of the problem with only the two endpoints fixed (such minimizers are linear, and the points $(0, 1)$, $(1, 2)$ and $(2, 4)$ clearly do not lie along a straight line!), as in the usual kind of problem. Rather, it is a minimum that arises only by virtue of the additional constraint. How do we find it?

The simplest thought that I had would be to just try to think about it as two minimizations on $[0, 1]$ and $[1, 2]$, seeking two functions $f_{[0, 1]}$ and $f_{[1, 2]}$ so that

$$f(x) = \begin{cases} f_{[0, 1]}(x)\ \mbox{if $x \in [0, 1]$}\\ f_{[1, 2]}(x)\ \mbox{if $x \in [1, 2]$} \end{cases}$$

and $f_{[0, 1]}(1) = f_{[1, 2]}(1) = 2$, minimizing the corresponding "sub"-functionals of the original, with the bounds of the defining integral changed accordingly. However, this is clearly not guaranteed to minimize the original functional: the two pieces can meet in a "corner" at $x = 1$, where $f$ fails to be differentiable, so we cannot even evaluate the original functional at such a solution, much less minimize it under these constraints. What, then, do we need to avoid this?
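
The corner can be seen concretely. A minimal Python sketch, using the standard fact that with only endpoint values fixed, the minimizer of $\int (f'')^2\,dt$ on each subinterval is the straight line through its endpoints:

```python
# Naive splitting: minimize separately on [0, 1] and [1, 2].
# With only endpoint values fixed, the minimizer of each
# sub-functional int (f'')^2 dt is the straight line through
# its two endpoints (f'' = 0, so each piece contributes 0).
def left(x):   # line through (0, 1) and (1, 2)
    return 1 + x

def right(x):  # line through (1, 2) and (2, 4)
    return 2 * x

# The pieces agree at x = 1 ...
assert left(1) == right(1) == 2
# ... but their slopes differ (1 vs 2), so the glued function
# has a corner at x = 1 and f'' fails to exist there.
assert left(1) - left(0) != right(2) - right(1)
```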


There are 2 answers below.


First of all, we should define the "family" of functions under consideration. That could be polynomials, trigonometric polynomials, etc., or even a wider class of functions. Trigonometric polynomials could be interesting, because in the example you give of minimizing $\int_{0}^{2} [f''(t)]^2\ dt$, Parseval's theorem could simplify the computations. However, if the points are not equally spaced, trigonometric polynomials are not easy to apply.

In general, given the values that the function has to take at $n$ points $t_1,t_2, \cdots, t_n$, as a first step we can determine the polynomial $p_{n-1}(t)$ of degree $n-1$ (or lower) passing through those points, e.g. by Lagrange interpolation. That polynomial will be the "simplest" analytic function passing through the given points.

Then the polynomial $$ q_{\,n} (t) = \left( {t - t_{\,1} } \right)\left( {t - t_{\,2} } \right) \cdots \left( {t - t_{\,n} } \right) $$ vanishes at the given points. Therefore, if we write the function as $$ f(t) = p_{\,n - 1} (t) + q_{\,n} (t)g(t) $$ where $g(t)$ is any function within the chosen class (in particular, analytic), then $f$ satisfies the interpolation constraints automatically, and we can optimize over $g$ without any further restrictions.
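To see that the constraints hold for *any* choice of $g$, here is a short Python sketch using exact rational arithmetic and the data of the example ($f(0)=1$, $f(1)=2$, $f(2)=4$); the helper names are mine, not from the answer:

```python
from fractions import Fraction as F

# Interpolation data from the example: f(0)=1, f(1)=2, f(2)=4.
ts = [F(0), F(1), F(2)]
fs = [F(1), F(2), F(4)]

def p(t):
    """Lagrange interpolating polynomial through the given points."""
    total = F(0)
    for i, (ti, fi) in enumerate(zip(ts, fs)):
        term = fi
        for j, tj in enumerate(ts):
            if j != i:
                term *= (t - tj) / (ti - tj)
        total += term
    return total

def q(t):
    """Polynomial q_n vanishing at every node."""
    prod = F(1)
    for ti in ts:
        prod *= (t - ti)
    return prod

def f(t, g):
    """f = p + q*g passes through the nodes for *any* g."""
    return p(t) + q(t) * g(t)

# Whatever g we pick, the constraints hold automatically:
for g in (lambda t: 0, lambda t: t, lambda t: t * t - 7):
    assert all(f(ti, g) == fi for ti, fi in zip(ts, fs))
```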

If we want to remain within the class of polynomials, we can instead add some additional points $t_{n+1}, t_{n+2}, \cdots, t_{n+m}$, inside the range of the previous $n$ points and/or outside it, at which we leave the value of $f(t)$ unspecified (as variable parameters). We then construct a Lagrange polynomial $p_{n+m-1}(t)$ which takes the prescribed values at the $n$ points and the unspecified parametric values at the other $m$ points; the latter are the parameters subject to optimization.

That premised, we come to the example you give.

You want to minimize the square of the second derivative $$ I(f) = \int_0^2 {f''(t)^{\,2} dt} \quad \left| {\;f(0) = 1,\;\;f(1) = 2,\;\;f(2) = 4} \right. $$

a) No doubt the broadest solution is given by a pair of straight lines, with $I = 0$: $$ \eqalign{ & f(x) = \cr & = \left( {1 + x} \right)\left[ {x < 1} \right] + \left( {2x} \right)\left[ {1 \le x} \right] = \cr & = \left( {1 + x} \right) + \left( {x - 1} \right)\left[ {1 \le x} \right] = \cr & = \left( {1 + x} \right) + \left( {x - 1} \right)H(x - 1) = \cr & = 2 + \left( {{3 \over 2} + {1 \over 2}{\mathop{\rm sgn}} (x - 1)} \right)\left( {x - 1} \right) \cr} $$ where
- $[P]$ denotes the Iverson bracket;
- $H(x)$ is the Heaviside step function (with $H(0)=1$);
- $\operatorname{sgn}(x)$ is the sign function.

b) The rounded (smoothed) piecewise solution given by @Joriki in the other answer has $I = 1.5$.

c) We could make the function in a) continuous by replacing the sign function with one of its continuous approximations, for instance $$ {\mathop{\rm sgn}} (x) \approx {x \over {\sqrt {x^{\,2} + \varepsilon ^{\,2} } }} + \left( {1 - {1 \over {\sqrt {1 + \varepsilon ^{\,2} } }}} \right)x $$ where the second term corrects the approximation back to exactly $\pm 1$ at $x = \pm 1$, so that the function still passes through the assigned points.
The corresponding integral has a minimum for $\varepsilon \approx 0.817 $ giving $I \approx 1.514$.
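That the $\varepsilon$-smoothing preserves the interpolation constraints for *every* $\varepsilon$ can be checked directly; the linear correction term is exactly what pins $\operatorname{sgn}(\pm 1)$ to $\pm 1$. A quick Python sketch (function names are mine):

```python
import math

def s(x, eps):
    # Smoothed sign with a linear correction so that s(+-1, eps) = +-1 exactly.
    return x / math.sqrt(x**2 + eps**2) + (1 - 1 / math.sqrt(1 + eps**2)) * x

def f(x, eps):
    # Continuous version of the two-line solution from a).
    return 2 + (1.5 + 0.5 * s(x - 1, eps)) * (x - 1)

# The constraints f(0)=1, f(1)=2, f(2)=4 hold for every eps:
for eps in (0.1, 0.817, 2.0):
    assert abs(s(1, eps) - 1) < 1e-12 and abs(s(-1, eps) + 1) < 1e-12
    assert abs(f(0, eps) - 1) < 1e-12
    assert abs(f(1, eps) - 2) < 1e-12
    assert abs(f(2, eps) - 4) < 1e-12
```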

d) If we want to remain within the polynomial ring, then, as in the premise, let us put $$ \left\{ \matrix{ p_{\,2} (x) = \left( {x^{\,2} + x + 2} \right)/2 \hfill \cr q_{\,3} (x) = x\left( {x - 1} \right)\left( {x - 2} \right) \hfill \cr} \right. $$

Putting $g(x) \equiv 0$ we are going to have $$ f(x) = p_{\,2} (x)\quad \Rightarrow \quad f''(x) = 1\quad \Rightarrow \quad I = 2 $$

Taking instead $g(x)$ to be a first-degree polynomial and minimizing, we get $$ g(x) = - {5 \over {42}}\left( {x - 1} \right)\quad \Rightarrow \quad I = 32/21 \approx 1.52 $$ while taking it to be of third degree, we get $$ g(x) = {7 \over {170}}\left( {x - 3} \right)\left( {x - 1} \right)\left( {x + 1} \right)\quad \Rightarrow \quad I = 128/85 \approx 1.506 $$
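The first-degree case can be reproduced with exact rational arithmetic. A Python sketch of the construction $f = p_2 + q_3\,g$ with $g(x) = a(x-1)$; the polynomial helpers are my own names, not from the answer:

```python
from fractions import Fraction as F

# Polynomials as coefficient lists, lowest degree first (exact arithmetic).
def pmul(p, q):
    r = [F(0)] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            r[i + j] += a * b
    return r

def padd(p, q):
    n = max(len(p), len(q))
    return [(p[k] if k < len(p) else F(0)) + (q[k] if k < len(q) else F(0))
            for k in range(n)]

def d2(p):
    # Second derivative of a polynomial.
    return [c * k * (k - 1) for k, c in enumerate(p)][2:] or [F(0)]

def int02(p):
    # Definite integral over [0, 2].
    return sum(c * F(2) ** (k + 1) / (k + 1) for k, c in enumerate(p))

p2 = [F(1), F(1, 2), F(1, 2)]      # (x^2 + x + 2)/2
q3 = [F(0), F(2), F(-3), F(1)]     # x(x-1)(x-2)

def I(a):
    # Functional value for f = p2 + q3 * a*(x - 1).
    f = padd(p2, pmul(q3, [-a, a]))
    return int02(pmul(d2(f), d2(f)))

assert I(F(0)) == 2                 # the g = 0 case above
assert I(F(-5, 42)) == F(32, 21)    # the quoted minimum
# a = -5/42 is indeed the minimizer of the quadratic a -> I(a):
assert I(F(-5, 42) - F(1, 100)) > I(F(-5, 42)) < I(F(-5, 42) + F(1, 100))
```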


To get a well-defined variational problem with Euler–Lagrange equations for a functional involving the second derivative, you need boundary values for the first derivative (see Wikipedia). For a single interval, if you don’t have any, you can treat them as free variables and minimize the value of the functional with respect to them.

In your case, the Euler–Lagrange equation is $f^{(4)}=0$, which is solved by all cubic polynomials. By minimizing the functional with respect to the derivatives at the boundary, you get the linear functions that you assumed to be the only solution.

For your two intervals with an intermediate point, you can minimize the functional with respect to the first derivatives at all three points, but that no longer leads to a linear function because, as you note, that's not possible.

The calculation is easiest if you write $f(x)$ in each interval as the linear function connecting the endpoints of the interval plus a linear combination of cubic polynomials that are zero at the endpoints and have derivative $1$ at one endpoint and $0$ at the other. The result is a cubic Hermite spline. If we denote the linear functions between the endpoints by $l_1$ and $l_2$, then on the first interval we have

\begin{eqnarray} f(x)&=&l_1(x)+c_0x(x-1)^2+c_1(x-1)x^2\;,\\ f''(x)&=&c_0(6x-4)+c_1(6x-2)\;, \end{eqnarray}

with coefficients $c_0$, $c_1$ to be determined. It’s convenient to use $y=x-1$ on the second interval; then on the second interval we have

\begin{eqnarray} f(y)&=&l_2(y)+c_2y(y-1)^2+c_3(y-1)y^2\;,\\ f''(y)&=&c_2(6y-4)+c_3(6y-2)\;. \end{eqnarray}

The condition for the continuity of the first derivative at the intermediate point is

$$ f(1)-f(0)+c_1=f(2)-f(1)+c_2\;,\tag1\label{constraint} $$

with the differences of function values coming from the linear functions and the coefficients coming from the Hermite interpolation polynomials corresponding to the intermediate point.

The value of the functional is

$$ \int_0^2(f''(x))^2\mathrm dx=4(c_0^2+c_0c_1+c_1^2+c_2^2+c_2c_3+c_3^2)\;. $$
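This closed form can be sanity-checked with exact rational arithmetic; here is a Python sketch that integrates the per-interval quadratic $\int_0^1 (c_0(6x-4)+c_1(6x-2))^2\,dx$ term by term (helper names are mine):

```python
from fractions import Fraction as F

def exact_int01(coeffs):
    # Integral over [0, 1] of a polynomial given by its coefficients,
    # lowest degree first.
    return sum(c / (k + 1) for k, c in enumerate(coeffs))

def energy(c0, c1):
    # f'' = c0*(6x-4) + c1*(6x-2); square it as a polynomial in x.
    a0 = -4 * c0 - 2 * c1          # constant term of f''
    a1 = 6 * (c0 + c1)             # x coefficient of f''
    sq = [a0 * a0, 2 * a0 * a1, a1 * a1]
    return exact_int01(sq)

# Check the closed form 4*(c0^2 + c0*c1 + c1^2) on a few rational values:
for c0, c1 in [(F(1), F(0)), (F(0), F(1)), (F(1), F(1)), (F(-1, 4), F(1, 2))]:
    assert energy(c0, c1) == 4 * (c0**2 + c0 * c1 + c1**2)
```

The second interval contributes the same expression in $c_2$, $c_3$, which gives the total quoted above.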

Minimizing it with respect to the coefficients under the constraint \eqref{constraint} yields the equations

\begin{eqnarray} 2c_0+c_1&=&0\;,\\ c_0+2c_1&=&\lambda\;,\\ 2c_2+c_3&=&-\lambda\;,\\ c_2+2c_3&=&0\;. \end{eqnarray}

Solving the first and last equation and substituting into the second and third yields

$$ \pmatrix{c_0\\c_1\\c_2\\c_3}=\frac\lambda3\pmatrix{-1\\2\\-2\\1}\;. $$
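As a quick check, the vector $\frac\lambda3(-1, 2, -2, 1)$ does satisfy all four stationarity equations for any $\lambda$ (a small Python verification in exact arithmetic):

```python
from fractions import Fraction as F

lam = F(3)  # any value works; the solution is proportional to lambda
c0, c1, c2, c3 = -lam / 3, 2 * lam / 3, -2 * lam / 3, lam / 3

# The four stationarity equations from the Lagrange-multiplier condition:
assert 2 * c0 + c1 == 0
assert c0 + 2 * c1 == lam
assert 2 * c2 + c3 == -lam
assert c2 + 2 * c3 == 0
```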

Then the constraint \eqref{constraint} yields

$$ \lambda=\frac34(f(2)-2f(1)+f(0))=\frac34\;, $$

and thus

$$ \pmatrix{c_0\\c_1\\c_2\\c_3}=\frac14\pmatrix{-1\\2\\-2\\1}\;. $$

So the minimizing function is

\begin{eqnarray} f(x) &=& 1+x-\frac14x(x-1)^2+\frac12(x-1)x^2 \\ &=& \frac14\left(x^3+3x+4\right) \end{eqnarray}

on the first interval and

\begin{eqnarray} f(y) &=& 2+2y-\frac12y(y-1)^2+\frac14(y-1)y^2 \\ &=& \frac14\left(-y^3+3y^2+6y+8\right) \end{eqnarray}

on the second interval.
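The resulting spline can be verified end to end: it interpolates the three points, its first derivative is continuous at $x = 1$, and its functional value is $3/2$, matching the $I = 1.5$ quoted for this solution in the other answer. A Python sketch (the second piece is the expansion of $2+2y-\frac12y(y-1)^2+\frac14(y-1)y^2$):

```python
# The two cubic pieces of the minimizer (second piece in y = x - 1).
def f1(x):
    return (x**3 + 3 * x + 4) / 4

def f2(y):
    return (-y**3 + 3 * y**2 + 6 * y + 8) / 4

# Interpolation constraints f(0)=1, f(1)=2, f(2)=4:
assert f1(0) == 1 and f1(1) == 2 == f2(0) and f2(1) == 4

# C^1 continuity at the intermediate point x = 1:
# f1'(x) = (3x^2 + 3)/4 and f2'(y) = (-3y^2 + 6y + 6)/4 both equal 3/2 there.
assert (3 * 1**2 + 3) / 4 == (-3 * 0**2 + 6 * 0 + 6) / 4 == 1.5

# Functional value: f1'' = 3x/2 and f2'' = 3(1-y)/2, so
# I = int_0^1 (3x/2)^2 dx + int_0^1 (3(1-y)/2)^2 dy = 3/4 + 3/4 = 3/2.
I = 3 / 4 + 3 / 4
assert I == 1.5
```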