What does $\delta$ mean in the calculus of variations?


In a calculus of variations problem, let the functional be $J = \int_{x_1}^{x_2} F(x, y, y') \,dx$. When $x_1$ and $x_2$ are both fixed, the variation of $J$ is defined by the Gâteaux variation:

$\delta J(y, h) = \frac{dJ(y+\epsilon h)}{d\epsilon}\big\vert_{\epsilon = 0}$, where $h=h(x)$ is an arbitrary admissible function and $\epsilon$ is a scalar. The Euler-Lagrange equation can then be derived by setting $\delta J(y, h) = 0$.
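For a concrete illustration of this definition (the integrand, interval, and perturbation below are my own choices, not from the references), sympy can compute the Gâteaux variation directly:

```python
# Sketch: Gateaux variation of J[y] = ∫ y'^2 dx on [0, 1], with the
# straight line y*(x) = x and perturbation h(x) = sin(pi*x), which
# vanishes at both (fixed) endpoints.
import sympy as sp

x, eps = sp.symbols('x epsilon')
y_star = x                      # candidate extremal
h = sp.sin(sp.pi * x)           # admissible variation, h(0) = h(1) = 0
y = y_star + eps * h

F = sp.diff(y, x)**2            # integrand F(x, y, y') = y'^2
J = sp.integrate(F, (x, 0, 1))  # J as a function of epsilon

# Gateaux variation: d/d(eps) J(y* + eps*h) at eps = 0
delta_J = sp.diff(J, eps).subs(eps, 0)
print(sp.simplify(delta_J))     # 0: the straight line is an extremal
```

The vanishing of `delta_J` for every admissible $h$ is exactly the condition from which the Euler-Lagrange equation follows.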

When $x_1$ and $x_2$ are NOT fixed, the general variation of $J$ is derived by Gelfand and Fomin as follows:

[Derivation of the general variation of a functional]

[Derivation of the general variation of a functional, continued]

My questions are:

  1. Can we derive the general variation of a functional ($x_1$ and $x_2$ NOT fixed) using the Gâteaux variation, so that we may again express $\delta J$ as a limit or a derivative with respect to $\epsilon$?
  2. When $x$ is the independent variable, are $d x$ and $\delta x$ different?
  3. When $y = y(x)$, what is $\delta y$? Is it true that $\delta y = \epsilon h(x)$? Here $\epsilon$ is an infinitesimal scalar and $h(x)$ is an arbitrary admissible function.
  4. Is it true that $\delta$ operates just like the differential $d$ in deriving equations? What are the differences?

I tried to derive $\delta J$ by making $J$ a function of four independent scalar variables $\epsilon_1$, $\epsilon_2$, $\epsilon_3$ and $\epsilon_4$. The meaning of these variables can be seen in the following figure: [Figure: construction of a smooth function $y(x)$]

In this figure, $x_0$ and $x_1$ are the known endpoint coordinates and $y^{\ast}(x)$ is the curve that minimizes the functional $J$; $a$, $b$, $c$ and $d$ are arbitrary constants. The curve $y(x)$ is defined as:

$y(x) = y^{\ast}(x) + ex + f$. Here $x_0+\epsilon_1a \leq x \leq x_1+\epsilon_3c$. Correspondingly, $y_0+\epsilon_2b \leq y \leq y_1+\epsilon_4d$.

In order to make $y(x) = y^{\ast}(x)$ when $\epsilon_1 = 0$, $\epsilon_2 = 0$, $\epsilon_3 = 0$ and $\epsilon_4 = 0$, it is required that

$e = \frac{[y_1 + \epsilon_4d - y^{\ast}(x_1 + \epsilon_3c)]-[y_0 + \epsilon_2b - y^{\ast}(x_0 + \epsilon_1a)]}{(x_1+\epsilon_3c)-(x_0+\epsilon_1a)}$, and

$f = [y_1 + \epsilon_4d - y^{\ast}(x_1 + \epsilon_3c)] - \frac{[y_1 + \epsilon_4d - y^{\ast}(x_1 + \epsilon_3c)]-[y_0 + \epsilon_2b - y^{\ast}(x_0 + \epsilon_1a)]}{(x_1+\epsilon_3c)-(x_0+\epsilon_1a)} (x_1 + \epsilon_3c)$.
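A quick symbolic check (with made-up numeric constants and a made-up stand-in for $y^{\ast}$) confirms that these choices of $e$ and $f$ make $y(x)$ pass through the shifted endpoints $(x_0+\epsilon_1 a,\, y_0+\epsilon_2 b)$ and $(x_1+\epsilon_3 c,\, y_1+\epsilon_4 d)$:

```python
# Sketch: verify the endpoint conditions of y(x) = y*(x) + e*x + f
# for the e and f defined above. All concrete values are arbitrary.
import sympy as sp

x = sp.symbols('x')
e1, e2, e3, e4 = sp.symbols('epsilon1:5')
a, b, c, d = 2, -1, 3, 5        # arbitrary constants
x0, x1 = 0, 1
y_star = x**2 + 1               # any smooth stand-in curve
y0 = y_star.subs(x, x0)         # y0 = y*(x0)
y1 = y_star.subs(x, x1)         # y1 = y*(x1)

X0, X1 = x0 + e1*a, x1 + e3*c   # shifted abscissas
Y0, Y1 = y0 + e2*b, y1 + e4*d   # shifted ordinates

e = ((Y1 - y_star.subs(x, X1)) - (Y0 - y_star.subs(x, X0))) / (X1 - X0)
f = (Y1 - y_star.subs(x, X1)) - e * X1
y = y_star + e*x + f

# y(x) hits both perturbed endpoints identically in the epsilons
assert sp.simplify(y.subs(x, X0) - Y0) == 0
assert sp.simplify(y.subs(x, X1) - Y1) == 0
print("endpoint conditions hold")
```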

Curve $y(x)$ represents a type of 'general' curve that is different from the optimal curve $y^{\ast}(x)$ in terms of its magnitude and two end points. Curve $g(x)$ in the figure can be ignored for now. The functional $J$ is thus:

$J = \ \int_{x_0+\epsilon_1a}^{x_1+\epsilon_3c} F(x, y, y') \,dx $, which is a function of $\epsilon_1$, $\epsilon_2$, $\epsilon_3$ and $\epsilon_4$. The differential of $J$ is:

$dJ = \frac{\partial J}{\partial \epsilon_1}d\epsilon_1 + \frac{\partial J}{\partial \epsilon_2}d\epsilon_2 + \frac{\partial J}{\partial \epsilon_3}d\epsilon_3+ \frac{\partial J}{\partial \epsilon_4}d\epsilon_4$.

Since the lower and upper limits of integration are not constant, the Leibniz integral rule for differentiation under the integral sign has to be used:

$\frac{\partial J}{\partial \epsilon_i} = F(x_1+\epsilon_3c, y, y^{'})\frac{d(x_1+\epsilon_3c)}{d\epsilon_i} - F(x_0+\epsilon_1a, y, y^{'})\frac{d(x_0+\epsilon_1a)}{d\epsilon_i} + \ \int_{x_0+\epsilon_1a}^{x_1+\epsilon_3c} \frac{\partial F(x, y, y')}{\partial \epsilon_i} \,dx $, $i = 1, 2, 3, 4$.
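The Leibniz rule used here can be sanity-checked on a simpler concrete integral with variable limits (the integrand and limits below are made up purely for illustration):

```python
# Sketch: Leibniz rule for I(t) = ∫_{t}^{t^2} sin(t*x) dx, comparing the
# direct derivative of the evaluated integral with the Leibniz expansion
#   I'(t) = F(hi)*hi' - F(lo)*lo' + ∫ ∂F/∂t dx.
import sympy as sp

x, t = sp.symbols('x t', positive=True)
F = sp.sin(t * x)
lo, hi = t, t**2

I = sp.integrate(F, (x, lo, hi))
direct = sp.diff(I, t)

leibniz = (F.subs(x, hi) * sp.diff(hi, t)
           - F.subs(x, lo) * sp.diff(lo, t)
           + sp.integrate(sp.diff(F, t), (x, lo, hi)))

print(sp.simplify(direct - leibniz))  # 0
```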

$\frac{\partial F(x, y, y')}{\partial \epsilon_i} = \frac{\partial F(x, y, y')}{\partial y} \frac{\partial y}{\partial \epsilon_i} + \frac{\partial F(x, y, y')}{\partial y^{'}} \frac{\partial y^{'}}{\partial \epsilon_i}$,

$\frac{\partial y}{\partial \epsilon_i} = \frac{\partial e}{\partial \epsilon_i}x + \frac{\partial f}{\partial \epsilon_i}$,

$\frac{\partial y^{'}}{\partial \epsilon_i} = \frac{\partial e}{\partial \epsilon_i}$. Then

$\int_{x_0+\epsilon_1a}^{x_1+\epsilon_3c} \frac{\partial F(x, y, y')}{\partial \epsilon_i} \,dx = \frac{\partial e}{\partial \epsilon_i} \int_{x_0+\epsilon_1a}^{x_1+\epsilon_3c} [\frac{\partial F(x, y, y')}{\partial y}x + \frac{\partial F(x, y, y')}{\partial y^{'}}]\,dx + \frac{\partial f}{\partial \epsilon_i} \int_{x_0+\epsilon_1a}^{x_1+\epsilon_3c} \frac{\partial F(x, y, y')}{\partial y} \,dx$.

Let us use $*$ to denote evaluation at optimality, i.e. at $\epsilon_1 = \epsilon_2 = \epsilon_3 = \epsilon_4 = 0$. Then

$(dJ)^* = (\frac{\partial J}{\partial \epsilon_1})^*d\epsilon_1 + (\frac{\partial J}{\partial \epsilon_2})^*d\epsilon_2 + (\frac{\partial J}{\partial \epsilon_3})^*d\epsilon_3+ (\frac{\partial J}{\partial \epsilon_4})^*d\epsilon_4$,

$(\frac{\partial J}{\partial \epsilon_i})^* = F(x_1, y^*, y^{*'})\frac{d(x_1+\epsilon_3c)}{d\epsilon_i} - F(x_0, y^*, y^{*'})\frac{d(x_0+\epsilon_1a)}{d\epsilon_i} + \ \int_{x_0}^{x_1} \frac{\partial F(x, y^*, y^{*'})}{\partial \epsilon_i} \,dx $.

$\int_{x_0}^{x_1} \frac{\partial F(x, y^{*}, y^{*'})}{\partial \epsilon_i} \,dx = (\frac{\partial e}{\partial \epsilon_i})^* \int_{x_0}^{x_1} [\frac{\partial F(x, y^*, y^{*'})}{\partial y^*}x + \frac{\partial F(x, y^*, y^{*'})}{\partial y^{*'}}]\,dx + (\frac{\partial f}{\partial \epsilon_i})^* \int_{x_0}^{x_1} \frac{\partial F(x, y^*, y^{*'})}{\partial y^*} \,dx$.

At optimality, the curve $y^*(x)$ satisfies the Euler equation:

$\frac{\partial F(x, y^*, y^{*'})}{\partial y^*} = \frac{d}{dx} \frac{\partial F(x, y^*, y^{*'})}{\partial y^{*'}}$. Substituting this into the previous equation and integrating by parts (the Euler equation makes the remaining integrand vanish, so only the boundary terms survive), one obtains:

$\int_{x_0}^{x_1} \frac{\partial F(x, y^{*}, y^{*'})}{\partial \epsilon_i} \,dx = (\frac{\partial e}{\partial \epsilon_i})^* [\frac{\partial F(x, y^*, y^{*'})}{\partial y^{*'}} x ]|_{x=x_0}^{x=x_1} + (\frac{\partial f}{\partial \epsilon_i})^* [\frac{\partial F(x, y^*, y^{*'})}{\partial y^{*'}} ]|_{x=x_0}^{x=x_1} $.

It can be derived that:

$(\frac{\partial e}{\partial \epsilon_1})^{*} = \frac{a y^{*'}(x_0)}{x_1-x_0}$,

$(\frac{\partial e}{\partial \epsilon_2})^{*} = \frac{-b}{x_1-x_0}$,

$(\frac{\partial e}{\partial \epsilon_3})^{*} = \frac{-c y^{*'}(x_1)}{x_1-x_0}$,

$(\frac{\partial e}{\partial \epsilon_4})^{*} = \frac{d}{x_1-x_0}$,

$(\frac{\partial f}{\partial \epsilon_1})^{*} = -y^{*'}(x_0) \frac{a x_1}{x_1-x_0}$,

$(\frac{\partial f}{\partial \epsilon_2})^{*} = \frac{b x_1}{x_1-x_0}$,

$(\frac{\partial f}{\partial \epsilon_3})^{*} = y^{*'}(x_1) \frac{c x_0}{x_1-x_0}$,

$(\frac{\partial f}{\partial \epsilon_4})^{*} = \frac{-d x_0}{x_1-x_0}$.
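These eight starred derivatives can be verified symbolically, using a made-up concrete curve in place of $y^{\ast}$ (note that the $(\partial e/\partial \epsilon_3)^*$ entry comes out with $y^{*'}(x_1)$):

```python
# Sketch: verify the starred derivatives of e and f with a concrete
# stand-in curve y*(x) = x^3 on [1, 3]; a, b, c, d stay symbolic.
import sympy as sp

x = sp.symbols('x')
e1, e2, e3, e4 = sp.symbols('epsilon1:5')
a, b, c, d = sp.symbols('a b c d')
x0, x1 = 1, 3
ystar = x**3
y0, y1 = ystar.subs(x, x0), ystar.subs(x, x1)

X0, X1 = x0 + e1*a, x1 + e3*c
Y0, Y1 = y0 + e2*b, y1 + e4*d
e = ((Y1 - ystar.subs(x, X1)) - (Y0 - ystar.subs(x, X0))) / (X1 - X0)
f = (Y1 - ystar.subs(x, X1)) - e * X1

star = {e1: 0, e2: 0, e3: 0, e4: 0}
de = [sp.simplify(sp.diff(e, ei).subs(star)) for ei in (e1, e2, e3, e4)]
df = [sp.simplify(sp.diff(f, ei).subs(star)) for ei in (e1, e2, e3, e4)]

D = x1 - x0
yp = sp.diff(ystar, x)
yp0, yp1 = yp.subs(x, x0), yp.subs(x, x1)   # y*'(x0) = 3, y*'(x1) = 27

assert sp.simplify(de[0] - a*yp0/D) == 0
assert sp.simplify(de[1] + b/D) == 0
assert sp.simplify(de[2] + c*yp1/D) == 0    # uses y*'(x1)
assert sp.simplify(de[3] - d/D) == 0
assert sp.simplify(df[0] + yp0*a*x1/D) == 0
assert sp.simplify(df[1] - b*x1/D) == 0
assert sp.simplify(df[2] - yp1*c*x0/D) == 0
assert sp.simplify(df[3] + d*x0/D) == 0
print("all eight starred derivatives confirmed")
```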

Substituting the equations above into $(dJ)^*$ gives:

$(dJ)^* = d\cdot(F_{y^{*'}})_{x=x_1} d\epsilon_4 - b\cdot(F_{y^{*'}})_{x=x_0} d\epsilon_2 + c\cdot[F-(F_{y^{*'}})y^{*'}]_{x=x_1}d\epsilon_3 - a\cdot[F-(F_{y^{*'}})y^{*'}]_{x=x_0}d\epsilon_1$.
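As a sanity check (with made-up constants), this final formula can be verified for the concrete problem $F = y'^2$ on $[0,1]$ with extremal $y^*(x) = x$, where $F_{y'} = 2y'$ and $F - y' F_{y'} = -y'^2$:

```python
# Sketch: compare the four partial derivatives of J at optimality with
# the coefficients predicted by the (dJ)* formula, for F = y'^2.
import sympy as sp

x = sp.symbols('x')
e1, e2, e3, e4 = sp.symbols('epsilon1:5')
a, b, c, d = 2, 4, 3, 5           # arbitrary constants
x0, x1 = 0, 1
ystar = x                          # extremal of F = y'^2 (a straight line)
y0, y1 = 0, 1

X0, X1 = x0 + e1*a, x1 + e3*c
Y0, Y1 = y0 + e2*b, y1 + e4*d
e = ((Y1 - ystar.subs(x, X1)) - (Y0 - ystar.subs(x, X0))) / (X1 - X0)
f = (Y1 - ystar.subs(x, X1)) - e * X1
y = ystar + e*x + f

J = sp.integrate(sp.diff(y, x)**2, (x, X0, X1))
star = {e1: 0, e2: 0, e3: 0, e4: 0}
grads = [sp.diff(J, ei).subs(star) for ei in (e1, e2, e3, e4)]

# Predicted by (dJ)*, with y*' = 1, F = 1, F_{y'} = 2, F - y'F_{y'} = -1:
#   deps1: -a*(-1) = a     deps2: -b*2 = -2b
#   deps3:  c*(-1) = -c    deps4:  d*2 = 2d
assert [sp.simplify(g) for g in grads] == [a, -2*b, -c, 2*d]
print("(dJ)* formula confirmed for F = y'^2")
```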

Is the above derivation correct? If so, compare it with Eq. (5) from Fomin:

[Eq. (5), Gelfand & Fomin]

Does this mean that, when the variation of $J$ is evaluated on the optimal curve,

$(\delta J)^* = (dJ)^*$?

Moreover, in the above derivation, $(dJ)^*$ is evaluated at the optimal coordinates $x_0$, $x_1$ and on the optimal curve $y^*(x)$. But Eq. (5) of Fomin seems more general: it can be used for a curve that is not optimal. How can I derive $dJ$ evaluated on an arbitrary curve $y(x)$?

1 Answer
  1. We have a function of three variables — one function $y$ and two real values $x_0,x_1$: $$ J(y; x_0, x_1) = \int_{x_0}^{x_1} F(x, y(x), y'(x)) \, dx. $$ When all the variables are varied we have $$ \delta J = \left< \frac{\delta J}{\delta y}, \delta y \right> + \frac{\partial J}{\partial x_0} \, dx_0 + \frac{\partial J}{\partial x_1} \, dx_1 . $$ Here, $$\begin{align} \left< \frac{\delta J}{\delta y}, \delta y \right> &= \int_{x_0}^{x_1} \left( \frac{\partial F}{\partial y(x)} \, \delta y(x) + \frac{\partial F}{\partial y'(x)} \, \delta y'(x) \right) \, dx \\ &= \left[ \frac{\partial F}{\partial y'(x)} \, \delta y(x) \right]_{x_0}^{x_1} + \int_{x_0}^{x_1} \left( \frac{\partial F}{\partial y(x)} \, \delta y(x) - \frac{d}{dx} \left(\frac{\partial F}{\partial y'(x)}\right) \, \delta y(x) \right) \, dx \\ &= \left. \frac{\partial F}{\partial y'(x)} \right|_{x=x_1} \, \delta y_1 - \left. \frac{\partial F}{\partial y'(x)} \right|_{x=x_0} \, \delta y_0 + \int_{x_0}^{x_1} \left( \frac{\partial F}{\partial y(x)} - \frac{d}{dx} \left(\frac{\partial F}{\partial y'(x)}\right) \right) \, \delta y(x) \, dx \\ \frac{\partial J}{\partial x_0} \, dx_0 &= -F(x_0, y(x_0), y'(x_0)) \, dx_0 \\ \frac{\partial J}{\partial x_1} \, dx_1 &= F(x_1, y(x_1), y'(x_1)) \, dx_1 \\ \end{align}$$

  2. $d$ and $\delta$ are essentially different notations for the same thing. The main difference is that $\delta$ is used when a function is varied, while $d$ is used in most other cases; the difference is more pragmatic than semantic.

  3. If you think of $dy$ as a small variation of a scalar value, then you can think of $\delta y$ as a small variation of a function. One might say, very imprecisely, that $(\delta y)(x) = d(y(x)).$ When the variation is done by replacing $y(x)$ with $y(x)+\epsilon h(x),$ one can say that $\delta y=\epsilon h$, i.e. $\delta y(x) = \epsilon h(x).$

  4. See point 2.
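The general variation formula in point 1 can be sanity-checked on a curve that is NOT an extremal, which also speaks to the last question above. The curve, integrand, variation, and endpoint velocities below are made up for illustration:

```python
# Sketch: check the decomposition
#   dJ = <dJ/dy, delta y> + F(x1) dx1 - F(x0) dx0
# on y(x) = x^2 with F = y'^2 on [0, 1], variation h(x) = x + 1
# (nonzero at the endpoints), and endpoints moving as x0 + eps*p, x1 + eps*q.
import sympy as sp

x, eps = sp.symbols('x epsilon')
y = x**2                          # not a solution of the Euler-Lagrange equation
h = x + 1
p, q = 2, 3                       # endpoint velocities
x0, x1 = 0, 1

# Direct derivative of the perturbed functional J(eps).
ye = y + eps*h
J = sp.integrate(sp.diff(ye, x)**2, (x, x0 + eps*p, x1 + eps*q))
direct = sp.diff(J, eps).subs(eps, 0)

# The answer's decomposition. Here F_y = 0 (F depends only on y'),
# so <dJ/dy, delta y> reduces to the integral of F_{y'} * h'.
F0 = sp.diff(y, x)**2
formula = (F0.subs(x, x1)*q - F0.subs(x, x0)*p
           + sp.integrate(2*sp.diff(y, x)*sp.diff(h, x), (x, x0, x1)))

assert sp.simplify(direct - formula) == 0   # both equal 14 here
print(direct)
```

The agreement holds even though $y(x) = x^2$ is not optimal, because the decomposition in point 1 is an identity for any admissible curve; only the final step of setting the variation to zero requires optimality.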