Notational confusion in derivation of Euler-Lagrange equations


I'm reading the section on the calculus of variations (Appendix D) in Bishop's "Pattern Recognition and Machine Learning", where he defines the functional derivative $\frac{\delta F}{\delta y(x)}$ by:

$$ F[y(x) + \epsilon \eta(x)] = F[y(x)] + \epsilon \int\frac{\delta F}{\delta y(x)}\eta(x)dx + O(\epsilon^2) $$

Then, for $F[y] = \int G(y(x), y'(x), x)dx$ we get $$ F[y(x) + \epsilon \eta(x)] = F[y(x)] + \epsilon \int \left[ \frac{\partial G}{\partial y} \eta(x) + \frac{\partial G}{\partial y'} \eta'(x)\right ]dx + O(\epsilon^2) $$

How does one arrive at this expression? Most derivations of the Euler-Lagrange equation I've seen use the total derivative $\frac{dF}{d\epsilon}$. I'm unsure of the connection between these two notations.


There are 1 best solutions below


A multivariable function $z=f(x,y)$ is differentiable if

$\Delta z = \frac{\partial f}{\partial x}\Delta x + \frac{\partial f}{\partial y} \Delta y + \epsilon_1 \Delta x + \epsilon_2 \Delta y \quad \quad \text{Equation 1}$

where $\epsilon_1 \rightarrow 0$, $\epsilon_2 \rightarrow 0$ as $\Delta x \rightarrow 0$ and $\Delta y \rightarrow 0$.

To understand the definition of differentiability for multivariable functions, it helps to keep in mind the equation of the tangent plane to the surface $z=f(x,y)$:

$\Delta z = \frac{\partial f}{\partial x}\Delta x + \frac{\partial f}{\partial y} \Delta y$

For small $\Delta x$ and $\Delta y$, $\epsilon_1$ and $\epsilon_2$ are small, so the function is well approximated by its tangent plane.
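As a quick numerical illustration of Equation 1, the sketch below measures how far the true change $\Delta z$ is from the tangent-plane prediction. The function $f(x,y)=\sin(x)e^{y}$ and the base point are arbitrary choices made for this example only. With $\Delta x = \Delta y = h$, the remainder divided by $h$ equals $\epsilon_1+\epsilon_2$, so it should shrink as $h$ shrinks.

```python
import math

# Hypothetical example function and its exact partial derivatives.
def f(x, y):
    return math.sin(x) * math.exp(y)

def f_x(x, y):
    return math.cos(x) * math.exp(y)

def f_y(x, y):
    return math.sin(x) * math.exp(y)

def tangent_plane_remainder(x0, y0, dx, dy):
    """Actual change in f minus the change predicted by the tangent plane."""
    delta_z = f(x0 + dx, y0 + dy) - f(x0, y0)
    linear_part = f_x(x0, y0) * dx + f_y(x0, y0) * dy
    return delta_z - linear_part

x0, y0 = 0.7, 0.3  # arbitrary base point
ratios = []
for h in (1e-1, 1e-2, 1e-3):
    r = tangent_plane_remainder(x0, y0, h, h)
    # With dx = dy = h the remainder is (eps_1 + eps_2) * h.
    ratios.append(abs(r) / h)
    print(f"h = {h:g}   remainder = {r:.3e}   remainder / h = {abs(r) / h:.3e}")
```

The printed ratio `remainder / h` decreasing toward zero is exactly the statement that $\epsilon_1, \epsilon_2 \rightarrow 0$.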

If you use the definition of the functional given by D.5:

$F[y(x)]=\int G(y(x),y'(x),x) dx$

and replace $y(x)$ by $y(x)+\epsilon \eta (x)$ you get:

$F[y(x)+\epsilon \eta(x)]=\int G(y(x)+\epsilon \eta(x), y'(x)+\epsilon \eta'(x), x) dx$

$=F[y(x)]+\int \left[ G(y(x)+\epsilon \eta(x), y'(x)+\epsilon \eta'(x), x) - G(y(x),y'(x),x) \right] dx \quad \quad \text{Equation 2}$

If you apply the definition of differentiability to $G$, which is a function of three variables, you get:

$\Delta G=\frac{\partial G}{\partial y} \Delta y+\frac{\partial G}{\partial y'} \Delta y' + \frac{\partial G}{\partial x}\Delta x + \epsilon_1 \Delta y+\epsilon_2 \Delta y' + \epsilon_3 \Delta x \quad \quad \text{Equation 3}$

Now consider the change in $G$ inside the integral in Equation 2.

This corresponds to a change in $G$ with $\Delta y=\epsilon \eta (x)$, $\Delta y' = \epsilon \eta'(x)$ and $\Delta x = 0$.

Therefore using equation 3 we have:

$\Delta G = \frac{\partial G}{\partial y} \epsilon \eta(x)+\frac{\partial G}{\partial y'} \epsilon \eta' (x) + \epsilon_1 \epsilon \eta (x) + \epsilon_2 \epsilon \eta' (x)$
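The pointwise expansion of $\Delta G$ can be checked numerically at a single value of $x$. This is only a sketch: the integrand `G`, the curve $y=\sin x$ and the perturbation $\eta(x)=x(1-x)$ are hypothetical choices, not taken from Bishop. If the leftover term $\epsilon_1 \epsilon \eta + \epsilon_2 \epsilon \eta'$ is second order in $\epsilon$, then dividing it by $\epsilon^2$ should give a roughly constant number.

```python
import math

# Hypothetical smooth integrand G(y, y', x) and its partial derivatives.
def G(y, yp, x):
    return math.exp(y) * yp**2 + x * y

def G_y(y, yp, x):
    return math.exp(y) * yp**2 + x

def G_yp(y, yp, x):
    return 2.0 * math.exp(y) * yp

x = 0.4                                  # fixed point, so Delta x = 0
y0, yp0 = math.sin(x), math.cos(x)       # assumed y(x) = sin(x)
eta, deta = x * (1.0 - x), 1.0 - 2.0 * x  # assumed eta(x) = x(1 - x)

pointwise = []
for eps in (1e-1, 1e-2, 1e-3):
    delta_G = G(y0 + eps * eta, yp0 + eps * deta, x) - G(y0, yp0, x)
    linear_part = eps * (G_y(y0, yp0, x) * eta + G_yp(y0, yp0, x) * deta)
    # What is left is eps_1 * eps * eta + eps_2 * eps * eta'.
    pointwise.append((delta_G - linear_part) / eps**2)
    print(f"eps = {eps:g}   (Delta G - linear part) / eps^2 = {pointwise[-1]:.6f}")
```

A ratio that settles to a finite value indicates that, for this smooth $G$, $\epsilon_1$ and $\epsilon_2$ are themselves proportional to $\epsilon$.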

and substituting this back into Equation 2 we get:

$F[y(x)+\epsilon \eta(x)]=F[y(x)]+\int \left[ \frac{\partial G}{\partial y} \epsilon \eta(x)+\frac{\partial G}{\partial y'} \epsilon \eta' (x) \right] dx + \epsilon \int \left[ \epsilon_1 \eta (x) + \epsilon_2 \eta' (x) \right] dx $

(In general $\epsilon_1$ and $\epsilon_2$ depend on $x$, so they stay inside the integral.)

$=F[y(x)]+\epsilon \int \left[ \frac{\partial G}{\partial y} \eta(x)+\frac{\partial G}{\partial y'} \eta' (x) \right] dx + O(\epsilon_1 \epsilon) + O(\epsilon_2 \epsilon)$

and if $\epsilon_1$ and $\epsilon_2$ are of the same order of magnitude as $\epsilon$ (which holds, for example, when $G$ is twice continuously differentiable and $\eta$, $\eta'$ are bounded), then you get D.6:

$F[y(x)+\epsilon \eta(x)]=F[y(x)]+\epsilon \int \left[ \frac{\partial G}{\partial y} \eta(x)+\frac{\partial G}{\partial y'} \eta' (x) \right] dx + O( \epsilon^2)$
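The result can also be tested numerically, and the same test shows the link to the total derivative mentioned in the question: differentiating $F[y+\epsilon\eta]$ with respect to $\epsilon$ and setting $\epsilon=0$ leaves exactly the integral multiplying $\epsilon$. The sketch below is only an illustration under assumed inputs: $G = y^3 + (y')^2$, $y=\sin x$, $\eta = x(1-x)$ on $[0,1]$, and a trapezoid rule with an arbitrary grid size. None of these come from Bishop.

```python
import math

N = 2000                       # trapezoid intervals on [0, 1] (arbitrary choice)
H = 1.0 / N
XS = [i * H for i in range(N + 1)]

def integrate(values):
    """Composite trapezoid rule for samples taken on the grid XS."""
    return H * (sum(values) - 0.5 * (values[0] + values[-1]))

# Hypothetical test case: y(x) = sin(x), eta(x) = x(1 - x).
y, dy = math.sin, math.cos

def eta(x):
    return x * (1.0 - x)

def deta(x):
    return 1.0 - 2.0 * x

# Hypothetical integrand G(y, y', x) = y^3 + y'^2 and its partial derivatives.
def G(u, up, x):
    return u**3 + up**2

def G_y(u, up, x):
    return 3.0 * u**2

def G_yp(u, up, x):
    return 2.0 * up

def F(eps):
    """F[y + eps * eta], evaluated by quadrature."""
    return integrate([G(y(x) + eps * eta(x), dy(x) + eps * deta(x), x) for x in XS])

# The integral that multiplies eps in the expansion.
first_variation = integrate(
    [G_y(y(x), dy(x), x) * eta(x) + G_yp(y(x), dy(x), x) * deta(x) for x in XS]
)

# Check 1: the remainder divided by eps^2 should stay roughly constant.
scaled = []
for eps in (1e-1, 1e-2, 1e-3):
    remainder = F(eps) - F(0.0) - eps * first_variation
    scaled.append(remainder / eps**2)
    print(f"eps = {eps:g}   remainder / eps^2 = {scaled[-1]:.6f}")

# Check 2: dF/d(eps) at eps = 0, by central difference, matches the same integral.
h = 1e-4
dF_deps = (F(h) - F(-h)) / (2.0 * h)
print(f"dF/d(eps) at 0 = {dF_deps:.8f}   first variation = {first_variation:.8f}")
```

The first check supports the $O(\epsilon^2)$ remainder; the second shows that the derivative with respect to $\epsilon$ and the expansion in $\epsilon$ pick out the same quantity, which is why both notations lead to the Euler-Lagrange equation.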