I don't understand a part of the appendix (pp. 704-705) on the calculus of variations (which I was unfamiliar with before reading about it there) in the book "Pattern Recognition and Machine Learning" by Bishop. Since I'm not sure how much of the preceding material in the appendix is needed to understand the part I'm stuck on, I've included all the equations that I think might be of interest, corrected according to the errata. I've highlighted the part I don't understand in bold.
Note:
$\eta(x)$ is an arbitrary function of $x$.
$\epsilon \eta(x)$ is a small change to the function $y(x)$.
We denote the functional derivative of $F[y]$ with respect to $y(x)$ by $\delta F/\delta y(x)$, and define it by the following relation:
$$F[y(x) + \epsilon \eta(x) ] = F[y(x)] + \epsilon \int \dfrac{\delta F}{\delta y(x)} \eta(x) dx + O(\epsilon^2). \quad (D.3)$$ Requiring that the functional be stationary with respect to small variations in the function $y(x)$ gives
$$\int \dfrac{\delta F}{\delta y(x)}\eta(x)dx = 0. \quad (D.4)$$ Because this must hold for an arbitrary choice of $\eta(x)$, it follows that the functional derivative must vanish. To see this, imagine choosing a perturbation $\eta(x)$ that is zero everywhere except in the neighbourhood of a point $\hat{x}$, in which case the functional derivative must be zero at $x=\hat{x}$. However, because this must be true for every choice of $\hat{x}$, the functional derivative must vanish for all values of $x$.
Consider a functional that is defined by an integral over a function $G(y, y', x)$ that depends on both $y(x)$ and its derivative $y'(x)$ as well as having a direct dependence on $x$ $$F[y] = \int G(y(x), y'(x), x) dx \quad (D.5)$$ where the value of $y(x)$ is assumed to be fixed at the boundary of the region of integration (which might be at infinity). If we now consider variations in the function $y(x)$, we obtain
$$F[y(x) + \epsilon \eta(x)] = F[y(x)] + \epsilon \int \left\{ \dfrac{\partial G}{\partial y} \eta(x) + \dfrac{\partial G}{\partial y'}\eta'(x) \right\} dx + O(\epsilon^2). \quad (D.6)$$ We now have to cast this in the form $(D.3)$. To do so, we integrate the second term by parts and make use of the fact that $\eta(x)$ must vanish at the boundary of the integral (because $y(x)$ is fixed at the boundary). This gives $$F[y(x) + \epsilon \eta (x)] = F[y(x)] + \epsilon \int \left\{ \dfrac{\partial G}{\partial y} - \dfrac{d}{dx}\left( \dfrac{\partial G}{\partial y'} \right) \right\}\eta(x) dx + O(\epsilon^2) \quad (D.7) $$
So from what I understand the reasoning goes like this:
Integrating $\dfrac{\partial G}{\partial y'}\eta'(x)$ by parts gives:
$$\int \left( \dfrac{\partial G}{\partial y'} \right) \eta'(x) dx = \left( \dfrac{\partial G}{\partial y'} \right) \eta(x) - \int \left(\dfrac{d}{dx} \left ( \dfrac{\partial G}{\partial y'} \right) \right)\eta(x) dx$$
By comparing $(D.7)$ with $(D.6)$, I gather that
$$\left( \dfrac{\partial G}{\partial y'} \right) \eta(x) = 0 \quad (i)$$
and that this has something to do with $\eta(x)$ "vanishing at the boundary of the integral" due to $y(x)$ "being fixed at the boundary". Even though no explicit limits are given for the integral, I guess the author implicitly assumes arbitrary limits. Is my understanding correct, and if so, can someone elaborate on why the term $(i)$ is equal to $0$?
All your integrals are evaluated over some interval, say $[a,b]$. When you perturb your function $y$ by $\varepsilon\eta(x)$, you assume that $\eta(a)=\eta(b)=0$ because you want to preserve the values of $y(a)$ and $y(b)$ (this is a particular type of variation). When you integrate by parts, the non-integral term is not $\frac{\partial G}{\partial y'}\eta(x)$ at a general point $x$ but that expression evaluated between the limits $a$ and $b$, namely $\left[\frac{\partial G}{\partial y'}\eta(x)\right]_a^b$, and it vanishes because $\eta(a)=\eta(b)=0$. So your equation $(i)$ holds at the endpoints only, not for all $x$. Comparing (D.7) with (D.3) then identifies the functional derivative, and the stationarity condition (D.4) gives the identity $$ \int\limits_a^b\left(\frac{\partial G}{\partial y}-\frac{d}{dx}\left(\frac{\partial G}{\partial y'}\right)\right)\eta(x)\,dx=0 $$ which must hold for every $\eta$ vanishing at $a$ and $b$. This is the key. From this you conclude that $$ \frac{\partial G}{\partial y}-\frac{d}{dx}\left(\frac{\partial G}{\partial y'}\right)=0 $$ which is the ODE that your extremal must satisfy.
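If it helps to see the boundary term vanish concretely, here is a small SymPy sketch. Everything specific in it is my own assumption for illustration and not from the book: the interval $[0,1]$, the trial function $y(x)=\sin x$, the perturbation $\eta(x)=x(1-x)$ (chosen so that $\eta(0)=\eta(1)=0$), and the integrand $G = y'^2/2 + x\,y^2$. It checks that the boundary term is zero and that the first-order terms of (D.6) and (D.7) agree.

```python
import sympy as sp

x = sp.symbols("x")
a, b = 0, 1  # assumed integration limits (illustrative only)

# Hypothetical example choices, not taken from the book:
y = sp.sin(x)        # a trial function y(x)
eta = x * (1 - x)    # perturbation with eta(a) = eta(b) = 0

# Example integrand G(y, y', x) = y'^2/2 + x*y^2, written with
# placeholder symbols Y and Yp standing for y and y'.
Y, Yp = sp.symbols("Y Yp")
G = Yp**2 / 2 + x * Y**2

# Partial derivatives of G, evaluated along the trial function.
on_curve = {Y: y, Yp: sp.diff(y, x)}
dG_dy = sp.diff(G, Y).subs(on_curve)
dG_dyp = sp.diff(G, Yp).subs(on_curve)

# Boundary term [dG/dy' * eta]_a^b from the integration by parts.
boundary = (dG_dyp * eta).subs(x, b) - (dG_dyp * eta).subs(x, a)

# Integration by parts of the second term in (D.6).
lhs = sp.integrate(dG_dyp * sp.diff(eta, x), (x, a, b))
rhs = boundary - sp.integrate(sp.diff(dG_dyp, x) * eta, (x, a, b))

# First-order coefficients as written in (D.6) and in (D.7).
first_order_D6 = sp.integrate(dG_dy * eta + dG_dyp * sp.diff(eta, x), (x, a, b))
first_order_D7 = sp.integrate((dG_dy - sp.diff(dG_dyp, x)) * eta, (x, a, b))

print("boundary term:", boundary)
print("by-parts difference:", sp.N(lhs - rhs))
print("(D.6) minus (D.7):", sp.N(first_order_D6 - first_order_D7))
```

Note that $\frac{\partial G}{\partial y'}\eta(x)$ is not zero inside the interval here; only its values at the two endpoints enter, and those are zero because of how $\eta$ was chosen. Swapping in an $\eta$ that does not vanish at the endpoints should make the boundary term, and hence the gap between the (D.6) and (D.7) forms, nonzero.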