Calculus of Variations (Pattern Recognition and Machine Learning)

343 Views Asked by Bumbble Comm At 29 Mar 2026 - 3:02

According to Paul Sinclair this answer is incorrect.

Can anyone explain how to use the calculus of variations to show that given

$$E[L]=\int \int (y(\textbf{x})-t)^2 p(\textbf{x},t) d\textbf{x} dt$$

we have

$$ \frac{\delta E[L]}{\delta y(\textbf{x})} = 2\int(y(\textbf{x})-t)p(\textbf{x},t)dt$$


There is 1 best solution below

Bumbble Comm On 11 Oct 2020 - 12:37 BEST ANSWER

Let $F(y) = \iint (y(x)-t)^2p(x,t)\,dxdt$. (I'm dropping the $\vec x$ notation for convenience. Just remember that $x$ is a vector, not a number).

$F$ is not a function from $\Bbb R \to \Bbb R$ that is being composed with $y$. Its definition requires the variable $y$ to represent a function, not a number. Instead, $F$ is an operator on functions (a functional). So we cannot just blithely talk about "$\frac{\partial F}{\partial y}$". What would that even mean for operators like $F$? Note that the book does not use $d$ or $\partial$. Instead it uses $\delta$. In the calculus of variations, this indicates what is elsewhere called a "directional derivative".

$\frac{\delta F}{\delta y}$ when taken at a specific function $y$ is not a single number, but instead a linear operator indicating how $F$ changes when one leaves $y$ in the direction of various other functions. If $h(x)$ is an arbitrary function, then

$$\frac{\delta F}{\delta y}(h) := \lim_{\epsilon \to 0} \dfrac{F(y + \epsilon h) - F(y)}\epsilon$$

It will give different values depending on which $h$ is picked.

Now $$\begin{align}F(y + \epsilon h) &= \iint (y(x)+\epsilon h(x)-t)^2p(x,t)\,dxdt\\&= \iint \left((y - t)^2 + 2\epsilon(y-t)h + \epsilon^2h^2\right)p(x,t)\,dxdt\end{align}$$

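Subtracting $F(y)$, dividing by $\epsilon$, and letting $\epsilon \to 0$ leaves only the term that is linear in $\epsilon$, which means $$\frac{\delta F}{\delta y}(h) = 2\iint (y(x)-t)h(x)p(x,t)\,dxdt$$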
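To connect this with the form asked about in the question, here is a sketch of one way to read the result, assuming the order of integration may be swapped and using the usual convention that the functional derivative at the point $x$ is the factor multiplying the perturbation $h(x)$ under the $x$-integral:

$$\frac{\delta F}{\delta y}(h) = \int h(x)\left[\,2\int (y(x)-t)\,p(x,t)\,dt\right]dx$$

Under that convention, the bracketed factor is what the question writes as $\frac{\delta E[L]}{\delta y(\textbf{x})} = 2\int (y(\textbf{x})-t)\,p(\textbf{x},t)\,dt$.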

The desired function $y$ is the one for which $\frac{\delta F}{\delta y}(h)$ is $0$ for all $h$, which requires that $$\iint (y(x)-t)h(x)p(x,t)\,dxdt = 0$$

But this must hold for all $h(x)$, which includes* $h(x) = \delta(x - x_0)$, the Dirac delta function about $x_0$. Now $$\int (y(x)-t)\delta(x - x_0)p(x,t)\,dx = (y(x_0)-t)p(x_0,t)$$ And therefore, for the stationary $y$, $$\int (y(x_0) - t)p(x_0,t)\,dt = 0$$ for every point $x_0$. Drop the $_0$ subscript, which I introduced for clarity, and you have the result.

* $\delta(x)$ is not a true function, but there are sequences of actual functions $\delta_n(x)$ such that $\lim_{n\to\infty} \int \delta_n(x)f(x)\,dx = f(0)$, which is sufficient.
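The derivative formula and the stationarity condition above can be sanity-checked numerically. The following is a minimal sketch, not part of the original answer. It assumes a scalar $x$, a made-up joint density $p(x,t)$ on a finite grid, and arbitrary choices of $y$ and $h$; the names `F`, `directional_derivative` and `y_star` are purely illustrative. It compares the finite-difference quotient from the definition with the closed form, then solves $\int (y(x)-t)p(x,t)\,dt = 0$ for $y$ on the grid and checks that the directional derivative vanishes there.

```python
import numpy as np

# Grid over a scalar x and t (the answer allows vector x; 1-D is an
# assumption made here to keep the sketch short).
x = np.linspace(-3.0, 3.0, 201)
t = np.linspace(-4.0, 4.0, 241)
dx = x[1] - x[0]
dt = t[1] - t[0]
X, T = np.meshgrid(x, t, indexing="ij")

# Hypothetical joint density, chosen only for illustration:
# x ~ N(0, 1) and t | x ~ N(sin x, 0.5^2), renormalised on the grid.
p = np.exp(-0.5 * X**2) * np.exp(-0.5 * ((T - np.sin(X)) / 0.5) ** 2)
p /= p.sum() * dx * dt


def F(y):
    """Riemann-sum approximation of F(y) = iint (y(x) - t)^2 p(x, t) dx dt."""
    return np.sum((y[:, None] - T) ** 2 * p) * dx * dt


def directional_derivative(y, h):
    """Closed form from the answer: 2 iint (y(x) - t) h(x) p(x, t) dx dt."""
    return 2.0 * np.sum((y[:, None] - T) * h[:, None] * p) * dx * dt


y = 0.3 * x          # arbitrary function at which to differentiate
h = np.cos(2.0 * x)  # arbitrary direction

# Finite-difference quotient from the limit definition, small fixed epsilon.
eps = 1e-6
finite_diff = (F(y + eps * h) - F(y)) / eps
closed_form = directional_derivative(y, h)
print("finite difference:", finite_diff)
print("closed form:      ", closed_form)

# Solve sum_t (y(x) - t) p(x, t) = 0 for y(x) at each grid point x.
y_star = (T * p).sum(axis=1) / p.sum(axis=1)
print("derivative at y_star:", directional_derivative(y_star, h))
```

The two printed derivative values should agree up to a term of order `eps` (the $\epsilon^2 h^2$ term in the expansion), and the derivative at `y_star` should be zero up to floating-point rounding for any choice of `h`.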
