I am stuck on the following calculus of variations problem. It is related to the so-called "Nadaraya-Watson" model in statistics. We have $N$ inputs $\{x_n\}$, and each of these inputs is perturbed by a noise variable $\xi$ drawn from a probability distribution $\nu(\xi)$. For each input $x_n$ we have a target variable $t_n$. The following sum-of-squares error is defined:
$$E = \dfrac{1}{2}\sum_{n=1}^{N}\int \left(y(x_n + \xi) - t_n\right)^2 \nu(\xi)\,d\xi \tag{1}$$
The aim is to find the function $y(x)$ which minimizes this error. My approach was to calculate the Gateaux variation $\dfrac{dE[y(x) + \epsilon \eta(x)]}{d \epsilon}|_{\epsilon = 0}$ and to find a $y(x)$ which makes the variation $0$ for all $\eta(x)$.
I calculate the variation as:
$$ \sum_{n=1}^{N}\int \left(y(x_n + \xi) - t_n\right)\nu(\xi)\,\eta(x_n + \xi)\,d\xi \tag{2}$$
which looks correct according to the solution set.
But I am completely clueless how to find a $y(x)$ which makes this variation $0$ for each possible $\eta(x)$!
The answer is supposed to be:
$$y(x)=\sum_{n=1}^{N}t_nh(x-x_n) \tag{3}$$
where $h(x-x_n)$ is defined as:
$$h(x-x_n) = \dfrac{\nu(x - x_n)}{\sum_{m=1}^{N}\nu(x - x_m)} \tag{4}$$
But as far as I can see, this $y(x)$ does NOT even make the variation in $(2)$ zero for an arbitrary $\eta(x)$!
Am I missing something? I need help of any kind...
Your expression (2) is the correct variation: the factor of 2 from differentiating the square cancels the $\frac{1}{2}$ in (1). In any case, a constant factor would be irrelevant for setting the variation to zero.
Since (2) must be zero for all $\eta(x)$, in particular it is zero for $$ \eta(x) = \delta(x-u)$$ for any given $u$. Conversely, since (2) is linear in $\eta$, if we find a $y(x)$ that makes (2) vanish for $ \eta(x) = \delta(x-u)$ for all values of $u$, then it vanishes for every $\eta$, because any $\eta$ can be written as the superposition $\eta(x) = \int \eta(u)\,\delta(x-u)\,du$.
When you plug $\eta(x) = \delta(x-u)$ into (2), the delta function picks out $\xi = u - x_n$ in the $n$-th integral, so you get $$ \sum_n(y(u)-t_n)\,\nu(u-x_n) = 0 \\ \sum_n y(u)\, \nu(u-x_n) = \sum_n t_n\,\nu(u-x_n) $$ I used $u$ to avoid confusion with the variable $x$, but at this point we can replace $u$ by $x$ to get $$ y(x)\sum_n \nu(x-x_n) = \sum_n t_n\,\nu(x-x_n) $$ Dividing by $\sum_n \nu(x-x_n)$ gives exactly (3) with $h$ as in (4), so the purported solution works, and it is not much harder to find the solution without being told what it will be.
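If you want to convince yourself numerically, here is a minimal sketch of a sanity check. It is only an illustration: it assumes a Gaussian $\nu$ with width `sigma = 0.5`, and the data `xs`, `ts` and the function names are made up for the example. It evaluates the pointwise condition $\sum_n (y(u)-t_n)\,\nu(u-x_n)$ and approximates the variation (2) with a trapezoid rule for a test function $\eta$ of your choice; both should come out at round-off level for the $y(x)$ in (3)-(4).

```python
import math

SIGMA = 0.5  # assumed noise width; any positive value should behave the same way


def nu(xi):
    """Gaussian noise density (an assumed choice for nu, not given in the question)."""
    return math.exp(-0.5 * (xi / SIGMA) ** 2) / (SIGMA * math.sqrt(2.0 * math.pi))


# Hypothetical toy data: inputs x_n and targets t_n.
xs = [-1.0, 0.2, 1.5]
ts = [0.5, -1.0, 2.0]


def y(x):
    """Candidate minimizer from (3)-(4): a nu-weighted average of the targets."""
    weights = [nu(x - xn) for xn in xs]
    return sum(t * w for t, w in zip(ts, weights)) / sum(weights)


def pointwise_residual(u):
    """sum_n (y(u) - t_n) * nu(u - x_n), i.e. (2) with eta = delta(x - u)."""
    return sum((y(u) - t) * nu(u - xn) for xn, t in zip(xs, ts))


def variation(eta, lo=-6.0, hi=6.0, steps=4000):
    """Expression (2) for a test function eta, via the trapezoid rule in xi.

    The range [lo, hi] covers 12 standard deviations of nu on each side,
    so the truncated tails are negligible.
    """
    h = (hi - lo) / steps
    total = 0.0
    for k in range(steps + 1):
        xi = lo + k * h
        end_weight = 0.5 if k in (0, steps) else 1.0
        integrand = sum(
            (y(xn + xi) - t) * nu(xi) * eta(xn + xi) for xn, t in zip(xs, ts)
        )
        total += end_weight * h * integrand
    return total


if __name__ == "__main__":
    for u in (-2.0, 0.0, 0.7, 3.0):
        print("pointwise residual at u =", u, ":", pointwise_residual(u))
    print("variation for eta = sin:", variation(math.sin))
```

Note that the integrand of (2) is not zero at each fixed $\xi$, because each term is shifted by a different $x_n$; only the integral vanishes, which may be why plugging (3) into (2) directly looks like it fails.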