Expectation of a Kernel Density Estimate with and without prior knowledge about Kernel

Question

262 Views Asked by Bumbble Comm At 07 Apr 2026 - 6:58

Let $X_1, \dots, X_n$ be i.i.d. random variables from an unknown density $f$. The kernel density estimate at $x$ is then defined as \begin{equation} \hat{f}(x;h) := \frac{1}{nh} \sum_{i=1}^n K\left(\frac{x-X_i}{h}\right) \tag{1}\label{a} \end{equation} where $K$ is the kernel function, a symmetric pdf centered around 0 with finite variance.

With the Gaussian Kernel $$ K_G(x) := \frac{1}{\sqrt{2\pi}} \exp \left( -\frac{1}{2} x^2 \right) $$ the equation (\ref{a}) can be rewritten as $$ \hat{f}(x;h) = \frac{1}{n} \sum_{i=1}^n \mathcal{N}(x; X_i, h^2).\tag{2}\label{2} $$
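The rewriting of Eq. (1) as Eq. (2) can be checked numerically. The following is only a minimal sketch: the sample values, the evaluation point `x`, and the bandwidth `h` are hypothetical and were chosen purely for illustration, and the function names are not taken from any library.

```python
import math

def gaussian_kernel(u):
    """Standard normal density K_G(u)."""
    return math.exp(-0.5 * u * u) / math.sqrt(2.0 * math.pi)

def normal_pdf(x, mean, var):
    """Density of N(mean, var) evaluated at x."""
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)

def kde_eq1(x, sample, h):
    """Eq. (1): (1 / (n h)) * sum_i K_G((x - X_i) / h)."""
    n = len(sample)
    return sum(gaussian_kernel((x - xi) / h) for xi in sample) / (n * h)

def kde_eq2(x, sample, h):
    """Eq. (2): average of N(x; X_i, h^2) over the sample."""
    n = len(sample)
    return sum(normal_pdf(x, xi, h * h) for xi in sample) / n

# Hypothetical data, evaluation point, and bandwidth (assumptions for illustration).
sample = [-1.3, -0.2, 0.4, 0.9, 2.1]
x, h = 0.5, 0.7

# The two forms agree up to floating-point rounding.
difference = abs(kde_eq1(x, sample, h) - kde_eq2(x, sample, h))
print(difference)
```

Note that the factor $\frac{1}{h}$ from Eq. (1) is absorbed into the normalizing constant of $\mathcal{N}(x; X_i, h^2)$, which is why Eq. (2) only carries $\frac{1}{n}$.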

Now to my question. Let's take the expected value of Eq. (\ref{2}): \begin{align} E[\hat{f}(x;h)] &= \frac{1}{n}\sum_{i=1}^n E[\mathcal{N}(x; X_i, h^2)] \\ &= \frac{1}{n}\sum_{i=1}^n X_i \tag{3} \end{align}

Now, let's suppose we wouldn't have known that the Kernel used was the Gaussian kernel. Then the literature starts by taking the expected value for a single (???) random variable $Y$ like this: \begin{align} E[\hat{f}(x;h)] &= E \left[ \frac{1}{h} K\left(\frac{x - Y}{h}\right) \right]\\ &= \int \frac{1}{h} K\left( \frac{x - y}{h} \right) f(y) dy \\ &= \int \frac{1}{h} K(z) f(x - zh) h \text{d}z \\ &= \int K(z) f(x -zh) \text{d}z. \tag{4} \end{align}

My questions are:

1. How do Eq. (3) and Eq. (4) relate?
2. What's the matter with taking the expected value with regard to a single random variable $Y$? Or did I completely misunderstand that?

My second question stems from Wand & Jones, Kernel Smoothing (1995), page 14 (note that $K_h(x) := \frac{1}{h}K(\frac{x}{h})$). You can search for the book online and read it for free up to this point, i.e. page 14.

Thank you so much for your help!

**Bumbble Comm** · Accepted Answer

I think your equation (3) is incorrect. The expectation should be taken with respect to the random variables $X_1,\dots, X_n$, which you assume are drawn i.i.d. from some density $f$. Therefore, the line above Eq. (3) should read

\begin{align*} \frac{1}{n} \sum_{i=1}^n E_{X_i} [\mathcal{N}(x; X_i, h^2)] = E_{X_1} [\mathcal{N}(x; X_1, h^2)] \end{align*} because the $X_i$'s are i.i.d. Note that I am using the subscript to signify which variable the expectation is over and, importantly, lowercase $x$ is FIXED. You seem to be treating the fixed quantity $x$ as a random variable. From here we have

$$ E_{X_1} [\mathcal{N}(x; X_1, h^2)] = \int \mathcal{N}(x; y,h^2) f(y) dy $$ which requires knowledge of the density $f$ to compute. Since $\mathcal{N}(x; y, h^2) = \frac{1}{h} K_G\left(\frac{x-y}{h}\right)$, this is the second line of Eq. (4) with $K = K_G$; the single random variable $Y$ there plays the role of $X_1$, because every term of the sum has the same expectation.
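To make the point about needing $f$ concrete, here is a small Monte Carlo sketch. It assumes, purely for illustration, that the true density $f$ is the standard normal; in that case the integral is a convolution of two Gaussians and equals $\mathcal{N}(x; 0, 1 + h^2)$. The values of `x`, `h`, `n`, the number of repetitions, and the seed are arbitrary choices, not part of the question.

```python
import math
import random

def normal_pdf(x, mean, var):
    """Density of N(mean, var) evaluated at x."""
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)

def kde(x, sample, h):
    """Gaussian-kernel KDE at the fixed point x, written as in Eq. (2)."""
    return sum(normal_pdf(x, xi, h * h) for xi in sample) / len(sample)

def expected_kde_monte_carlo(x, h, n, repeats, rng):
    """Average the KDE at x over many fresh samples X_1, ..., X_n.

    Assumption for this sketch: the X_i are drawn from f = N(0, 1).
    """
    total = 0.0
    for _ in range(repeats):
        sample = [rng.gauss(0.0, 1.0) for _ in range(n)]
        total += kde(x, sample, h)
    return total / repeats

def expected_kde_exact(x, h):
    """Closed form of the integral when f = N(0, 1): N(x; 0, 1 + h^2)."""
    return normal_pdf(x, 0.0, 1.0 + h * h)

rng = random.Random(0)  # fixed seed so the run is reproducible
x, h = 0.5, 0.7         # x stays fixed; only the X_i are random

mc = expected_kde_monte_carlo(x, h, n=50, repeats=4000, rng=rng)
exact = expected_kde_exact(x, h)
true_density = normal_pdf(x, 0.0, 1.0)

print(mc, exact, true_density)
```

Under these assumptions the simulated mean should sit close to the closed form and visibly away from $f(x)$ itself, which reflects the smoothing bias of the estimator for a fixed bandwidth $h$.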
