Expectation of a Kernel Density Estimate with and without prior knowledge about the Kernel


Let $X_1, \dots, X_n$ be i.i.d. random variables from an unknown density $f$. The kernel density estimate at $x$ is then defined as \begin{equation} \hat{f}(x;h) := \frac{1}{nh} \sum_{i=1}^n K\left(\frac{x-X_i}{h}\right) \tag{1}\label{a} \end{equation} where the kernel function $K$ is a symmetric pdf centered at 0 with finite variance.

With the Gaussian Kernel $$ K_G(x) := \frac{1}{\sqrt{2\pi}} \exp \left( -\frac{1}{2} x^2 \right) $$ the equation (\ref{a}) can be rewritten as $$ \hat{f}(x;h) = \frac{1}{n} \sum_{i=1}^n \mathcal{N}(x; X_i, h^2).\tag{2}\label{2} $$
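The equivalence of Eq. (1) and Eq. (2) for the Gaussian kernel can be checked numerically. The following is a minimal sketch using only the Python standard library; the function names, the sample, the seed, and the values of `x0` and `h` are illustrative assumptions, not part of the original question.

```python
import math
import random

def gaussian_kernel(u):
    """Standard normal pdf K_G(u)."""
    return math.exp(-0.5 * u * u) / math.sqrt(2.0 * math.pi)

def normal_pdf(x, mean, var):
    """Density of N(mean, var) evaluated at x."""
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)

def kde_kernel_form(x, sample, h):
    """Eq. (1): (1 / (n h)) * sum_i K_G((x - X_i) / h)."""
    return sum(gaussian_kernel((x - xi) / h) for xi in sample) / (len(sample) * h)

def kde_mixture_form(x, sample, h):
    """Eq. (2): (1 / n) * sum_i N(x; X_i, h^2)."""
    return sum(normal_pdf(x, xi, h * h) for xi in sample) / len(sample)

rng = random.Random(0)                               # fixed seed, arbitrary choice
sample = [rng.gauss(0.0, 1.0) for _ in range(200)]   # illustrative data only
x0, h = 0.3, 0.5                                     # hypothetical evaluation point and bandwidth

# Both forms should agree up to floating-point rounding.
print(kde_kernel_form(x0, sample, h), kde_mixture_form(x0, sample, h))
```

The factor $1/h$ in Eq. (1) is absorbed into the normalising constant of $\mathcal{N}(x; X_i, h^2)$, which is why no separate $1/h$ appears in Eq. (2).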

Now to my question. Let's take the expected value of Eq. (\ref{2}): \begin{align} E[\hat{f}(x;h)] &= \frac{1}{n}\sum_{i=1}^n E[\mathcal{N}(x; X_i, h^2)] \\ &= \frac{1}{n}\sum_{i=1}^n X_i \tag{3} \end{align}

Now suppose we did not know that the kernel used was the Gaussian kernel. The literature then starts by taking the expected value with respect to a single (???) random variable $Y$, like this: \begin{align} E[\hat{f}(x;h)] &= E \left[ \frac{1}{h} K\left(\frac{x - Y}{h}\right) \right]\\ &= \int \frac{1}{h} K\left( \frac{x - y}{h} \right) f(y) \, dy \\ &= \int \frac{1}{h} K(z) f(x - zh) \, h \, dz \\ &= \int K(z) f(x - zh) \, dz. \tag{4} \end{align}

My questions are:

  1. How do Eq. (3) and Eq. (4) relate?
  2. Why is the expected value taken with respect to a single random variable $Y$? Or did I completely misunderstand that?

My second question stems from Wand & Jones, Kernel Smoothing (1995), page 14 (note that $K_h(x) := \frac{1}{h}K(\frac{x}{h})$). You can find the book online and read it for free up to that page.

Thank you so much for your help!

Best answer

I think your equation (3) is incorrect. The expectation should be taken with respect to the random variables $X_1,\dots, X_n$, which you assume are drawn i.i.d. from some density $f$. Therefore, the line above Eq. (3) should read

\begin{align*} \frac{1}{n} \sum_{i=1}^n E_{X_i} [\mathcal{N}(x; X_i, h^2)] = E_{X_1} [\mathcal{N}(x; X_1, h^2)] \end{align*} because the $X_i$ are i.i.d. There is no extra factor $1/h$ here, since $\mathcal{N}(x; X_i, h^2)$ already contains it, as in Eq. (2). Note that I am using the subscript to signify which variable the expectation is taken over and, importantly, that lowercase $x$ is fixed. You seem to be treating the fixed quantity $x$ as a random variable. From here we have

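$$ E_{X_1} [\mathcal{N}(x; X_1, h^2)] = \int \mathcal{N}(x; y, h^2) f(y) \, dy $$ which requires knowledge of the density $f$ to compute. With $K = K_G$ and $Y = X_1$, this is exactly the second line of your Eq. (4). A single random variable $Y$ suffices because all $n$ terms of the sum have the same expectation.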
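To make this concrete, here is a small Monte Carlo sketch under an assumed true density: if $f$ is the standard normal density (an assumption made only for this illustration), the integral above is a convolution of two Gaussians and equals $\mathcal{N}(x; 0, 1 + h^2)$. The function names, seed, sample size, and number of repetitions are all hypothetical choices.

```python
import math
import random

def normal_pdf(x, mean, var):
    """Density of N(mean, var) evaluated at x."""
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)

def kde(x, sample, h):
    """Gaussian KDE at the fixed point x: (1 / n) * sum_i N(x; X_i, h^2)."""
    return sum(normal_pdf(x, xi, h * h) for xi in sample) / len(sample)

def mean_kde_monte_carlo(x, h, n, reps, rng):
    """Average the KDE at x over many independent samples of size n.

    Assumes the true density f is standard normal (illustration only).
    """
    total = 0.0
    for _ in range(reps):
        sample = [rng.gauss(0.0, 1.0) for _ in range(n)]
        total += kde(x, sample, h)
    return total / reps

x0, h = 0.5, 0.4                 # fixed evaluation point and bandwidth
rng = random.Random(1)           # fixed seed, arbitrary choice
mc_estimate = mean_kde_monte_carlo(x0, h, n=50, reps=4000, rng=rng)

exact = normal_pdf(x0, 0.0, 1.0 + h * h)   # integral of N(x; y, h^2) f(y) dy for f = N(0, 1)
true_density = normal_pdf(x0, 0.0, 1.0)    # f(x0) itself, for comparison

print(mc_estimate, exact, true_density)
```

The simulated average should land close to `exact` rather than to `true_density`, which illustrates that the expectation is a smoothed version of $f$ evaluated at the fixed point $x$, not an average of the data points $X_i$.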