Approximate a density function from sampled data


Let

  • $(E,\mathcal E,\mu)$ be a measure space
  • $E_0\in\mathcal E$ with $\mu(E_0)\in(0,\infty)$ and $\mathcal E_0\subseteq\left.\mathcal E\right|_{E_0}$ be finite and disjoint with $$E_0=\biguplus\mathcal E_0$$
  • $f:E\to[0,\infty)$ be $\mathcal E$-measurable with $$c:=\int f\:{\rm d}\mu\in(0,\infty)$$ and $$f_0:=\frac1{\mu(E_0)}\int_{E_0}f\:{\rm d}\mu$$
  • $\nu:=c^{-1}f\mu$
  • $\delta$ denote the Dirac kernel on $(E,\mathcal E)$ and $\delta_x:=\delta(x,\;\cdot\;)$ for $x\in E$

Let $n\in\mathbb N$ and $x_1,\ldots,x_n\in E$ be mutually independent samples drawn from $\nu$. I would like to approximate $\left.f\right|_{E_0}$ using these samples in a way similar to the one described here: How can we approximate a function by sampling a distribution proportional to it and making a histogram of samples?

The assumption is that we are not able to compute $c$, but that we are able to compute the average $$f_0:=\frac1{\mu(E_0)}\int_{E_0}f\:{\rm d}\mu$$ of $f$ over $E_0$ with respect to $\mu$.

Moreover, I guess it is assumed that $\left.f\right|_B\approx f_B$ for some $f_B\ge0$ for all $B\in\mathcal E_0$. Since $\int_Bf\:{\rm d}\mu=c\nu(B)$ by definition of $\nu$, we then have $$\int_Bf\:{\rm d}\mu=\mu(B)f_B\Leftrightarrow f_B=c\frac{\nu(B)}{\mu(B)}\;\;\;\text{for all }B\in\mathcal E_0.\tag1$$

Letting $X_1,\ldots,X_n$ be independent identically distributed random variables on a common probability space with $X_1\sim\nu$ and $\zeta_n:=\sum_{i=1}^n\delta_{X_i}$, we know that $$\frac1n\zeta_n(B)\xrightarrow{n\to\infty}\nu(B)\;\;\;\text{almost surely for all }B\in\mathcal E\tag3$$ by the strong law of large numbers. In the sense of $(3)$, $$\nu(B)\approx\frac1n\zeta_n(B)\;\;\;\text{for all }B\in\mathcal E.\tag4$$

Now, since $\nu(E_0)=c^{-1}\mu(E_0)f_0$, we may write $$c=\frac{\mu(E_0)}{\nu(E_0)}f_0\tag5$$ and hence, by $(1)$, $$f_B=\frac{\mu(E_0)}{\nu(E_0)}f_0\frac{\nu(B)}{\mu(B)}\;\;\;\text{for all }B\in\mathcal E_0.\tag6$$

Noting that $$\frac1n\sum_{B\in\mathcal E_0}\zeta_n(B)=\frac1n\zeta_n(E_0)\approx\nu(E_0),\tag7$$ and substituting $(4)$ and $(7)$ into $(6)$, we may approximate $f(x)\approx\sum_{B\in\mathcal E_0}1_B(x)f_B$ by $$\tilde f(x):=\frac{\mu(E_0)f_0}{\sum_{B\in\mathcal E_0}\zeta_n(B)}\sum_{B\in\mathcal E_0}1_B(x)\frac{\zeta_n(B)}{\mu(B)}\tag8$$ for all $x\in E_0$.
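To make $(8)$ concrete, here is a minimal numerical sketch. Everything specific in it is an assumption chosen for illustration and is not part of the setting above: $E=\mathbb R$ with Lebesgue measure $\mu$, $f(x)=e^{-x^2/2}$ (so $\nu$ is the standard normal law), $E_0=[-1,1]$, and a hand-picked partition into bins of unequal length. The names `f_tilde`, `edges` and `bin_average` are hypothetical.

```python
import math
import numpy as np

# Assumed setting: E = R, mu = Lebesgue measure, f(x) = exp(-x^2/2),
# hence nu = c^{-1} f mu is the standard normal distribution.
edges = np.array([-1.0, -0.5, -0.25, 0.0, 0.25, 0.5, 1.0])  # partition of E_0 = [-1, 1]
mu_B = np.diff(edges)             # mu(B) for every bin B
mu_E0 = edges[-1] - edges[0]      # mu(E_0)

# f_0 = average of f over E_0, known here in closed form
f0 = math.sqrt(2 * math.pi) * math.erf(1 / math.sqrt(2)) / mu_E0

# Independent samples from nu; samples outside E_0 are simply not counted
rng = np.random.default_rng(0)
samples = rng.standard_normal(200_000)
counts, _ = np.histogram(samples, bins=edges)   # zeta_n(B) for every bin B


def f_tilde(counts, mu_B, mu_E0, f0):
    """Per-bin values of the estimator in (8)."""
    counts = np.asarray(counts, dtype=float)
    return mu_E0 * f0 / counts.sum() * counts / mu_B


def bin_average(a, b):
    """Exact average of f over [a, b], used only for comparison."""
    Phi = lambda t: 0.5 * (1.0 + math.erf(t / math.sqrt(2)))
    return math.sqrt(2 * math.pi) * (Phi(b) - Phi(a)) / (b - a)


estimate = f_tilde(counts, mu_B, mu_E0, f0)
exact = np.array([bin_average(a, b) for a, b in zip(edges[:-1], edges[1:])])
```

Under these assumptions `estimate` should be close to `exact`, the true bin averages $c\,\nu(B)/\mu(B)$ from $(1)$, without $c$ ever being used.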

However, in the link above they suggest computing the average number $h$ of samples per element (a "bin", so to speak) of $\mathcal E_0$, i.e. $$h=\frac1{|\mathcal E_0|}\sum_{B\in\mathcal E_0}\zeta_n(B)\tag9$$ and approximating $f(x)$ by $$\hat f(x):=\frac{f_0}h\sum_{B\in\mathcal E_0}1_B(x)\zeta_n(B)\tag{10}$$ for $x\in E_0$. Could anyone make sense of $(10)$ for me? I don't understand why this is a sensible approximation (why take the average $h$?).
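One observation that may help when comparing the two formulas: if all bins have the same measure, $\mu(B)=\mu(E_0)/|\mathcal E_0|$, then substituting this into $(8)$ gives exactly $(10)$. The following sketch checks this numerically. The bin counts, the bin measures and the value of $f_0$ are made-up numbers for illustration, and the function names are hypothetical.

```python
import numpy as np


def f_tilde(counts, mu_B, mu_E0, f0):
    """Per-bin values of (8)."""
    counts = np.asarray(counts, dtype=float)
    mu_B = np.asarray(mu_B, dtype=float)
    return mu_E0 * f0 / counts.sum() * counts / mu_B


def f_hat(counts, f0):
    """Per-bin values of (10), with h the average count per bin as in (9)."""
    counts = np.asarray(counts, dtype=float)
    h = counts.sum() / counts.size
    return f0 / h * counts


counts = [12, 30, 41, 17]   # hypothetical zeta_n(B)
f0 = 0.8                    # hypothetical known average of f over E_0
mu_E0 = 2.0                 # hypothetical mu(E_0)

hat = f_hat(counts, f0)
tilde_equal = f_tilde(counts, [0.5, 0.5, 0.5, 0.5], mu_E0, f0)      # equal-measure bins
tilde_unequal = f_tilde(counts, [0.25, 0.25, 0.5, 1.0], mu_E0, f0)  # unequal bins
```

With equal-measure bins `hat` and `tilde_equal` agree. With unequal bins they differ, because $(10)$ does not divide the counts by $\mu(B)$.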

We may note that $\zeta_n(B)$ is the (random) number of samples lying in $B\in\mathcal E$ and $\sum_{B\in\mathcal E_0}\zeta_n(B)$ is the number of samples lying in $E_0$.

EDIT: The crucial thing is surely the following: The histogram given by $$h(x):=\sum_{B\in\mathcal E_0}1_B(x)\zeta_n(B)\;\;\;\text{for }x\in E_0$$ is an approximation of the "shape" of $\left.f\right|_{E_0}$. So, the only thing left is to "scale" it the right way ...
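One possible reading of "scaling the right way", offered only as a sketch: choose the scale factor so that the $\mu$-average of the scaled histogram over $E_0$ equals the known value $f_0$. In the example below the counts are first divided by $\mu(B)$ (an assumption needed when the bins have unequal measure), and all numbers are made up for illustration.

```python
import numpy as np

counts = np.array([12.0, 30.0, 41.0, 17.0])   # hypothetical zeta_n(B)
mu_B = np.array([0.25, 0.25, 0.5, 1.0])       # hypothetical mu(B)
mu_E0 = mu_B.sum()                            # mu(E_0), since the bins partition E_0
f0 = 0.8                                      # hypothetical known average of f over E_0

# "Shape": samples per unit of mu-measure in each bin
shape = counts / mu_B

# Choose the scale so that the mu-average of scale * shape over E_0 equals f0
scale = f0 * mu_E0 / np.sum(mu_B * shape)
scaled = scale * shape

# mu-average of the scaled histogram over E_0
average = np.sum(mu_B * scaled) / mu_E0
```

Here `scaled` reproduces the per-bin values of $(8)$, and `average` equals `f0` up to rounding, since $\sum_{B\in\mathcal E_0}\mu(B)\,\zeta_n(B)/\mu(B)=\sum_{B\in\mathcal E_0}\zeta_n(B)$.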