expectation value of a random variable over a transformation of uniform distribution

66 Views Asked by At

Consider an array of size $m$ where each element is sampled from a uniform distribution over $[0,1]$, e.g., $\mathbf{p}=(0.24,0.03,0.85)$ for $m=3$. Now we sample from this array $\mathbf{p}$ in the fashion of Bernoulli's, e.g., $(0,0,1)$, that is, each element takes the value 1 with probability given from the respective element of $\mathbf{p}$ and the value 0 otherwise. We repeat $n$-times sampling the 0-1 valued arrays \begin{gather*} (0,0,1)\\ (1,0,1)\\ (0,1,1)\\ (0,0,1)\\ \vdots \end{gather*} Then we construct an array $\mathbf{q}$ counting how many times each 0-1 valued array occurs. For eaxmple, $(0,0,1)$ may occur 10-times and (1,0,1) occurs $25$-times for $n=100$. Then we sort the array $\mathbf{q}$ in the decreasing fashion and sum the first $k$ values, denoting as r. What is the expectation value of $r/n$?

Here is a Colab notebook experimenting on the setting $m=10,n=10^4,k=10$ which gives $r/n=0.304(4)$.

1

There are 1 best solutions below

2
On

Let $X$ represents the outcome of one entry in one $m-$dimensional array.

Then $X|P=p\sim Bernoulli(p)$ where $P\sim \mathcal{U}[0,1]$. Using the total law of probability, $$P(X=1)=\int_{0}^1P(X=1|P=p)f_P(p)dp=\int_{0}^1pdp=\frac{1}{2}$$ This shows $X\sim Bernoulli(1/2)$. Now suppose we take $X_1,\dots ,X_m$ to be identical copies of $X$. By independence we see $$P\Big((X_1,\dots ,X_m)=(x_1,\dots ,x_m)\Big)=\frac{1}{2^m}$$ for any $(x_1,...,x_m)\in \{0,1\}^m$. Next, for fixed $(x_1, \dots ,x_m)\in\{0,1\}^m$, take $Y_{(x_1, \dots ,x_m)}$ to be a random variable counting the number of times $(x_1, \dots ,x_m)$ appears in your sample of $n$ $m-$ dimensional arrays. Then $Y_{(x_1, \dots ,x_m)}\sim Binomial\Big(n,\frac{1}{2^m}\Big)$ and so $$\mathbb{E}\Big(Y_{(x_1, \dots, x_m)}\Big)=\frac{n}{2^m}$$ Now if $E\subseteq \{0,1\}^m$ has size $k\in[2^m]$ and $R=\sum_{(x_1, \dots ,x_m)\in E}Y_{(x_1, \dots, x_m)}$ then we have $$\mathbb{E}\bigg(\frac{R}{n}\bigg)=\frac{1}{n}\sum_{(x_1, \dots , x_m)\in E}\mathbb{E}\Big(Y_{(x_1, \dots, x_m)}\Big)=\frac{k}{2^m}$$

Any suggestions, recommendations, advice, etc. are welcome.