Suppose I have $n$ independent random variables $X_1,\dots, X_n$ drawn from a Bernoulli distribution with probability $p$. Furthermore, I have $n$ coefficients labeled $a_1,\dots, a_n$.
What is the expected value and variance for the following weighted average?
\begin{equation} m=\frac{\sum_{i=1}^n a_i X_i}{\sum_{i=1}^n X_i} \end{equation}
Since the random variables in the numerator and denominator are clearly correlated, I cannot simply take their expected values independently. However, intuitively the expected value should equal the mean, $E[m] = \bar{a} = \frac{1}{n}\sum_{i=1}^n a_i$. How can I show that mathematically, and how would I proceed to find the variance of $m$?
As @Andrew Zhang mentions, $m$ is ill-defined if it happens that the realization of $(X_1, \dots, X_n)$ is $(0, \dots, 0)$ as we then get $0$ in the denominator.
However, if we formally set $m = \bar{a}$ in that situation, it turns out that $\mathbb{E}[m] = \bar{a}$ indeed holds.
This can be shown by a brute-force calculation. Since the $X_i$'s are independent, each particular outcome with exactly $k$ of the $X_i$'s equal to $1$ occurs with probability $p^k (1-p)^{n-k}$, so grouping outcomes by the set of indices where $X_i = 1$, we get
\begin{align} \mathbb{E}[m] &= (1-p)^n \cdot \bar{a} + \sum_{k = 1}^n p^k \cdot (1-p)^{n-k} \cdot \left(\sum_{I \in [n]^{(k)}} \frac{\sum_{i \in I} a_i}{k}\right),\\ &= (1-p)^n \cdot \bar{a} + \sum_{k = 1}^n p^k \cdot (1-p)^{n-k} \cdot \frac{1}{k}\left(\sum_{I \in [n]^{(k)}} {\sum_{i \in I} a_i}\right), \end{align} where $[n]^{(k)}$ is the set of $k$-element subsets of $[n]= \{1,2, \dots, n\}$.
Let's analyze the inner term: for each fixed $i$, we add $a_i$ once for every $I \in [n]^{(k)}$ that contains $i$, and there are exactly $\binom{n-1}{k-1}$ such $I$'s, so
\begin{align} \frac{1}{k}\sum_{I \in [n]^{(k)}} {\sum_{i \in I} a_i} &= \frac{1}{k} \sum_{i = 1}^{n} \binom{n-1}{k-1} a_i \\ &= \frac{1}{k} \sum_{i = 1}^{n} \frac{(n-1)!}{((n-1)-(k-1))! (k-1)!}\cdot a_i \\ &= \sum_{i = 1}^{n} \frac{n!}{(n-k)! k!}\cdot \frac{a_i}{n} \\ &= \binom{n}{k}\cdot \sum_{i = 1}^{n} \frac{a_i}{n} \\ &= \binom{n}{k}\cdot \bar{a}. \end{align}
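As a quick numerical sanity check of this identity (my own illustration, not part of the original answer; the values of $n$, $k$, and the $a_i$ are arbitrary), one can enumerate all $k$-subsets directly:

```python
# Verify (1/k) * sum_{I in [n]^(k)} sum_{i in I} a_i == C(n, k) * a_bar
# for small, arbitrarily chosen test values (n = 6, k = 3, a picked ad hoc).
from itertools import combinations
from math import comb, isclose

n, k = 6, 3
a = [2.0, -1.0, 0.5, 3.0, 1.5, -0.25]
a_bar = sum(a) / n

# Left-hand side: enumerate every k-element subset I of {0, ..., n-1}
# and accumulate the a_i's it contains, then divide by k.
lhs = sum(sum(a[i] for i in I) for I in combinations(range(n), k)) / k
rhs = comb(n, k) * a_bar
assert isclose(lhs, rhs)
```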
Thus, absorbing the leading $(1-p)^n \bar{a}$ term as the $k = 0$ summand and using the binomial theorem, we get
\begin{align} \mathbb{E}[m] &= \sum_{k = 0}^n p^k \cdot (1-p)^{n-k} \cdot \binom{n}{k}\cdot \bar{a} \\ &= \bar{a} \sum_{k = 0}^n \binom{n}{k} p^k \cdot (1-p)^{n-k} \\ &= \bar{a}. \end{align}
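A Monte Carlo simulation (my own sketch, not part of the original answer; the parameters $n$, $p$, the coefficients, and the trial count are arbitrary choices) agrees with $\mathbb{E}[m] = \bar{a}$ once the all-zero draw is mapped to $\bar{a}$:

```python
# Monte Carlo estimate of E[m] with the convention m = a_bar
# when all X_i are 0; should come out close to a_bar.
import random

random.seed(0)
n, p, trials = 5, 0.3, 200_000
a = [1.0, 4.0, -2.0, 0.5, 3.0]
a_bar = sum(a) / n  # = 1.3

total = 0.0
for _ in range(trials):
    x = [1 if random.random() < p else 0 for _ in range(n)]
    s = sum(x)
    total += (sum(ai * xi for ai, xi in zip(a, x)) / s) if s else a_bar
estimate = total / trials
assert abs(estimate - a_bar) < 0.05  # within Monte Carlo noise of a_bar
```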
The variance can be calculated in a similar way:
\begin{align} \mathbb{E}[m^2] &= (1-p)^n \cdot (\bar{a})^2 + \sum_{k = 1}^n \frac{ p^k (1-p)^{n-k}}{k^2} \left( \binom{n-1}{k-1} \sum_{i = 1}^n a_i^2 + 2 \binom{n-2}{k-2} \sum_{1 \leq i < j \leq n} a_i a_j \right), \end{align} since $\left(\sum_{i \in I} a_i\right)^2 = \sum_{i \in I} a_i^2 + 2 \sum_{i < j,\; i,j \in I} a_i a_j$ and each pair $\{i, j\}$ is contained in exactly $\binom{n-2}{k-2}$ of the $k$-subsets (with the convention $\binom{n-2}{-1} = 0$ for $k = 1$). Hence
\begin{align} \mathbb{V}(m) &= \mathbb{E}[m^2] - (\mathbb{E}[m])^2 \\ &= (1-p)^n \cdot (\bar{a})^2 + \sum_{k = 1}^n \frac{ p^k (1-p)^{n-k}}{k^2} \left( \binom{n-1}{k-1} \sum_{i = 1}^n a_i^2 + 2 \binom{n-2}{k-2} \sum_{1 \leq i < j \leq n} a_i a_j \right) - (\bar{a})^2. \end{align}
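To guard against sign errors in the cross term, the closed form can be cross-checked against simulation (again my own illustration, not from the original answer; $n$, $p$, and the coefficients are arbitrary). Note that $\binom{n-2}{k-2}$ must be treated as $0$ when $k = 1$, since `math.comb` rejects negative arguments:

```python
# Cross-check the closed-form variance (with the +2 cross term)
# against a Monte Carlo estimate, using arbitrary test parameters.
import random
from math import comb

n, p = 4, 0.5
a = [1.0, 2.0, 3.0, 4.0]
a_bar = sum(a) / n

sum_sq = sum(ai ** 2 for ai in a)
sum_cross = sum(a[i] * a[j] for i in range(n) for j in range(i + 1, n))

# Exact E[m^2] by grouping on k = number of successes.
Em2 = (1 - p) ** n * a_bar ** 2
for k in range(1, n + 1):
    pairs = comb(n - 2, k - 2) if k >= 2 else 0  # convention: 0 when k = 1
    Em2 += (
        p ** k * (1 - p) ** (n - k) / k ** 2
        * (comb(n - 1, k - 1) * sum_sq + 2 * pairs * sum_cross)
    )
var_exact = Em2 - a_bar ** 2

# Monte Carlo estimate with the same m = a_bar convention on the all-zero draw.
random.seed(1)
trials = 200_000
vals = []
for _ in range(trials):
    x = [1 if random.random() < p else 0 for _ in range(n)]
    s = sum(x)
    vals.append((sum(ai * xi for ai, xi in zip(a, x)) / s) if s else a_bar)
mean = sum(vals) / trials
var_mc = sum((v - mean) ** 2 for v in vals) / trials
assert abs(var_exact - var_mc) < 0.02
```

For these parameters the closed form can also be confirmed by enumerating all $2^4$ outcomes by hand, which gives $\mathbb{V}(m) = 72.5/144 \approx 0.5035$.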