I have an experiment where each "run" of the experiment has a binomial distribution. In a given run I have a number of trials $N_i$ and probability of success $p_i$. The result is number of successes $S_i$ which is a sample from the binomial distribution. For this single run of the experiment, I know the variance is $N_i p_i(1-p_i)$.
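This per-run variance is easy to sanity-check by simulation. The values $N_i = 1000$ and $p_i = 0.3$ below are arbitrary illustrations, not taken from the experiment:

```python
import numpy as np

rng = np.random.default_rng(0)
N_i, p_i = 1000, 0.3  # illustrative values only

# Draw many replicates of the same run and compare the sample
# variance of S_i against the binomial variance N_i * p_i * (1 - p_i).
S = rng.binomial(N_i, p_i, size=100_000)
print(S.var())                # empirical
print(N_i * p_i * (1 - p_i))  # theoretical: 210.0
```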
In a different run the probability of success and the number of trials changes. Call these $N_j$ and $p_j$.
The number of trials and success probabilities are in turn drawn from their own distributions, so each $N_j$ and $p_j$ is a sample from its own distribution.
If I know the distribution of the success probabilities and the distribution of the number of trials, then what is the distribution of the entire set of runs? I'm most interested in the mean and the variance of the set of runs.
In essence, I have a set of samples all drawn from different (but related) binomial distributions. I want to know the mean and variance of this set. I think this can be thought of as a compound distribution: https://en.wikipedia.org/wiki/Compound_probability_distribution
For the purpose of this question, let's say that the distribution of the success probabilities $p_i$ is Beta with mean $\mu_p$ and variance $\sigma^2_p$, and the distribution of the number of trials is Gaussian: $N\sim \mathcal{N}(\mu_N,\sigma^2_N)$.
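With those choices the whole hierarchy can be simulated directly. This is a sketch with assumed placeholder hyperparameters; the Beta shape parameters are recovered from the stated mean and variance, and the Gaussian $N$ is rounded and clipped so it is a valid trial count:

```python
import numpy as np

rng = np.random.default_rng(1)

# Assumed hyperparameters for illustration (not from the question).
mu_p, var_p = 0.3, 0.01
mu_N, sig_N = 1000.0, 50.0

# Convert the (mean, variance) of p into Beta shape parameters.
nu = mu_p * (1 - mu_p) / var_p - 1     # requires var_p < mu_p * (1 - mu_p)
a, b = mu_p * nu, (1 - mu_p) * nu

M = 200_000                             # number of runs
p = rng.beta(a, b, size=M)
# N is nominally Gaussian; round and clip so it is a non-negative integer.
N = np.clip(np.rint(rng.normal(mu_N, sig_N, size=M)), 0, None).astype(int)
S = rng.binomial(N, p)

print(S.mean())   # about mu_N * mu_p = 300
print(S.var())    # the quantity the question is after
```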
I was initially thinking to solve this as a special case of the Poisson binomial distribution, where I sum over all of the runs and get something like $\sigma^2 = \sum_{i=1}^{M_{\text{runs}}}N_ip_i(1-p_i)$ for the variance and $\mu = \sum_{i=1}^{M_{\text{runs}}}N_ip_i$ for the mean. But this isn't really useful: I have lots of different "runs", and I do know the distributions of the number of trials and the success probabilities, so it seems like I should be able to get something more compact. Ideally, I would have an expression for the variance of the set of runs in terms of the means and variances of $N$ and $p$.
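For reference, the summed form itself is easy to verify with fixed, arbitrary arrays of $N_i$ and $p_i$: the total success count across runs has mean $\sum_i N_i p_i$ and variance $\sum_i N_i p_i(1-p_i)$, because means and variances of independent binomials add.

```python
import numpy as np

rng = np.random.default_rng(2)

# Fixed, arbitrary per-run parameters for this check.
N_i = np.array([500, 800, 1200, 300])
p_i = np.array([0.2, 0.5, 0.35, 0.7])

# Replicate the whole experiment many times; T is the total success count.
T = rng.binomial(N_i, p_i, size=(100_000, len(N_i))).sum(axis=1)

print(T.mean(), (N_i * p_i).sum())             # means agree: 1130
print(T.var(), (N_i * p_i * (1 - p_i)).sum())  # variances agree: 616
```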
For a set of runs, each with variance $N_i p_i(1-p_i)$, should I calculate the variance of the quantity $N_i p_i(1-p_i)$ instead of taking the sum? That would be the variance of the variance, and it doesn't really seem like the correct thing to do. I'm stuck on how to express the sum $\sigma^2 = \sum_{i=1}^{M_{\text{runs}}}N_ip_i(1-p_i)$ as something more compact when I know the distributions of $N$ and $p$.
One thing I have been stumbling on is that my variance, $\sigma^2 = \sum_{i=1}^{M_{\text{runs}}}N_ip_i(1-p_i)$, appears to be expressed as a sum of the random variables $N$ and $p$. In reality, though, it is a sum over samples of those random variables.
The best I've been able to do so far is compute the expected value of the variance. I'm not sure this is the correct approach.
We have two independent random variables: $N\sim\mathcal{N}(\mu_N,\sigma^2_N)$ and $p$, which is Beta-distributed with mean $\mu_p$ and variance $\sigma^2_p$. We have an expression for the variance of an experiment in terms of samples of these random variables: $$\sigma^2 = \sum_{i=1}^{M_{\text{runs}}} N_i p_i(1-p_i)$$ where $M_{\text{runs}}$ is the number of "runs" of the experiment, and $M_{\text{runs}}$ is very large.
So, using the independence of $N$ and $p$, the expected value of this variance is
\begin{align} \mathbb{E}[\sigma^2] &= \sum_{i=1}^{M_{\text{runs}}}\mathbb{E}[ N_i p_i(1-p_i)] \\ &=\sum_{i=1}^{M_{\text{runs}}}\mathbb{E}[ N_i] \mathbb{E}[ p_i(1-p_i)] \\ &=\sum_{i=1}^{M_{\text{runs}}}\mathbb{E}[ N_i]( \mathbb{E}[ p_i]-\mathbb{E}[p_i^2])\\ &=\sum_{i=1}^{M_{\text{runs}}}\mathbb{E}[ N_i](\mathbb{E}[p_i]-[\sigma^2_{p}+\mathbb{E}[p_i]^2])\\ &=\sum_{i=1}^{M_{\text{runs}}}\mu_N(\mu_p-[\sigma^2_{p}+\mu_p^2])\\ &= M_{\text{runs}}\,\mu_N(\mu_p- \mu_p^2-\sigma^2_{p})\\ &=M_{\text{runs}}\,\mu_N(\mu_p(1-\mu_p)-\sigma^2_p) \end{align}
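The per-run factor of this closed form is straightforward to check numerically. The sketch below (with assumed placeholder hyperparameters) compares the Monte Carlo average of $N_i p_i(1-p_i)$ over many runs against $\mu_N(\mu_p(1-\mu_p)-\sigma^2_p)$:

```python
import numpy as np

rng = np.random.default_rng(3)

# Assumed hyperparameters for illustration.
mu_p, var_p = 0.3, 0.01
mu_N, sig_N = 1000.0, 50.0

nu = mu_p * (1 - mu_p) / var_p - 1   # Beta shapes from (mean, variance)
p = rng.beta(mu_p * nu, (1 - mu_p) * nu, size=1_000_000)
N = rng.normal(mu_N, sig_N, size=1_000_000)

per_run = N * p * (1 - p)            # one term of the sum, per run
closed_form = mu_N * (mu_p * (1 - mu_p) - var_p)

print(per_run.mean())  # Monte Carlo estimate
print(closed_form)     # 1000 * (0.21 - 0.01) = 200.0
```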
I'm not really satisfied with this answer, to be honest. One thing that bothers me in particular is that the variance of my set of measurements decreases as the variance of the "success probabilities" increases. This can't be right!