I am confused about how to calculate the average probability.
Suppose we repeat a binary survey $k$ times, each time on a completely separate sample group.
For the $i^{th}$ group, $i=1,2,3,...,k$, let $n_i$ be the number of samples and $s_i$ be the number of positive results. Then the (estimated) probability $p_i$ of a positive result for the $i^{th}$ group is $s_i/n_i$.
Out of ignorance, I used two ways to calculate the average probability in my work, wrongly assuming that they are the same:
$$\frac{p_1+p_2+p_3+...+p_k}{k}$$ and $$\frac{s_1+s_2+s_3+...+s_k}{n_1+n_2+n_3+...+n_k}$$
I don't know which is the correct way to calculate the average probability. Could you please explain the difference between these two and when to use which?
Thanks to a hint by Andre Nicolas, I have figured out that the second formula is simply an unequally weighted version of the first, in which we give more weight to the probabilities that were derived from larger numbers of samples:
$$\frac{s_1+s_2+s_3+...+s_k}{n_1+n_2+n_3+...+n_k}=w_1p_1+w_2p_2+w_3p_3+...+w_kp_k$$
where $w_i=n_i/(n_1+n_2+n_3+...+n_k)$.
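To spell out why this identity holds, here is a short derivation. It uses only $s_i=n_ip_i$ from the definitions above; the shorthand $N=n_1+n_2+n_3+...+n_k$ is my own notation, introduced just for this step.
$$\frac{s_1+s_2+...+s_k}{N}=\frac{n_1p_1+n_2p_2+...+n_kp_k}{N}=\frac{n_1}{N}p_1+\frac{n_2}{N}p_2+...+\frac{n_k}{N}p_k=w_1p_1+w_2p_2+...+w_kp_k$$
The weights satisfy $w_1+w_2+...+w_k=1$, and if all the $n_i$ are equal then every $w_i=1/k$, so in that special case the two formulas give the same value.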
Therefore I believe the second formula is the one to use when every sample from every group is of equal importance, since it counts each individual response equally. The first formula instead counts each group equally, regardless of its size.
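As a quick sanity check, here is a minimal Python sketch comparing the two formulas on made-up numbers. The function names and the data (two groups with $s=(9,50)$ and $n=(10,100)$) are purely illustrative assumptions, not taken from any real survey.

```python
from fractions import Fraction

def mean_of_proportions(successes, sizes):
    """First formula: unweighted average of the per-group proportions p_i."""
    props = [Fraction(s, n) for s, n in zip(successes, sizes)]
    return sum(props) / len(props)

def pooled_proportion(successes, sizes):
    """Second formula: total positives divided by total samples."""
    return Fraction(sum(successes), sum(sizes))

def weighted_mean_of_proportions(successes, sizes):
    """Weighted average of the p_i with weights w_i = n_i / (n_1 + ... + n_k)."""
    total = sum(sizes)
    return sum(Fraction(n, total) * Fraction(s, n)
               for s, n in zip(successes, sizes))

# Hypothetical data: a small group with p_1 = 9/10 and a large group with p_2 = 1/2.
successes = [9, 50]
sizes = [10, 100]

print(mean_of_proportions(successes, sizes))           # 7/10
print(pooled_proportion(successes, sizes))             # 59/110
print(weighted_mean_of_proportions(successes, sizes))  # 59/110
```

With these numbers the small group pulls the first formula up to $7/10$, while the second formula stays at $59/110$, close to the large group's $1/2$, and matches the weighted sum exactly.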