The correct physical interpretation of the binomial distribution and Bernoulli trials in this example


We know that every random variable has a probability distribution. Examples include the number of heads in many coin tosses, the number of ones in many rolls of a die, and so on.

Suppose we use the binomial distribution to model such a random variable. Let us take an example: we toss a single coin $100$ times and count the heads. Plugging this into the binomial distribution gives a beautiful graphical representation, with probability on one axis and the number of heads on the other. This graph peaks at $50$ heads with a probability of roughly $0.08$.

However, there is also a physical interpretation. It means I toss a coin a hundred times and note the number of heads. Then I repeat this experiment thousands and thousands of times, and note the frequency of occurrence of each number of heads. This frequency represents the probability, i.e. the height of the graph, in the binomial distribution. As one would expect, $50$ heads would appear roughly $8$ percent of the time. This can be shown easily with computer simulation, as 3blue1brown does with random number generators.
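A quick simulation in the spirit of that demonstration (the repetition count of $100{,}000$ here is my own choice) confirms the roughly $8$ percent figure:

```python
import random

# Repeat the 100-toss experiment many times and record how often each
# head count occurs; the empirical frequency of exactly 50 heads should
# land near the binomial probability of about 0.08.
random.seed(0)
experiments = 100_000
counts = {}
for _ in range(experiments):
    heads = sum(random.random() < 0.5 for _ in range(100))
    counts[heads] = counts.get(heads, 0) + 1

freq_50 = counts.get(50, 0) / experiments
print(round(freq_50, 3))  # roughly 0.08
```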

Now we have a mathematical as well as a physical meaning of what the binomial distribution represents.

Now Imagine the following scenario.

There is a bag with $100$ balls inside it. Some of them are blue, some red, and some other colours. We don't know how many there are of each colour. The colours are fixed, of course; we just don't know the amounts.

What we do is pick a single ball at random and note its colour. This is repeated many, many times, and it appears that blue balls come up $20$ percent of the time. Since it is impossible to repeat the experiment infinitely many times, we can never know the exact probability of drawing a blue ball. And since we know the total number of balls but not that exact probability, we can never know the exact number of blue balls inside our bag.

Hence the number of blue balls inside our bag is, to us, a random variable, and thus it must have a distribution. In our sampling, we found that the probability of getting a blue ball was $0.2$. This is not the true probability of getting a blue ball from that bag, but rather our best estimate of the true probability.

Hence we can use the binomial distribution to find the probability of different numbers of blue balls being inside the bag. The mean of this distribution would be $20$ balls, and as the number of trials tends to infinity, our estimate would converge to the true proportion. In our trials, the probability of exactly $20$ blue balls would be about $9.93$ percent.
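The $9.93$ percent figure can be checked directly from the binomial formula:

```python
from math import comb

# P(X = 20) for X ~ Binomial(n = 100, p = 0.2):
n, p, k = 100, 0.2, 20
prob = comb(n, k) * p**k * (1 - p)**(n - k)
print(prob)  # about 0.0993, i.e. the ~9.93 percent quoted above
```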

Now mathematically this is all well and good. However, physically it doesn't seem to make sense.

Let us see how we would interpret this binomial distribution in the physical sense, just as we did for our coin tosses. In the case of the coin tosses, we did the experiment many, many times, noted the frequency of each particular number of heads, and used this to create a distribution.

Suppose we do the same thing here: we empty the bag, count the number of blue balls, and repeat this experiment many, many times. According to the binomial distribution, in about $9.93$ percent of the cases I should get $20$ blue balls out of the bag; in other cases, I'd get other results with different probabilities. However, if I'm doing the experiment with the same bag, this creates a problem: even though I don't know the number of blue balls in the bag, I do know that it is a constant. The same bag cannot give two different numbers of blue balls in two consecutive experiments.

So the physical interpretation of the binomial distribution seems to fail here.

One solution I can think of is this: instead of checking the same bag again and again to get a frequency, what if I check thousands of different bags, each with a different number of blue balls from $0$ to $100$? Then the same bag never has to yield different numbers of blue balls in consecutive experiments, because we are not checking the same bag; we are checking different bags. Since we don't know the exact number of blue balls in our bag, we essentially don't know which of all these bags it is.

So the binomial distribution is no longer directly about the number of blue balls in the same bag. It is about the different bags with different numbers of blue balls in them. In a sense, the number of blue balls is not exactly the random variable in our problem, as we initially guessed; it's actually the bag that is the random variable. Different bags have different numbers of blue balls, and we basically don't know which bag is the real one. To say that $20$ blue balls appear $9.93$ percent of the time would be equivalent to saying that bags with $20$ blue balls turn up $9.93$ percent of the time. This makes sense, because bags with $20$ blue balls in them are the most likely to give us a $20$ percent chance of picking a random blue ball, while bags with $99$ or $100$ blue balls are far less likely to do so.

Would this be the correct physical interpretation of the binomial distribution? Instead of the Bernoulli trial being a check of a single bag for its number of blue balls, each trial is essentially a check across all these different bags. I'm doing all this because a single bag cannot give two different numbers of blue balls in successive Bernoulli trials, even if we don't know the exact number of balls. So the question should be more like: there are several bags with different numbers of blue balls from $0$ to $100$; given that the probability of picking a random blue ball is about $0.2$, which of these bags is most probable, and so on. Hence, bags with $20$ blue balls would be the mean of this distribution over bags. We are essentially checking how likely a certain bag is to give us exactly a $20$ percent chance of picking a blue ball at random, since that is the only information we have.
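This many-bags reading can be sketched numerically: score every candidate bag by how likely it is to produce the observed $20$ percent blue rate. The sample of $100$ draws with replacement is an assumption for illustration:

```python
from math import comb

# For each candidate bag with b blue balls out of 100, compute the
# likelihood that 100 draws with replacement yield exactly 20 blue
# balls, then normalize over all 101 candidate bags.
def likelihood(b, n_draws=100, n_blue=20, total=100):
    p = b / total
    return comb(n_draws, n_blue) * p**n_blue * (1 - p)**(n_draws - n_blue)

weights = [likelihood(b) for b in range(101)]
total_w = sum(weights)
posterior = [w / total_w for w in weights]

# The most probable candidate bag is the one with 20 blue balls.
print(max(range(101), key=lambda b: posterior[b]))  # → 20
```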

Is this interpretation correct? Mathematically it doesn't make a difference, since the binomial distribution formula describes both physical cases equally well. If the colours of the balls were not constant and we were checking the same bag, I'd have got exactly the same results. However, the philosophical and physical interpretations are somewhat different, like tossing a single coin $100$ times versus tossing $100$ coins once: mathematically the same, physically not so.

Thanks for your time.


Best answer:

Let $n$ be any large number, say $1000$.

Let $b$ denote the number of blue balls in the bag.

Let $f(b)$ denote the probability of exactly $20\%$ of the $n$ trials succeeding in showing a blue ball, when a ball is selected with replacement from the bag.

Let $W$ denote $\displaystyle \sum_{b = 0}^{100} f(b)$.

Then, the expected number of blue balls in the bag is

$$\frac{\sum_{i=0}^{100} \left[i \times f(i)\right]}{W}.\tag1 $$

$W$ in the denominator serves to normalize the sum of the weights (i.e. the probabilities) associated with each possible number of blue balls.

$\displaystyle f(b) = \binom{1000}{200} \times \left[\frac{b}{100}\right]^{200} \times \left[\frac{100 - b}{100}\right]^{800}.$
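A sketch of how formula $(1)$ could be evaluated numerically. Working in log space is my own addition, to avoid underflow in the very small binomial terms:

```python
from math import comb, log, exp

# f(b): probability that exactly 200 of 1000 draws with replacement
# are blue, when the bag holds b blue balls out of 100.
def log_f(b, n=1000, k=200, total=100):
    if b == 0 or b == total:
        return float("-inf")  # f(b) is exactly 0 at the endpoints
    p = b / total
    return log(comb(n, k)) + k * log(p) + (n - k) * log(1 - p)

logs = [log_f(b) for b in range(101)]
m = max(logs)
weights = [exp(lf - m) for lf in logs]   # proportional to f(b)
W = sum(weights)

# Formula (1): normalized weighted average of b.
expected_b = sum(b * w for b, w in zip(range(101), weights)) / W
print(round(expected_b, 2))  # close to 20
```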

Second answer:

As discussed in the chat with user2661923, there is a second way of solving this, which works best if the number of balls in the bag is very large.

Think of the following scenario. Suppose you have $N$ balls in the bag, of which $b$ are blue. What would be the true probability of picking a blue ball from the bag? Obviously $b/N$. Now suppose you pick $N$ balls from the bag with replacement, and say $K$ of them turn out to be blue. Repeat the experiment many, many times, and another run will give you some other number of blue balls out of $N$, say $\epsilon$. According to statistics, most of the time these counts $K$ or $\epsilon$ lie within one standard deviation of the mean, roughly following a Gaussian distribution about it.

For comparison, consider the number of heads in $100$ coin tosses. Roughly $70$ percent of the time, you'd get between $45$ and $55$ heads. For $10000$ coin tosses you'd get between $4900$ and $5100$ heads roughly $95$ percent of the time. Similarly, for $1000000$ tosses, you'd get between $499000$ and $501000$ heads $95$ percent of the time. As you can see, relative to the total, the range gets closer and closer to the mean. For extremely large values, we can approximate that we get exactly the mean.
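The shrinking relative spread can be tabulated directly, here using the $\pm 2\sigma$ window that corresponds to the $95$ percent figures:

```python
from math import sqrt

# For a fair coin, sigma = sqrt(n * 0.5 * 0.5).  The +/- 2-sigma
# window around the mean, as a fraction of n, shrinks like 1/sqrt(n),
# which is why "exactly the mean" becomes a reasonable approximation
# for very large n.
for n in (100, 10_000, 1_000_000):
    mean, sigma = n / 2, sqrt(n * 0.5 * 0.5)
    lo, hi = mean - 2 * sigma, mean + 2 * sigma
    print(n, int(lo), int(hi), round(4 * sigma / n, 4))
```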

Now, what is the mean? It is the total number of trials multiplied by the probability of success in a single trial. Since in our case the number of trials equals the total number of balls in the bag, and the success probability is $b/N$, the mean is $N \times b/N = b$, the original number of blue balls in our bag.

The approximation now is this: if the total number of balls in our bag, and hence the total number of trials, is extremely large, we can ignore the standard deviation and say with reasonable accuracy that the number of blue balls we get, i.e. $K$ or $\epsilon$, is approximately equal to $b$.

Hence, $K \approx \epsilon \approx b$.

Hence, for large systems, if we pick $N$ balls with replacement and find that exactly $b$ of them are blue, we can say approximately, by the above reasoning, that there were originally $b$ blue balls in the system of $N$ balls in total.

Hence the probability of drawing $b$ blue balls would indirectly give us the probability that there were $b$ blue balls in the system. By random sampling, we have already estimated the probability of getting a blue ball; call it $p$. Remember, this is not equal to $b/N$, since we don't know $b$; it is only an estimate. Using this probability, we can find the distribution of the number of blue balls obtained when randomly sampling $N$ balls with replacement. As you know, this is a simple binomial distribution (picking coloured balls from a bag with replacement follows a binomial distribution).

However, using the reasoning above, we can claim that the number of blue balls we get out of $N$ draws is approximately equal to the number of blue balls in the bag. Hence, the binomial distribution for the number of blue balls obtained by picking $N$ balls with replacement should, indirectly and approximately, give us the distribution of the number of blue balls in the bag.

For example, if there are $100$ balls and we draw $20$ blue balls, our reasoning suggests that there were $20$ blue balls in the bag originally. So the probability of drawing exactly $20$ blue balls out of $100$ also gives us the probability that there were $20$ blue balls in the bag. Similarly, we check the probabilities for every possible number of blue balls and create a distribution. Of course, this is much more accurate when the number of balls is huge.

That being said, remember that the answer posted above is the correct and more intuitive way of solving this problem; this answer is nothing but an approximation, albeit a reasonable one in some cases.

Hence, to summarize, the two ways of physically reasoning about this problem are as follows:

  1. Since you only know the estimated probability of getting a single blue ball from a bag of $N$ balls, you calculate the mean. Then you consider $N+1$ different bags with different numbers of blue balls in them, and check how likely each bag is to give you the mean that you calculated for the original bag. This gives you a probability distribution over these bags' chances of being the original bag. Since each bag has a definite number of blue balls in it, this automatically gives you the probability distribution for the number of blue balls in the original bag.

  2. Again, you know the estimated probability of getting a single blue ball out of $N$. You create a binomial distribution for getting $0, 1, 2, \ldots, N$ blue balls when you pick $N$ balls from the bag with replacement. Then you use the argument that, if the number of balls is large, the number of blue balls you get when picking $N$ balls with replacement is approximately equal to the mean, which in this case is the actual number of blue balls in the system. So if you get $m$ blue balls in a trial, that would mean there are $m$ blue balls in the system. Hence the probability of getting $m$ blue balls out of $N$ is the same as the probability of there being $m$ blue balls in the original system.
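The two summarized approaches can be compared side by side for $N = 100$ balls and an observed $20$ percent blue rate; they give close but not identical distributions:

```python
from math import comb

# Binomial pmf, used by both approaches.
def pmf(n, k, p):
    return comb(n, k) * p**k * (1 - p)**(n - k)

N, k, p_hat = 100, 20, 0.2

# Approach 1: weight each candidate bag b by the likelihood of seeing
# 20 blue balls in 100 draws, then normalize over all bags.
weights = [pmf(N, k, b / N) for b in range(N + 1)]
method1 = [w / sum(weights) for w in weights]

# Approach 2: read P(b blue balls) directly off Binomial(100, 0.2).
method2 = [pmf(N, b, p_hat) for b in range(N + 1)]

# Both assign roughly 0.1 probability to b = 20, but differ slightly.
print(round(method1[20], 4), round(method2[20], 4))
```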

An analogous problem: suppose I have a die that I roll many, many times, and I get a $1/6$ chance of rolling a six. How many faces of the die are marked $6$? This becomes a similar problem, where we compare $7$ dice, with $0$ to all $6$ faces marked six, and check which one is most likely to be our die.
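The die version can be sketched the same way; the $600$ rolls and $100$ sixes below are assumed numbers, chosen only to be consistent with the observed $1/6$ rate:

```python
from math import comb

# Seven candidate dice, with s = 0..6 faces marked six.  Weight each
# candidate by the likelihood of 100 sixes in 600 rolls; the fair die
# (s = 1) should come out as the most probable candidate.
rolls, sixes = 600, 100
weights = []
for s in range(7):
    p = s / 6
    weights.append(comb(rolls, sixes) * p**sixes * (1 - p)**(rolls - sixes))
posterior = [w / sum(weights) for w in weights]

print(max(range(7), key=lambda s: posterior[s]))  # → 1
```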

Based on my limited understanding, the first approach is what is used when finding the distribution of an estimator, a sort of probability of a probability. On the other hand, if you know the exact probability of rolling a six with a given die, you can use a simple binomial distribution directly to get the chances of rolling a certain number of sixes.