Negative Hypergeometric Distribution expectation, revisited


https://math.stackexchange.com/a/1798400/687323

In the explanation in the linked post, I am struggling to understand why W1 W2 B1 B2 and W2 W1 B1 B2 are counted as different permutations. Aren't all the white balls identical, and all the black balls identical?

Furthermore, how can we label the black balls, or even the white balls? Doesn't that contradict the assumption that the black balls and white balls are identical?

I am really struggling here, and I would be greatly indebted if anyone could truly explain what is going on in the derivation of the expectation of a negative hypergeometric distribution.

Edit: I can't give up, even if this is the death of me. This is probably a mental block. I have been working through Blitzstein, and I have solved ALL the problems up to the topic of expectation. However, I still can't get a hold of this derivation of the expectation.

The link to the book: https://drive.google.com/file/d/1VmkAAGOYCTORq1wxSQqy255qLJjTNvBI/view?pli=1

PAGE: 169.

Let me explain my thinking step by step.

A random variable, roughly speaking, may be defined as a function from the set of outcomes to the set of real numbers.

So in this case we desire to find the expectation of a negative hypergeometric distribution.

In order to do so, we set up random variables X_i, where X_i is defined as the number of black balls between the (i-1)th white ball and the ith white ball. Summing over all the gaps, we therefore have X_1 + X_2 + ... + X_{w+1} = b, the total number of black balls.

We desire to find E[X_1 + X_2 + ... + X_r], where r is the number of white balls to be drawn. We also have X_1 + X_2 + ... + X_r = k, the number of black balls drawn before the rth white ball. That is, we are finding the expectation of NegHyp(w, b, r): w white balls, b black balls, and the sampling continues until r white balls are drawn.

Now, I also understand that X_i may be written as a sum of indicators I_1 + I_2 + ... + I_b, where the black balls are labelled 1, 2, 3, ..., b. I am in complete agreement with this labelling, as it is of no consequence in the counting of permutations, and only serves as a tag on the balls.

But, herein lies my conundrum, something that has been troubling me for an entire week.

How is E[I_i] = 1/(w+1)? That is, why is the probability that the ball labelled I_i lies in a given gap between two white balls equal to 1/(w+1)? I understand there are w+1 gaps formed by the w white balls.

But what I don't understand is how everyone can say "TRIVIALLY" that the probability that the black ball labelled I_i lies between a certain pair of white balls is 1/(w+1). There are not just w+1 positions to consider, but also a whole host of other permutations, overcounts, etc.

I would be grateful to anyone who could be so kind as to actually detail the process of calculating the above probability. Please kindly don't use the word "symmetry"; I really don't get it.
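Edit 2: To convince myself there is no overcounting, I wrote a small brute-force check (my own code, with tiny made-up parameters w = 3, b = 2): treat every ball as distinguishable, enumerate all permutations, and count the fraction in which one tagged black ball precedes all the whites.

```python
from itertools import permutations
from fractions import Fraction

# Brute-force check: treat ALL balls as distinguishable, enumerate every
# permutation, and count in how many of them black ball 0 appears before
# every white ball.
w, b = 3, 2  # made-up small example: 3 white, 2 black
balls = [("W", i) for i in range(w)] + [("B", i) for i in range(b)]

favourable = 0
total = 0
for perm in permutations(balls):
    total += 1
    pos_b0 = perm.index(("B", 0))                       # tagged black ball
    first_white = min(perm.index(("W", i)) for i in range(w))
    if pos_b0 < first_white:
        favourable += 1

print(Fraction(favourable, total))  # 1/4, i.e. exactly 1/(w+1)
```

The answer comes out to exactly 1/(w+1) even though all balls were treated as distinguishable: every permutation is counted exactly once, so there is no overcounting to worry about.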


2 Answers

BEST ANSWER

For determining the number of black balls drawn before drawing any white balls using an indicator variable, we are not considering the entire string, but a string composed of all the white balls together with any one black ball, say black ball $j$.

Thus all $w+1$ positions for black ball $j$ are equiprobable, and the probability that black ball $j$ is first in the list is $\Large\frac{1}{w+1}$.

The explanation of the book should now be clear.
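As a concrete sketch of the reduced-string argument (with small made-up parameters; the tag on one black ball is just for illustration), one can enumerate all permutations of the full string and record which of the $w+1$ slots the tagged black ball occupies. The slot counts come out equal:

```python
from itertools import permutations
from collections import Counter

# For one tagged black ball, count how many white balls precede it in
# each permutation: that count is the "slot" (0..w) it occupies among
# the w white balls.
w, b = 3, 2  # made-up small parameters
balls = [("W", i) for i in range(w)] + [("B", i) for i in range(b)]

slots = Counter()
for perm in permutations(balls):
    pos_j = perm.index(("B", 0))  # the tagged black ball j
    whites_before = sum(1 for x in perm[:pos_j] if x[0] == "W")
    slots[whites_before] += 1

print(sorted(slots.items()))  # all w+1 = 4 slots occur equally often
```

The other black ball's position never enters the computation, which is exactly why only the string of white balls plus ball $j$ matters.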


Added in response to query

  • We are using the method of indicator variables: an indicator variable takes the value $1$ or $0$, so its expectation is just the probability of the event it indicates. With $X_j$ being the indicator of the event that black ball $j$ comes before all the white balls,
    $\Bbb {E[X_j]} = P(X_j = 1) =\Large\frac 1{w+1}$

  • We then apply (expectation of sum) = (sum of expectations), which applies even when the variables are not independent.
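Putting the two bullets together, here is a small exact check (with assumed parameters, not from the book) that the expected number of black balls before the first white ball equals $b/(w+1)$:

```python
from itertools import permutations
from fractions import Fraction

# E[X] = E[X_1] + ... + E[X_b] = b/(w+1), where X counts the black balls
# drawn before the first white ball. Verify exactly by enumeration.
w, b = 3, 2  # made-up small parameters
balls = [("W", i) for i in range(w)] + [("B", i) for i in range(b)]

total = 0
n = 0
for perm in permutations(balls):
    n += 1
    blacks = 0
    for colour, _ in perm:
        if colour == "W":
            break
        blacks += 1
    total += blacks

exact = Fraction(total, n)
print(exact, "vs", Fraction(b, w + 1))  # both equal 1/2
```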

ANSWER

You want to find the expected number of black balls chosen (before any whites), so you calculate that by summing up, over the black balls, the probability of each one being chosen before any white.

So the labeling really only happens in the definition of the indicator: the indicator $I_j$ is defined as the event of picking the black ball labeled $j$ before any white ball is chosen. Only here are the labels necessary.

But you want to find the expected number of black balls chosen before any white. Here the labels from before are irrelevant: by linearity of expectation, that expected value equals $$ P(I_1)+P(I_2)+\ldots+P(I_b) $$

Since each $P(I_j)$ is the same, the sum $\sum_{j=1}^b P(I_j)$ adds up the probability that each individual black ball is chosen before any white, and that sum, $\frac{b}{w+1}$, is the expected value.
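Summing the same kind of indicators over each of the first $r$ gaps gives the full negative hypergeometric mean, which works out to $rb/(w+1)$ (the result on page 169 of the book). A quick Monte Carlo sanity check, with made-up parameters:

```python
import random

# Monte Carlo estimate of E[# black balls drawn before the r-th white],
# which should be close to r*b/(w+1). Parameters are arbitrary.
random.seed(0)
w, b, r = 5, 4, 3
balls = ["W"] * w + ["B"] * b

trials = 200_000
total = 0
for _ in range(trials):
    random.shuffle(balls)
    blacks, whites = 0, 0
    for ball in balls:
        if ball == "W":
            whites += 1
            if whites == r:
                break
        else:
            blacks += 1
    total += blacks

print(total / trials)  # ≈ r*b/(w+1) = 2.0
```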