Checking independence of random variables.


I'm revisiting the coupon collector's problem and I'm not sure how to prove that my variables are independent. Here's what I have:

Let $X$ denote the number of tries required to collect all the coupons, and $X_i$ denote the number of tries required to collect a new coupon, after having collected $i-1$ coupons. Then $X_i \sim Geom(p_i)$ with $p_i= \frac{n-i+1}{n}$.

Intuitively they should be independent but how would one go about proving this? Would calculating $E[X_i X_j]$ and showing it equal to $E[X_i]E[X_j]$ be enough? Is this the right way to proceed?


BEST ANSWER

It's certainly not enough to prove that $E[X_iX_j] = E[X_i]E[X_j]$. It's quite easy to construct dependent random variables that satisfy that equality. Suppose $D$ is the result when I throw a single die, and set $$\begin{array}{r|cccccc}D&1&2&3&4&5&6 \\ \hline X & -5&-4&-3&3&4&5 \\Y & 5&-4&-3&3&4&-5 \end{array}$$

Clearly $X$ and $Y$ are not independent, but $E[XY] = E[X]E[Y]$, because that's exactly how I chose them. The correct criterion is this: $X$ and $Y$ are independent if and only if for every pair $x,y$ we have $P\{X=x,Y=y\} = P\{X=x\}P\{Y=y\}$.
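A quick way to see both claims at once is to tabulate the six outcomes and check them in exact arithmetic; this is just a sketch of the counterexample above, not part of the original answer:

```python
from fractions import Fraction

# The six equally likely die results D, each mapped to (X, Y) as in the table.
outcomes = {1: (-5, 5), 2: (-4, -4), 3: (-3, -3), 4: (3, 3), 5: (4, 4), 6: (5, -5)}
p = Fraction(1, 6)  # each die face has probability 1/6

E_X = sum(p * x for x, y in outcomes.values())
E_Y = sum(p * y for x, y in outcomes.values())
E_XY = sum(p * x * y for x, y in outcomes.values())

# Uncorrelated: E[XY] = E[X]E[Y] = 0.
assert E_XY == E_X * E_Y == 0

# Yet not independent: P(X=5, Y=5) = 0, while P(X=5) P(Y=5) = 1/36.
P_joint = sum(p for x, y in outcomes.values() if x == 5 and y == 5)
P_prod = (sum(p for x, y in outcomes.values() if x == 5)
          * sum(p for x, y in outcomes.values() if y == 5))
assert P_joint == 0 and P_prod == Fraction(1, 36)
```

The pair $(x,y)=(5,5)$ is the witness: its joint probability is zero, but the product of the marginals is not.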

To show independence of an infinite sequence of random variables you need to do a bit more work.

We need to show that for every finite $n$ and any sequence $x_1,\dots,x_n$ we have $$P\{X_1=x_1,\dots X_n = x_n\} = P\{X_1=x_1\}\times\dots\times P\{ X_n = x_n\} $$

This is obviously true for $n=1$.

Now assume inductively that the above is true for $n=j$, set $s_j =\sum_{i=1}^j x_i$, and pick an arbitrary value of $x_{j+1}$.

The event $\{X_1=x_1,\dots X_j = x_j\}$ is determined by the first $s_j$ draws. Given that it occurs, the event $\{X_{j+1}= x_{j+1}\}$ involves only the results of draws after draw $s_j$, so these two events are independent.

Therefore $$\begin{array}{rcl} P\{X_1=x_1,\dots X_{j+1} = x_{j+1}\} &=& P\{X_1=x_1,\dots X_{j} = x_{j}\}P\{X_{j+1}=x_{j+1}\} \\ &=& P\{X_1=x_1\}\times\dots\times P\{ X_{j+1} = x_{j+1}\}\end{array}$$

Therefore by induction the statement holds for every $n$ and the sequence of random variables $X_1,X_2,\dots$ is independent.
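The conclusion can also be probed empirically. Below is a minimal Monte Carlo sketch (my own helper `draw_times` is hypothetical, not from the answer) that simulates the coupon collector for a small $n$ and compares joint frequencies of $(X_2, X_3)$ against the product of their marginal frequencies:

```python
import random

def draw_times(n, rng):
    """Simulate collecting n coupon types; return (X_1, ..., X_n), where
    X_i is the number of draws needed to see the i-th new coupon."""
    seen, times, tries = set(), [], 0
    while len(seen) < n:
        tries += 1
        c = rng.randrange(n)  # each draw is uniform over the n coupon types
        if c not in seen:
            seen.add(c)
            times.append(tries)
            tries = 0  # restart the count for the next new coupon
    return times

rng = random.Random(0)
n, trials = 3, 200_000
samples = [draw_times(n, rng) for _ in range(trials)]

# Joint frequency P(X_2 = a, X_3 = b) should match the product of the
# marginal frequencies, up to Monte Carlo noise.
for a, b in [(1, 1), (1, 2), (2, 3)]:
    joint = sum(1 for t in samples if t[1] == a and t[2] == b) / trials
    p2 = sum(1 for t in samples if t[1] == a) / trials
    p3 = sum(1 for t in samples if t[2] == b) / trials
    assert abs(joint - p2 * p3) < 0.01
```

This is evidence, not proof, but it is a useful sanity check on the induction: for $n=3$, $X_2 \sim Geom(2/3)$ and $X_3 \sim Geom(1/3)$, and the estimated joint law factors as expected.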

ANSWER

From Mood, Graybill and Boes, Introduction to the Theory of Statistics, 3rd ed., 1974, p. 150:

``...suppose $X$ and $Y$ are two independent random variables; then $f_{X,Y}(x,y)=f_{X}(x)f_{Y}(y)$ by definition of independence; however, $f_{X,Y}(x,y)=f_{Y|X}(y|x)f_{X}(x)$ by definition of conditional density, which implies that $f_{Y|X}(y|x)=f_{Y}(y)$; that is, the conditional density of $Y$ given $x$ is the unconditional density of $Y$''.

In your example, find the conditional density of $X_i$ given $X_j$. If the conditional density of $X_i$ given $X_j$ equals the marginal density of $X_i$, then they are stochastically independent.

If you have the joint density $f_{X,Y}(x,y)$, another way to show $X$ and $Y$ are independent is to factor it as $f_{X,Y}(x,y)=g(x)h(y)$: if you can find such $g(x)$ and $h(y)$, then $X$ and $Y$ are independent.
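As an illustration of the factorization criterion, take the hypothetical joint density $f(x,y)=6e^{-2x-3y}$ on $x,y\ge 0$ (my own example, not from the answer above); it factors into the Exponential(2) and Exponential(3) marginal densities, so the two variables are independent:

```python
import math

# Hypothetical joint density on x, y >= 0.
def f(x, y):
    return 6 * math.exp(-2 * x - 3 * y)

# Candidate factors: the Exponential(2) and Exponential(3) densities.
def g(x):
    return 2 * math.exp(-2 * x)

def h(y):
    return 3 * math.exp(-3 * y)

# f(x, y) = g(x) h(y) pointwise, so X and Y are independent.
for x, y in [(0.1, 0.5), (1.0, 2.0), (0.7, 0.3)]:
    assert math.isclose(f(x, y), g(x) * h(y))
```

Note that the factorization argument also requires the support of $f_{X,Y}$ to be a product set, as it is here ($[0,\infty)\times[0,\infty)$).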