Random Variables Proof $P(Y_1+Y_2+Y_3+...+Y_n=k)=\frac 1 {m^n} \sum (-1)^i \binom{n}{i} \binom{k-mi-1}{n-1}$ using binomial theorem

156 Views Asked by At

$Y_1, Y_2, Y_3,...,Y_n$ are independent and identically distributed random variables and they have uniform distribution on ${\{1,...,m\}}$.

Prove that for $n\le k\le mn$, we have

$$P(Y_1+Y_2+Y_3+...+Y_n=k)=\frac 1 {m^n} \sum_{0\le i \le(k-n)/m} (-1)^i \binom{n}{i} \binom{k-mi-1}{n-1}$$

So I think this is related to the binomial formula with $$\{1+x\}^a=\sum_{i=0}^{\infty}\binom{a}{i}x^i$$ where $|a| \lt 1$. How would I transform this formula to the given identity in the question?

3

There are 3 best solutions below

0
On

This can be proven by principle of inclusion and exclusion.

Firstly, consider number of solution to $Y_1+Y_2+Y_3+...+Y_n=k$ with no constraint on upper bound, but $Y_i\ge 1$ for all $i=1,2,3,...,n$

Substitute in the variable that $Z_i=Y_i-1$, so now $Z_i\ge 0$

Equation becomes $Z_1+Z_2+...+Z_n=k-n$, number of its non-negative solutions is given by Balls in Urns, which is $ {(k-n)+n-1}\choose{n-1}$= $ {k-1}\choose{n-1}$, this gives the first term in the sum when $i=0$.

Then as we know that each of the random variable have the upper bound of $m$,

Consider the case when (at least) $i$ of the random variables exceed $m$, say $Y_1,Y_2,..,Y_i$. There are ${n}\choose{i}$ ways to choose them.

With substitution that $Z_j=Y_j-(m+1)$ when $j=1,2,...,i$,

$Z_q=Y_q-1$ when $q=(i+1),...,n$

Then $Y_1+Y_2+Y_3+...+Y_n=k \longrightarrow Z_1+Z_2+...+Z_n=k-(m+1)\cdot i -(n-i)=k-mi-n$

Again the number of solutions to this is ${(k-mi-n)+n-1}\choose{n-1}$=${k-mi-1}\choose{n-1}$, this corresponds to the $i$ th term in the sum.

So principle of inclusion & exclusion gives the sum.

3
On

Supposing that an algebraic proof is sought here we have from first principles for the probability that it is

$$\frac{1}{m^n} [z^k] (z+\cdots+z^m)^n.$$

Using basic algebra this becomes

$$\frac{1}{m^n} [z^k] z^n (1+\cdots+z^{m-1})^n \\ = \frac{1}{m^n} [z^{k-n}] \frac{(1-z^m)^n}{(1-z)^n} \\ = \frac{1}{m^n} [z^{k-n}] \frac{1}{(1-z)^n} \sum_{q=0}^n {n\choose q} (-1)^q z^{mq} \\ = \frac{1}{m^n} \sum_{q=0}^n {n\choose q} (-1)^q [z^{k-n-qm}] \frac{1}{(1-z)^n} \\ = \frac{1}{m^n} \sum_{q=0}^{\lfloor (k-n)/m \rfloor} {n\choose q} (-1)^q [z^{k-n-qm}] \frac{1}{(1-z)^n} \\ = \frac{1}{m^n} \sum_{q=0}^{\lfloor (k-n)/m \rfloor} {n\choose q} (-1)^q {k-n-qm+n-1\choose n-1} \\ = \frac{1}{m^n} \sum_{q=0}^{\lfloor (k-n)/m \rfloor} {n\choose q} (-1)^q {k-qm-1\choose n-1}.$$

This is the claim.

2
On

This answer is a mere supplement to @MarkoRiedel's answer focusing at commenting the steps due to a request of the OP.

We encode the realisation of the iid random variable $Y_j, 1\leq j\leq n$ with values in $\{1,2,\ldots,m\}$ as \begin{align*} z^1+z^2+\cdots+z^m=z\frac{1-z^m}{1-z}\tag{1} \end{align*}

We need to extract coefficients of series. It is convenient to use the coefficient of operator $[z^k]$ to denote the coefficient of $z^k$ in a series. This way the probability that $Y_j$ is $k$ can be represented according to (1) as \begin{align*} P(Y_j=k)=\frac{1}{m}[z^k]z\frac{1-z^m}{1-z}\qquad\qquad 1\leq j\leq n \end{align*}

We obtain \begin{align*} \color{blue}{P(Y_1}&\color{blue}{+Y_2+\cdots +Y_n=k)}\\ &=\frac{1}{m^n}[z^k]z^n\left(\frac{1-z^m}{1-z}\right)^n\tag{1}\\ &=\frac{1}{m^n}[z^{k-n}]\sum_{j=0}^n\binom{n}{j}(-1)^jz^{mj}\frac{1}{(1-z)^n}\tag{2}\\ &=\frac{1}{m^n}\sum_{j=0}^{\left\lfloor\frac{k-m}{n}\right\rfloor} \binom{n}{j}(-1)^j[z^{k-n-mj}]\frac{1}{(1-z)^n}\tag{3}\\ &=\frac{1}{m^n}\sum_{j=0}^{\left\lfloor\frac{k-m}{n}\right\rfloor} \binom{n}{j}(-1)^j[z^{k-n-mj}]\sum_{l=0}^\infty\binom{-n}{l}(-z)^l\tag{4}\\ &=\frac{1}{m^n}\sum_{j=0}^{\left\lfloor\frac{k-m}{n}\right\rfloor} \binom{n}{j}(-1)^j[z^{k-n-mj}]\sum_{l=0}^\infty\binom{n+l-1}{l}z^l\tag{5}\\ &\,\,\color{blue}{=\frac{1}{m^n}\sum_{j=0}^{\left\lfloor\frac{k-m}{n}\right\rfloor} \binom{n}{j}\binom{k-mj-1}{n-1}(-1)^j}\tag{6}\\ \end{align*} and the claim follows.

Comment:

  • In (1) we use the representation via power series as $n$-fold product of (1) representing the $n$ iid realisations of the variables $Y_1,Y_2,\ldots,Y_n$.

  • In (2) we apply the rule $[z^{p-q}]A(z)=[z^p]z^qA(z)$ and expand the numerator.

  • In (3) we use the linearity of the coefficient of operator apply again the rule as in (2) and set the upper limit of the series accordingly since the coefficient of $z^{k-n-mj}$ is non-negative.

  • In (4) we use the binomial series expansion.

  • In (5) we use the binomial identity $\binom{-p}{q}=\binom{p+q-1}{p-1}(-1)^q$.

  • In (6) we select the coefficient of $z^{k-n-mj}$ accordingly.