Joint distribution of random vectors on a multinomial distribution


I am having trouble with this proof and don't know how to start; I think it is related to the joint distribution, but honestly I don't have a clue. I would appreciate any help.

Let $\textbf{N}$ be a random vector such that $\textbf{N} \sim \text{Multinomial}(p_{1},\dots,p_{m},n)$ and let $k<m$ with $\{i_{1},\dots,i_{k}\}\subset \{1,2,\dots,m\}$.

Prove that $$\left(N_{i_{1}},\dots,N_{i_{k}},n-\sum_{j=1}^{k}N_{i_{j}}\right)\sim \text{Multinomial}\left(p_{i_{1}},\dots,p_{i_{k}},1-\sum_{j=1}^{k}p_{i_{j}},n\right)$$


Best answer:

Here's an alternative method, working directly with the joint distribution:

$$\mathbb{P}\left(N_{i_1}=n_{i_1},N_{i_2}=n_{i_2},\ldots,N_{i_k}=n_{i_k},n-\sum_{l=1}^k N_{i_l}=\bar{n}\right) = \mathbb{P}\left(N_{i_1}=n_{i_1},N_{i_2}=n_{i_2},\ldots,N_{i_k}=n_{i_k}\right)$$

with $\sum_{l=1}^k n_{i_l} + \bar{n} = n$. Denote the indices that were not selected by $(j_1,\ldots,j_t)$, where $t = m-k$; their counts must then satisfy $\sum_{l=1}^t n_{j_l} = \bar{n}$.

We can now write

$$\mathbb{P}\left(N_{i_1}=n_{i_1},N_{i_2}=n_{i_2},\ldots,N_{i_k}=n_{i_k}\right) \\ = \sum_{\substack{(n_{j_1},\ldots,n_{j_t}) \\ \sum_{l=1}^t n_{j_l} = \bar{n}}}\mathbb{P}\left(N_{i_1}=n_{i_1},\ldots,N_{i_k}=n_{i_k},N_{j_1}=n_{j_1},\ldots,N_{j_t}=n_{j_t}\right) \\ = \sum_{\substack{(n_{j_1},\ldots,n_{j_t}) \\ \sum_{l=1}^t n_{j_l} = \bar{n}}}\frac{n!}{n_{i_1}!\cdots n_{i_k}!\,n_{j_1}!\cdots n_{j_t}!}\,p_{i_1}^{n_{i_1}}\cdots p_{i_k}^{n_{i_k}}\,p_{j_1}^{n_{j_1}}\cdots p_{j_t}^{n_{j_t}} \\ = \frac{n!}{n_{i_1}!\cdots n_{i_k}!}\,p_{i_1}^{n_{i_1}}\cdots p_{i_k}^{n_{i_k}}\sum_{\substack{(n_{j_1},\ldots,n_{j_t}) \\ \sum_{l=1}^t n_{j_l} = \bar{n}}}\frac{1}{n_{j_1}!\cdots n_{j_t}!}\,p_{j_1}^{n_{j_1}}\cdots p_{j_t}^{n_{j_t}} \\ = \frac{n!}{n_{i_1}!\cdots n_{i_k}!\,\bar{n}!}\,p_{i_1}^{n_{i_1}}\cdots p_{i_k}^{n_{i_k}}\sum_{\substack{(n_{j_1},\ldots,n_{j_t}) \\ \sum_{l=1}^t n_{j_l} = \bar{n}}}\frac{\bar{n}!}{n_{j_1}!\cdots n_{j_t}!}\,p_{j_1}^{n_{j_1}}\cdots p_{j_t}^{n_{j_t}} $$

Now, the last sum is a multinomial expansion and hence equals $(p_{j_1} + \ldots + p_{j_t})^{\bar{n}}$, or equivalently $(1 - \sum_{l=1}^{k}p_{i_{l}})^{\bar{n}}$. This yields the sought-after multinomial expression.
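As a sanity check, the collapse of that last sum can be verified numerically. The sketch below uses purely illustrative values (three unselected categories with made-up probabilities, $\bar{n}=4$); it enumerates all count vectors summing to $\bar{n}$ and compares the multinomial sum with $(p_{j_1}+\ldots+p_{j_t})^{\bar{n}}$:

```python
from itertools import product
from math import factorial, isclose

# Illustrative values, not from the proof: t = 3 unselected categories.
p_j = [0.1, 0.2, 0.3]   # probabilities p_{j_1}, ..., p_{j_t}
n_bar = 4               # \bar{n}, the total count they must share

# Left side: sum over all (n_{j_1}, ..., n_{j_t}) with sum = n_bar of the
# multinomial coefficient times the product of the p_j^{n_j}.
lhs = 0.0
for counts in product(range(n_bar + 1), repeat=len(p_j)):
    if sum(counts) != n_bar:
        continue
    coef = factorial(n_bar)
    term = 1.0
    for c, p in zip(counts, p_j):
        coef //= factorial(c)
        term *= p ** c
    lhs += coef * term

# Right side: (p_{j_1} + ... + p_{j_t})^{\bar{n}}
rhs = sum(p_j) ** n_bar
assert isclose(lhs, rhs)
```

Here both sides come out to $0.6^4$; any probabilities and any $\bar{n}$ give the same agreement.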

But as you can see, this method is much more tedious than the MGF approach in the other answer.

Another answer:

Here's a method using the moment generating function (MGF). For your starting multinomial distribution, the MGF is

$$\mathbb{E}\left[\exp\left(\sum_{i = 1}^m t_i N_i\right)\right] = \left(\sum_{i = 1}^m p_i e^{t_i}\right)^n$$

Now modify this formula as follows: isolate the terms with indices $(i_{1},\dots,i_{k})$ and set the remaining $t_i = t$. Since $\sum_{i=1}^m N_i = n$, the sum over the remaining indices is $n-\sum_{l=1}^k N_{i_l}$, and you obtain

$$\mathbb{E}\left[\exp\left(\sum_{l = 1}^k t_{i_l} N_{i_l} + t\left(n-\sum_{l = 1}^k N_{i_l}\right)\right)\right] = \left(\sum_{l = 1}^k p_{i_l} e^{t_{i_l}} + \left(1-\sum_{l = 1}^k p_{i_l}\right) e^t \right)^n \; .$$
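If you want to see this identity hold concretely, here is a small brute-force sketch for a hypothetical case ($m=3$, $n=2$, $k=1$; all probabilities and evaluation points are arbitrary illustrative values): it computes the left-hand expectation by summing over every outcome and compares it with the right-hand side:

```python
from itertools import product
from math import exp, factorial, isclose

# Illustrative tiny case: m = 3 categories, n = 2 trials, keep k = 1 index.
p = [0.2, 0.3, 0.5]
n = 2
t1, t = 0.4, -0.7   # t_{i_1} and the common t for the lumped remainder

# Left side: E[exp(t1*N1 + t*(n - N1))], by brute force over all outcomes
# (n1, n2, n3) with n1 + n2 + n3 = n, weighting by the multinomial pmf.
lhs = 0.0
for counts in product(range(n + 1), repeat=3):
    if sum(counts) != n:
        continue
    coef = factorial(n)
    pmf = 1.0
    for c, pc in zip(counts, p):
        coef //= factorial(c)
        pmf *= pc ** c
    lhs += coef * pmf * exp(t1 * counts[0] + t * (n - counts[0]))

# Right side of the displayed identity, with 1 - p1 = p2 + p3:
rhs = (p[0] * exp(t1) + (1 - p[0]) * exp(t)) ** n
assert isclose(lhs, rhs)
```

The check works because $e^{t(n-N_1)} = e^{tN_2}e^{tN_3}$, which is exactly the "set the other $t_i=t$" substitution in the MGF.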

Now it's just a matter of reading and interpreting what we've written. The left-hand side is the MGF of the vector

$$\left(N_{i_{1}},\dots,N_{i_{k}},n-\sum_{l=1}^{k}N_{i_{l}}\right)$$

(Note that the formula in your question has a small typo: it writes $N_{i_{i}}$ where the first component should be $N_{i_{1}}$.)

The right hand side is the MGF of a multinomial distribution with parameters

$$\left(p_{i_{1}},\dots,p_{i_{k}},1-\sum_{l=1}^{k}p_{i_{l}},n\right)$$
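Since the MGF determines the distribution, this finishes the proof. As a final sanity check, one can also compare the exact pmfs directly. The sketch below (a hypothetical instance with $m=4$, $n=3$, and kept indices $(i_1,i_2)=(1,3)$; all values are illustrative) computes the marginal pmf of the lumped vector by summing the full joint pmf, and checks it against the claimed multinomial pmf:

```python
from itertools import product
from math import factorial, isclose

# Illustrative instance: m = 4 categories, n = 3 trials,
# kept indices (i1, i2) = (1, 3), i.e. positions 0 and 2 (0-based).
p = [0.1, 0.2, 0.3, 0.4]
n = 3

def multinomial_pmf(counts, probs, n):
    """pmf of Multinomial(probs, n) at the count vector `counts`."""
    coef = factorial(n)
    val = 1.0
    for c, q in zip(counts, probs):
        coef //= factorial(c)
        val *= q ** c
    return coef * val

# Marginal pmf of (N_{i1}, N_{i2}, n - N_{i1} - N_{i2}): sum the full joint
# pmf over every outcome that maps to the same lumped vector.
lumped = {}
for counts in product(range(n + 1), repeat=len(p)):
    if sum(counts) != n:
        continue
    key = (counts[0], counts[2], n - counts[0] - counts[2])
    lumped[key] = lumped.get(key, 0.0) + multinomial_pmf(counts, p, n)

# Claimed law: Multinomial(p_{i1}, p_{i2}, 1 - p_{i1} - p_{i2}, n).
q = [p[0], p[2], 1 - p[0] - p[2]]
for key, prob in lumped.items():
    assert isclose(prob, multinomial_pmf(key, q, n))
```

Every lumped outcome matches the claimed three-category multinomial pmf, which is exactly the statement of the exercise.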