Proof verification: which concepts in stochastics did I misuse in my proof about the convolution of the multinomial distribution?

20 Views Asked by Bumbble Comm At 09 Apr 2026 - 6:07

I attempted a proof that the multinomial distribution $\mathcal{M}_{n,\rho}$, defined as $$ \mathcal{M}_{n,\rho}:\left(\Omega=\{\left(a_1,\ldots,a_k\right)\in\mathbb{N}_0^k\mid a_1+\ldots+a_k=n\},P\left(\Omega\right)\right)\rightarrow\left[0,1\right],\qquad\mathcal{M}_{n,\rho}\left(\{\left(a_1,\ldots,a_k\right)\}\right)=\binom{n}{a_1,\ldots,a_k}\prod_{i=1}^{k}{\rho_i^{a_i}} $$ (where $\rho := (\rho_1,\ldots,\rho_k)$)

can be obtained by convolving $n$ copies of the multinomial distribution $\mathcal{M}_{1,\rho}$.

However, although the proof reaches the conclusion, I'm fairly certain that parts of it rely on misconceptions I have about random variables. I'm looking for the mistakes I've made and for why it is wrong to use those concepts that way.

For $i\in\{1,\ldots,n\}$ define the independent random variables $$Y_i:\left(\Omega=\mathbb{N}_0^k,\mathcal{F}=P\left(\Omega\right),\mathbb{P}=\mathcal{M}_{1,\rho}\right)\rightarrow\mathbb{N},\quad\quad Y_i\left(\begin{matrix}a_1\\\vdots\\a_k\\\end{matrix}\right):=\left(\begin{matrix}a_1\\\vdots\\a_k\\\end{matrix}\right) $$

We want to show the following: $$\mathbb{P}\left(\sum_{i=1}^{n}Y_i=\left(\begin{matrix}a_1\\\vdots\\a_k\\\end{matrix}\right)\right)=\mathcal{M}_{n,\rho}\left(\{\left(\begin{matrix}a_1\\\vdots\\a_k\\\end{matrix}\right)\}\right) $$

The proof: $$ \mathbb{P}\left(\sum_{i=1}^{n}Y_i=\left(\begin{matrix}a_1\\\vdots\\a_k\\\end{matrix}\right)\right)=\mathbb{P}\left(\sum_{i=1}^{n}\left(\begin{matrix}a_{1,i}\\\vdots\\a_{k,i}\\\end{matrix}\right)=\left(\begin{matrix}a_1\\\vdots\\a_k\\\end{matrix}\right)\right) $$

In the sum $\sum_{i=1}^{n}\left(\begin{matrix}a_{1,i}\\\vdots\\a_{k,i}\\\end{matrix}\right)$ every vector is a unit vector, since $\mathcal{M}_{1,\rho}$ only gives positive probability to vectors whose entries sum to $1$. We now look at which combinations of these vectors produce the right-hand side.

Let $f:\left[n\right]\rightarrow\left[k\right]$ be the function for which the $i$-th summand of $\sum_{i=1}^{n}\left(\begin{matrix}a_{1,i}\\\vdots\\a_{k,i}\\\end{matrix}\right)$ is the $f\left(i\right)$-th unit vector $e^{\left(f\left(i\right)\right)}$. Then the equality

$$\sum_{i=1}^{n}\left(\begin{matrix}a_{1,i}\\\vdots\\a_{k,i}\\\end{matrix}\right)=\left(\begin{matrix}a_1\\\vdots\\a_k\\\end{matrix}\right) $$ is true iff $\forall i\in\left[k\right]:\left|f^{-1}\left(i\right)\right|=a_i$.

Furthermore, if two such functions $f$ are distinct, the tuples of vectors

$$ \left(\left(\begin{matrix}a_{1,1}\\\vdots\\a_{k,1}\\\end{matrix}\right),\ldots,\left(\begin{matrix}a_{1,n}\\\vdots\\a_{k,n}\\\end{matrix}\right)\right) $$

they create are distinct as well.
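To make the correspondence concrete, here is a small worked case. The values $n=3$, $k=2$ and $(a_1,a_2)=(2,1)$ are chosen purely for illustration and are not part of the proof; $e^{(j)}$ denotes the $j$-th unit vector. The functions $f:[3]\to[2]$ with $|f^{-1}(1)|=2$ and $|f^{-1}(2)|=1$ are exactly $$ \left(f(1),f(2),f(3)\right)\in\{(1,1,2),\ (1,2,1),\ (2,1,1)\}, $$ and they correspond to the three distinct tuples $$ \left(e^{(1)},e^{(1)},e^{(2)}\right),\quad \left(e^{(1)},e^{(2)},e^{(1)}\right),\quad \left(e^{(2)},e^{(1)},e^{(1)}\right), $$ each of which sums to $\left(\begin{matrix}2\\1\\\end{matrix}\right)$. Their number agrees with $\binom{3}{2,1}=\frac{3!}{2!\,1!}=3$.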

We therefore obtain: $$\mathbb{P}\left(\sum_{i=1}^{n}\left(\begin{matrix}a_{1,i}\\\vdots\\a_{k,i}\\\end{matrix}\right)= \left(\begin{matrix}a_1\\\vdots\\a_k\\\end{matrix}\right)\right) $$

$$ \begin{aligned} &=\mathbb{P}\left(\bigcup_{\substack{f:\left[n\right]\rightarrow\left[k\right] \\ \forall i\in\left[k\right]:\left|f^{-1}\left(i\right)\right|=a_i}} \{\left(\begin{matrix}a_{1,1}\\\vdots\\a_{k,1}\\\end{matrix}\right)=e^{\left(f\left(1\right)\right)},\ldots,\left(\begin{matrix}a_{1,n}\\\vdots\\a_{k,n}\\\end{matrix}\right)=e^{\left(f\left(n\right)\right)}\}\right) \\ &= \sum_{\substack{f:\left[n\right]\rightarrow\left[k\right] \\ \forall i\in\left[k\right]:\left|f^{-1}\left(i\right)\right|=a_i}} \mathbb{P}\left(\{\left(\begin{matrix}a_{1,1}\\\vdots\\a_{k,1}\\\end{matrix}\right)=e^{\left(f\left(1\right)\right)},\ldots,\left(\begin{matrix}a_{1,n}\\\vdots\\a_{k,n}\\\end{matrix}\right)=e^{\left(f\left(n\right)\right)}\}\right) \\ &= \sum_{\substack{f:\left[n\right]\rightarrow\left[k\right] \\ \forall i\in\left[k\right]:\left|f^{-1}\left(i\right)\right|=a_i}} \mathbb{P} \left(\{Y_1=e^{\left(f\left(1\right)\right)},\ldots,Y_n=e^{\left(f\left(n\right)\right)}\}\right) \end{aligned} $$

The second equality holds because the union is disjoint: distinct functions $f$ give distinct tuples. We now use that the $Y_i$ are independent: $$ =\sum_{\substack{f:\left[n\right]\rightarrow\left[k\right] \\ \forall i\in\left[k\right]:\left|f^{-1}\left(i\right)\right|=a_i}} \prod_{i=1}^{n}\mathbb{P}\left(Y_i=e^{\left(f\left(i\right)\right)}\right) $$

Since all $Y_i$ have the same distribution $\mathcal{M}_{1,\rho}$, we replace every $Y_i$ by $Y_1$. By the condition $\forall i\in\left[k\right]:\left|f^{-1}\left(i\right)\right|=a_i$, for each $i\in\left[k\right]$ exactly $a_i$ of the $n$ factors of the product equal $\mathbb{P}\left(Y_1=e^{\left(i\right)}\right)$:

$$ \begin{aligned} &= \sum_{\substack{f:\left[n\right]\rightarrow\left[k\right] \\ \forall i\in\left[k\right]:\left|f^{-1}\left(i\right)\right|=a_i}} \prod_{i=1}^{n}\mathbb{P}\left(Y_1=e^{\left(f\left(i\right)\right)}\right) \\ &= \sum_{\substack{f:\left[n\right]\rightarrow\left[k\right] \\ \forall i\in\left[k\right]:\left|f^{-1}\left(i\right)\right|=a_i}} \prod_{i=1}^{k}{\mathbb{P}\left(Y_1=e^{\left(i\right)}\right)^{a_i}} \\ &=\left(\prod_{i=1}^{k}{\mathbb{P}\left(Y_1=e^{\left(i\right)}\right)^{a_i}}\right) \sum_{\substack{f:\left[n\right]\rightarrow\left[k\right] \\ \forall i\in\left[k\right]:\left|f^{-1}\left(i\right)\right|=a_i}} 1 \end{aligned} $$

It remains to count the summands of the sum. Their number equals the number of words of length $n$ with $a_1$ ones, …, and $a_k$ $k$'s, which is the multinomial coefficient $\binom{n}{a_1,\ldots,a_k}$:

$$ =\left(\prod_{i=1}^{k}{\mathbb{P}\left(Y_1=e^{\left(i\right)}\right)^{a_i}}\right)\cdot\binom{n}{a_1,\ldots,a_k}=\left(\prod_{i=1}^{k}{\mathcal{M}_{1,\rho}\left(\{e^{\left(i\right)}\}\right)^{a_i}}\right)\cdot\binom{n}{a_1,\ldots,a_k}=\left(\prod_{i=1}^{k}{\rho_i^{a_i}}\right)\cdot\binom{n}{a_1,\ldots,a_k}=\mathcal{M}_{n,\rho}\left(\{\left(a_1,\ldots,a_k\right)\}\right) $$

As we wanted to show.
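As a numerical sanity check of the claim (not a substitute for the proof), the following Python sketch enumerates all functions $f:[n]\to[k]$, weights each by the product of the single-draw probabilities as the independence step does, and compares the resulting distribution of the sum with the closed-form multinomial probabilities. The function names and the example values of `rho` and `n` are assumptions made for illustration only.

```python
from collections import defaultdict
from itertools import product
from math import factorial, isclose, prod


def multinomial_pmf(counts, rho):
    """Closed-form M_{n,rho}({counts}), where n = sum(counts)."""
    coeff = factorial(sum(counts))
    for a in counts:
        coeff //= factorial(a)  # each intermediate quotient is an integer
    return coeff * prod(p ** a for p, a in zip(rho, counts))


def convolution_of_single_draws(n, rho):
    """Distribution of Y_1 + ... + Y_n for independent Y_i ~ M_{1,rho},
    computed by brute force over all f: [n] -> [k] (0-based here)."""
    k = len(rho)
    dist = defaultdict(float)
    for f in product(range(k), repeat=n):
        # f[i] = j means: the i-th summand is the j-th unit vector.
        counts = tuple(f.count(j) for j in range(k))  # fibre sizes |f^{-1}(j)|
        dist[counts] += prod(rho[j] for j in f)  # independence: product of marginals
    return dict(dist)


# Hypothetical example parameters, chosen only for illustration.
rho = (0.2, 0.3, 0.5)
n = 4
conv = convolution_of_single_draws(n, rho)

# Every outcome of the sum should match the multinomial formula.
assert all(isclose(p, multinomial_pmf(a, rho)) for a, p in conv.items())
```

The enumeration takes $k^n$ steps, so this is only practical for small $n$ and $k$.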
