I tried my hand at a proof that the multinomial distribution $\mathcal{M}_{n,\rho}$, defined as $$ \mathcal{M}_{n,\rho}:\left(\Omega=\{\left(a_1,\ldots,a_k\right)\in\mathbb{N}_0^k\mid a_1+\ldots+a_k=n\},P\left(\Omega\right)\right)\rightarrow\left[0,1\right],\qquad\mathcal{M}_{n,\rho}\left(\{\left(a_1,\ldots,a_k\right)\}\right)=\binom{n}{a_1,\ldots,a_k}\prod_{i=1}^{k}{\rho_i^{a_i}} $$ (where $\rho:= (\rho_1,\ldots,\rho_k)$),
can be obtained by convolving $n$ instances of the multinomial distribution $\mathcal{M}_{1,\rho}$.
However, while the proof reaches the conclusion, I'm quite certain that there are a few aspects in it that make use of misconceptions I have about random variables, so I'm looking for the mistakes I've made, and why it is wrong to use those concepts that way.
For $i\in\{1,\ldots,n\}$ define the independent random variables $$Y_i:\left(\Omega=\mathbb{N}_0^k,\mathcal{F}=P\left(\Omega\right),\mathbb{P}=\mathcal{M}_{1,\rho}\right)\rightarrow\mathbb{N}_0^k,\quad\quad Y_i\left(\begin{matrix}a_1\\\vdots\\a_k\\\end{matrix}\right):=\left(\begin{matrix}a_1\\\vdots\\a_k\\\end{matrix}\right) $$

We want to show the following: $$\mathbb{P}\left(\sum_{i=1}^{n}Y_i=\left(\begin{matrix}a_1\\\vdots\\a_k\\\end{matrix}\right)\right)=\mathcal{M}_{n,\rho}\left(\{\left(\begin{matrix}a_1\\\vdots\\a_k\\\end{matrix}\right)\}\right) $$

The proof: $$ \mathbb{P}\left(\sum_{i=1}^{n}Y_i=\left(\begin{matrix}a_1\\\vdots\\a_k\\\end{matrix}\right)\right)=\mathbb{P}\left(\sum_{i=1}^{n}\left(\begin{matrix}a_{1,i}\\\vdots\\a_{k,i}\\\end{matrix}\right)=\left(\begin{matrix}a_1\\\vdots\\a_k\\\end{matrix}\right)\right) $$
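In the sum $\sum_{i=1}^{n}\left(\begin{matrix}a_{1,i}\\\vdots\\a_{k,i}\\\end{matrix}\right)$ every vector is one of the unit vectors $e^{\left(1\right)},\ldots,e^{\left(k\right)}$, because $\mathcal{M}_{1,\rho}$ only puts mass on vectors whose entries sum to $1$. We now look at which combinations of these vectors can produce the right-hand side.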
Let $f:\left[n\right]\rightarrow\left[k\right]$, so that $f\left(i\right)$ tells us that the $i$-th summand of $\sum_{i=1}^{n}\left(\begin{matrix}a_{1,i}\\\vdots\\a_{k,i}\\\end{matrix}\right)$ is the $f\left(i\right)$-th unit vector. Then the equality
$$\sum_{i=1}^{n}\left(\begin{matrix}a_{1,i}\\\vdots\\a_{k,i}\\\end{matrix}\right)=\left(\begin{matrix}a_1\\\vdots\\a_k\\\end{matrix}\right) $$ is true iff $\forall i\in\left[k\right]:\left|f^{-1}\left(i\right)\right|=a_i$.
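To make this criterion concrete, here is a small worked instance (the values $n=3$, $k=2$ and the target vector $\left(2,1\right)$ are chosen by me purely for illustration). The maps $f:\left[3\right]\rightarrow\left[2\right]$ with $\left|f^{-1}\left(1\right)\right|=2$ and $\left|f^{-1}\left(2\right)\right|=1$ are exactly
$$ \left(f\left(1\right),f\left(2\right),f\left(3\right)\right)\in\{\left(1,1,2\right),\left(1,2,1\right),\left(2,1,1\right)\}, $$
and for each of them
$$ e^{\left(f\left(1\right)\right)}+e^{\left(f\left(2\right)\right)}+e^{\left(f\left(3\right)\right)}=\left(\begin{matrix}2\\1\\\end{matrix}\right), $$
so in this case there are $3=\binom{3}{2,1}$ such maps.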
Furthermore, if two such functions $f$ are distinct, the tuples of vectors
$$ \left(\left(\begin{matrix}a_{1,1}\\\vdots\\a_{k,1}\\\end{matrix}\right),\ldots,\left(\begin{matrix}a_{1,n}\\\vdots\\a_{k,n}\\\end{matrix}\right)\right) $$
they produce are distinct as well.
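The correspondence between the maps $f$ and the tuples of unit vectors can be checked by brute force for a small case. The following is only a sketch under my own assumptions: the helper names and the example target vector `(2, 1, 1)` are hypothetical and not part of the proof. It enumerates every $f:\left[n\right]\rightarrow\left[k\right]$, confirms that the unit vectors sum to the target exactly when the fibre sizes match, and compares the number of matching maps with the multinomial coefficient.

```python
from itertools import product
from math import factorial


def multinomial_coefficient(counts):
    """n! / (a_1! * ... * a_k!) for counts = (a_1, ..., a_k), n = sum(counts)."""
    result = factorial(sum(counts))
    for a in counts:
        result //= factorial(a)
    return result


def assignments_with_fibres(counts):
    """All f: [n] -> [k], stored as tuples (f(1), ..., f(n)), with |f^{-1}(i)| = a_i."""
    n, k = sum(counts), len(counts)
    return [
        f
        for f in product(range(1, k + 1), repeat=n)
        if all(f.count(i) == counts[i - 1] for i in range(1, k + 1))
    ]


def sum_of_unit_vectors(f, k):
    """Sum of the unit vectors e^(f(1)), ..., e^(f(n)) in N_0^k."""
    total = [0] * k
    for value in f:
        total[value - 1] += 1
    return tuple(total)


# Example target vector (a_1, a_2, a_3), chosen for illustration: n = 4, k = 3.
counts = (2, 1, 1)
n, k = sum(counts), len(counts)
matching = assignments_with_fibres(counts)

# The unit vectors sum to the target iff the fibre sizes of f match the target.
for f in product(range(1, k + 1), repeat=n):
    assert (sum_of_unit_vectors(f, k) == counts) == (f in matching)

# Distinct maps give distinct tuples, so nothing is counted twice.
assert len(set(matching)) == len(matching)

print(len(matching), multinomial_coefficient(counts))  # 12 12
```

The enumeration is exponential in $n$, so this is only meant as a check for tiny examples.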
We therefore obtain:
$$
\begin{aligned}
&\mathbb{P}\left(\sum_{i=1}^{n}\left(\begin{matrix}a_{1,i}\\\vdots\\a_{k,i}\\\end{matrix}\right)= \left(\begin{matrix}a_1\\\vdots\\a_k\\\end{matrix}\right)\right) \\
&=\mathbb{P}\left(\bigcup_{\substack{f:\left[n\right]\rightarrow\left[k\right] \\ \forall i\in\left[k\right]:\left|f^{-1}\left(i\right)\right|=a_i}} \{\left(\begin{matrix}a_{1,1}\\\vdots\\a_{k,1}\\\end{matrix}\right)=e^{\left(f\left(1\right)\right)},\ldots,\left(\begin{matrix}a_{1,n}\\\vdots\\a_{k,n}\\\end{matrix}\right)=e^{\left(f\left(n\right)\right)}\}\right) \\
&= \sum_{\substack{f:\left[n\right]\rightarrow\left[k\right] \\ \forall i\in\left[k\right]:\left|f^{-1}\left(i\right)\right|=a_i}} \mathbb{P}\left(\{\left(\begin{matrix}a_{1,1}\\\vdots\\a_{k,1}\\\end{matrix}\right)=e^{\left(f\left(1\right)\right)},\ldots,\left(\begin{matrix}a_{1,n}\\\vdots\\a_{k,n}\\\end{matrix}\right)=e^{\left(f\left(n\right)\right)}\}\right) \\
&= \sum_{\substack{f:\left[n\right]\rightarrow\left[k\right] \\ \forall i\in\left[k\right]:\left|f^{-1}\left(i\right)\right|=a_i}} \mathbb{P} \left(\{Y_1=e^{\left(f\left(1\right)\right)},\ldots,Y_n=e^{\left(f\left(n\right)\right)}\}\right)
\end{aligned}
$$
We now use that the $Y_i$ are independent: $$ =\sum_{\substack{f:\left[n\right]\rightarrow\left[k\right] \\ \forall i\in\left[k\right]:\left|f^{-1}\left(i\right)\right|=a_i}} \prod_{i=1}^{n}\mathbb{P}\left(Y_i=e^{\left(f\left(i\right)\right)}\right) $$ We replace all $Y_i$ by $Y_1$ (they are identically distributed) and use that, because of the condition $\forall i\in\left[k\right]:\left|f^{-1}\left(i\right)\right|=a_i$, for each $i\in\left[k\right]$ exactly $a_i$ factors of the product are equal to $\mathbb{P}\left(Y_1=e^{\left(i\right)}\right)$:
$$
\begin{aligned}
&= \sum_{\substack{f:\left[n\right]\rightarrow\left[k\right] \\ \forall i\in\left[k\right]:\left|f^{-1}\left(i\right)\right|=a_i}} \prod_{i=1}^{n}\mathbb{P}\left(Y_1=e^{\left(f\left(i\right)\right)}\right) \\
&= \sum_{\substack{f:\left[n\right]\rightarrow\left[k\right] \\ \forall i\in\left[k\right]:\left|f^{-1}\left(i\right)\right|=a_i}} \prod_{i=1}^{k}{\mathbb{P}\left(Y_1=e^{\left(i\right)}\right)^{a_i}} \\
&=\left(\prod_{i=1}^{k}{\mathbb{P}\left(Y_1=e^{\left(i\right)}\right)^{a_i}}\right) \sum_{\substack{f:\left[n\right]\rightarrow\left[k\right] \\ \forall i\in\left[k\right]:\left|f^{-1}\left(i\right)\right|=a_i}} 1
\end{aligned}
$$
It remains to count the summands of the sum. This number equals the number of words of length $n$ with $a_1$ ones, $\ldots$, and $a_k$ $k$’s, which is $\binom{n}{a_1,\ldots,a_k}$:
$$ =\left(\prod_{i=1}^{k}{\mathbb{P}\left(Y_1=e^{\left(i\right)}\right)^{a_i}}\right)\cdot\binom{n}{a_1,\ldots,a_k}=\left(\prod_{i=1}^{k}{\mathcal{M}_{1,\rho}\left(\{e^{\left(i\right)}\}\right)^{a_i}}\right)\cdot\binom{n}{a_1,\ldots,a_k}=\left(\prod_{i=1}^{k}{\rho_i^{a_i}}\right)\cdot\binom{n}{a_1,\ldots,a_k}=\mathcal{M}_{n,\rho}\left(\{\left(a_1,\ldots,a_k\right)\}\right) $$
As we wanted to show.
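
As a numerical sanity check of the statement itself (not of the individual proof steps), the $n$-fold convolution of $\mathcal{M}_{1,\rho}$ can be computed directly and compared with the closed formula for $\mathcal{M}_{n,\rho}$. This is a minimal sketch under my own assumptions: the function names are hypothetical, and the parameter `rho = (0.2, 0.3, 0.5)` and `n = 4` are arbitrary example values.

```python
from collections import defaultdict
from itertools import product
from math import factorial, isclose


def single_draw_pmf(rho):
    """pmf of M_{1,rho}: mass rho_i on the i-th unit vector of N_0^k."""
    k = len(rho)
    return {tuple(1 if j == i else 0 for j in range(k)): rho[i] for i in range(k)}


def convolve(pmf_a, pmf_b):
    """pmf of X + Y for independent X ~ pmf_a and Y ~ pmf_b on N_0^k."""
    out = defaultdict(float)
    for (x, px), (y, py) in product(pmf_a.items(), pmf_b.items()):
        out[tuple(s + t for s, t in zip(x, y))] += px * py
    return dict(out)


def multinomial_pmf(counts, rho):
    """Closed formula: multinomial coefficient times prod_i rho_i ** a_i."""
    coeff = factorial(sum(counts))
    for a in counts:
        coeff //= factorial(a)
    prob = float(coeff)
    for a, p in zip(counts, rho):
        prob *= p ** a
    return prob


rho = (0.2, 0.3, 0.5)  # example parameter, chosen arbitrarily
n = 4                  # example number of convolved copies

base = single_draw_pmf(rho)
pmf = base
for _ in range(n - 1):
    pmf = convolve(pmf, base)  # after the loop: n-fold convolution of M_{1,rho}

# Every point of the support should carry the multinomial probability.
assert all(isclose(p, multinomial_pmf(a, rho)) for a, p in pmf.items())
assert isclose(sum(pmf.values()), 1.0)
```

The check only covers one choice of $n$ and $\rho$, so it supports the claim without proving it.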