The following problem came into my mind when I am studying the Polya Urn Model.
At the beginning, from a bin containing $c_1$ balls labeled $1$, $c_2$ balls labeled $2$, … , $c_m$ balls labeled $m$, a ball is drawn at random ( $c_i>0, \, 1\leqslant i\leqslant m$, $m>2$ ). Its label is noted and the ball is put back to the bin along with addition $r>0$ balls with the same label. Then always starting with the newly constituted bin, the experiment is continued $n\geqslant 0$ times. Let $B_1$, $B_2$, … , $B_m$ denote the numbers of balls of different labels after $n$ draws. What is the joint probability distribution $$f_n(B_1=c_1+r\cdot k_1, B_2=c_2+r\cdot k_2,..., B_{m}=c_m+r\cdot(n-\sum\nolimits_{i=1}^{m-1}k_i)),$$
where $k_i$ is a nonnegative integer?
I have found that $$f_0(B_1=c_1, B_2=c_2,..., B_{m}=c_m)=1,$$ and $$f_1(B_1=c_1, B_2=c_2…, B_i=c_i+r,..., B_{m}=c_m)=\frac{c_i}{\sum\nolimits_{j=1}^{m}c_j}$$ for $1\leqslant i\leqslant m$.
But this recursive formula gets very complicate when $n>2$. Is there any easier way to give this distribution?
I feel like this problem has been solved, but I’m having trouble finding sources. Any help or references would be greatly appreciated!
Let $K_i$ be the number of times balls with label $i$ that are drawn, and let $K = (K_1, \ldots, K_m)$.
Let $L_j$ be the label of the ball drawn at the $j$-th step, and let $L = (L_1, \ldots, L_n)$.
We are interested in finding $P(K = k)$ for a given sequence $k = (k_1, \ldots, k_m)$.
If we know $L = \ell$, then $K$ is completely determined. ($K = k$ with $k_i$ being the number of occurrences of $i$ in $\ell$.)
Given $K = k$, a sequence $L$ that may occur must have exactly $k_i$ occurrences of $i$. Let $\mathcal L_k$ be the collection of all sequences satisfying this. All members of $\mathcal L_k$ are permutations of each other.
I claim that for any $\ell \in \mathcal L_k$, then $P(L = \ell)$ is a constant.
Let's prove the simplest case when $\ell'$ is obtained from $\ell$ by permuting an adjacent pair of numbers. Suppose $p \in \{1, 2, \ldots, n - 1\}$ and $\ell'$ is obtained from $\ell$ by permuting the $p$-th and the $(p+1)$-th elements.
If $\ell_p = \ell_{p+1}$, we don't need to do anything because $\ell' = \ell$.
Otherwise, we compute $P(L = \ell)$ by \begin{align} P(L = \ell) = & \ P\left(L_{[p+2, n]} = \ell_{[p+2,n]} \mid L_{[1,p+1]} = \ell_{[1,p+1]}\right) \cdot {} \\ & P\left(L_{[p,p+1]} = \ell_{[p,p+1]} \mid L_{[1,p-1]} = \ell_{[1,p-1]}\right) \cdot {} \\ & P\left(L_{[1,p-1]} = \ell_{[1,p-1]}\right), \end{align} where $L_{[a, b]}$ refers to the sub-vector $(L_a, L_{a+1}, \ldots, L_b)$ of $L$, and similarly for $\ell$.
For $P(L = \ell')$, we get the same formula with $\ell$ replaced by $\ell'$: \begin{align} P(L = \ell') = & \ P\left(L_{[p+2, n]} = \ell'_{[p+2,n]} \mid L_{[1,p+1]} = \ell'_{[1,p+1]}\right) \cdot {} \\ & P\left(L_{[p,p+1]} = \ell'_{[p,p+1]} \mid L_{[1,p-1]} = \ell'_{[1,p-1]}\right) \cdot {} \\ & P\left(L_{[1,p-1]} = \ell'_{[1,p-1]}\right). \end{align} We hope that the two probabilities are equal. Since $\ell$ and $\ell'$ differ only at $p$ and $p+1$, conditioning on $L_{[1,p-1]}$ and $L_{[1,p+1]}$ will have the same effect whether we choose $\ell$ or $\ell'$. It comes down to proving
$$P\left(L_{[p,p+1]} = \ell_{[p,p+1]} \mid L_{[1,p-1]} = \ell_{[1,p-1]}\right) = P\left(L_{[p,p+1]} = \ell'_{[p,p+1]} \mid L_{[1,p-1]} = \ell_{[1,p-1]}\right).$$
Another simplification that can be done here is that these probabilities can be made unconditional if we just re-define the problem differently. (Namely, start after $p - 1$ drawings have been done.) We may now assume that $p = 1$. We can also assume $\ell_1 = 1$ and $\ell_2 = 2$ without loss of generality. The equality in question becomes
$$ P\left(L_1 = 1, L_2 = 2\right) \stackrel{?}{=} P\left(L_1 = 2, L_2 = 1\right). $$
This can be verified directly: (let $C = \sum_{i=1}^m c_i$) \begin{align} P\left(L_1 = 1, L_2 = 2\right) & = P\left(L_1 = 1\right) \cdot P\left(L_2 = 2 \mid L_1 = 1\right) \\ & = \frac{c_1}{C} \cdot \frac{c_2}{C + r} \\ & = \frac{c_2}{C} \cdot \frac{c_1}{C + r} \\ & = P\left(L_1 = 2\right) \cdot P\left(L_2 = 1 \mid L_1 = 2\right) \\ & = P\left(L_1 = 2, L_2 = 1\right). \end{align}
Since every permutation can be created from permuting two adjacent elements multiple times (think bubblesort), we have proved that $P(L = \ell) = P(L = \ell')$ whenever $\ell'$ is a permuted sequence of $\ell$. The questions that remain are 1) determining the number of elements of $\mathcal L_k$, and 2) determining the probability of $P(L = \ell)$ for each $\ell \in \mathcal L_k$.
The first question is a basic counting question. The cardinality of $\mathcal L_k$ is $$ \frac{n!}{k_1! k_2! \ldots k_m!} $$
The second question is also quite easy to answer. We can pick a specific ordering of $\ell$ that is easy for calculating $P(L = \ell)$, and we know the result will be the same for all other permutations. For example, we can pick the event where $1$ is drawn $k_1$ times, then $2$ is drawn $k_2$ times, and so on. (This corresponds to $\ell_1 = \ldots = \ell_{k_1} = 1$, $\ell_{k_1 + 1} = \ell_{k_1 + 2} = \ldots = \ell_{k_1 + k_2} = 2$, and so on.) The probability is \begin{align} P(L = \ell) & = \frac{\prod_{i=1}^m \prod_{j=0}^{k_i - 1} (c_i + jr)}{\prod_{j=0}^{n-1} (C + jr)}. \end{align} (I skipped some algebraic detail here as it's very tedious and not at all illuminating. It turns out that this expression depends only on $k$ and not on $\ell$!)
Finally, $$ P(K = k) = \sum_{\ell \in \mathcal L_k} P(L = \ell) = \frac{n!}{\prod_{i=1}^m k_i!} \frac{\prod_{i=1}^m \prod_{j=0}^{k_i - 1} (c_i + jr)}{\prod_{j=0}^{n-1} (C + jr)}. $$