I have a problem which involves the standard coupon collector's problem to find a probability density from the generating convolution. I start by defining the problem and a few basic statistics. Let the number of unique coupons be $N$, numbered from $1$ to $N$. Our goal is to collect each unique coupon at least once. I want to find the probability distribution of $T=\sum\limits_{i=1}^R X_i$ which is a random sum.
Define $R$ to be the number of draws until we collect at least 1 of each unique coupon. This is itself a sum of N independent Geometric random variables $Z_i$, with $R=\sum\limits_{i=1}^N Z_i$.
By inclusion-exclusion principles and a little algebra, it is trivial to show $R$ has a probability distribution: $$\mathbb P(R=r)= \frac{N!}{N^r} \times {r-1 \brace N-1}. \quad R\ge N$$
Since $R$ is a sum of independent geometric random variables, it can be shown the probability generating function $g_R(t)$ is $$\mathbb E(t^R) = \prod_{i=1}^N \frac{N - i + 1}{N - (i - 1)t}. \quad |t| \lt \frac{N}{N - 1}$$
Define $X_i$ to be the value of the drawn coupon on each of the independent draws. They are i.i.d. with probability distribution $\mathbb P(X_i=x)= \dfrac 1N$, i.e. $X_i$ has a discrete uniform distribution with $ 1\le X_i \le N$. Therefore the probability generating function $\space g_{X_{i}}(t)$ is $$\mathbb E(t^{X_i}) = \frac {t \left({1 - t^N}\right)} {N \left({1 - t}\right)}. \quad |t| \lt 1$$
Therefore the random sum $T=\sum\limits_{i=1}^R X_i$ should have a convoluted probability generating function $$g_R(g_{X_{i}}(t))= \prod_{i=1}^N \frac{N - i + 1}{N - (i - 1) \times \dfrac {t \left({1 - t^N}\right)} {N \left({1 - t}\right)}}, \quad |t| \lt 1$$ which simplifies to $$\prod_{i=1}^N \frac{N(1-t)(N - i + 1)}{N^2(1-t) - (i - 1) {t \left({1 - t^N}\right)}}.$$
I am as new as you can get to this site so sorry if I buggered something up! Never worked with latex so feel free to change any errors I made. Hopefully this is a fulfilling and interesting question and someone can find a naughty method. Given what I am doing is right (oops if I made a silly error and the answer is cat), if anyone can help with this fun looking convolution (I have chucked a few things at it and resulted in poop) it would be awesome! I have also tried it from first principles through computing $\mathbb P(T=t)$ using combinatorics and compositions but it seems I would need to read up a bit about A-restricted compositions, mostly of which I am having trouble getting my head around. Then again, I am stuck here too. Please help. Much appreciated. :)
Links below are just formality for those unfamiliar with the base problem.
The standard coupon collector's problem as can be found here.
The term in the braces for the distribution of $R$ is the Stirling number of the second kind.
Edit 1
Having thought about it for a few days, it seems this problem MAY be able to be done by the law of total probability. However I am still restricted by conditions on the collection, i.e. the number of each coupon collected should be at least $1$ to successfully complete the total. Let me explain:
$$\mathbb P(T=t)= \sum_i \mathbb P(T=t \mid R=i) \mathbb P(R=i),$$ where
$$i \in \left[ \bigg \lceil {\frac {T- \frac {N(N+1)}2}N} \bigg \rceil +N,N+ T-\frac {N(N+1)}2 \right]$$
So all we need is away to get $\mathbb P(T=t \mid R=i)$ for all $i$. And I know this can be achieved by finding: (1) How many ways in exactly $i-1$ steps we can get a total of $T-j$ for each for each $j \in [1,N]$, and divide it by: (2) the number of ways to finish in $i$ such that each number is rolled at least once not before collection step $i$.
Lets think about the case $N=6$ (a fir 6-sided dice) and $T=22$ It is clear our limits for $i$ are from $[\max(6,1)+1,6+1]=[7,7]$, which this implies the only way to achieve total of $22$ where we roll every number from $1 \to 6$ at least once is in exactly $7$ rolls. Hopefully we can see that is the case intuitively.
So (1) is calculated by all the permutations of $22-j$ without $j$ for each $j \in [1,6]$ and (2) is calculated by all the permutations of finishing in $R=i$ goes, independent of finishing on a total of $22$.
My issue is then calculating (1) and (to some extent without high knowledge of A-restricted compositions) (2)!
Edit 2
I found a closed form for the simplest case N=2. Which is just a coin. With one side labeled a $1$. The other having a $2$.
$X_i$ is discrete uniform with $P(X_i=1)=1-P(X_i=2)=\frac 12$. I want to get $P(T=t)$ for $T \ge3$.
Let $T=3$. By counting, we can roll a total of 3 by: $$(1,2) \ \ OR \ \ (2,1)$$ Either event has probability $\left( \frac 12 \right)^2$. So we have $P(T=3)=2\left( \frac 12 \right)^2=\left( \frac 12 \right)$
Let $T=4$. By counting, we can roll a total of 4 by: $$(1,1,2)$$ This event has probability $\left( \frac 12 \right)^3$. So we have $P(T=4)=\left( \frac 18 \right)$
Here we do not count the event $(2,1,1)$ as a total of $4$ since by the second roll we have successfully achieved $D$. hence the actual total would be $3$ and we ignore the final $1$.
Now, let $T=5$. By counting, we can roll a total of 5 by: $$(1,1,1,2) \ \ OR \ \ (2,2,1)$$ The first event has probability $\left( \frac 12 \right)^4$. The second event has probability $\left( \frac 12 \right)^3$ So we have $P(T=5)=\left( \frac 12 \right)^4+ \left( \frac 12 \right)^3$
If we continue in this fashion, we will always have exactly 2 ways of rolling a total of $T=t$. One with probability to the power of $T-\sum_{i=1}^2i=T-3$ and the other with probability to the power of $\frac {T-3}2$ when T is odd and $0$ if even. We can hence write down the closed form for the simplest case where $N=2$ as: $$P(T=t|N=2)=\frac 14 \left(\left(\frac 12 \right)^{t-3}+\Bbb I_{t \ is \ odd} \left(\frac 12 \right)^{\frac{t-3}2} \right)$$
But of course this still leaves me with not knowing $N=3,4,5,...$
Edit 3
As well noted in the comments, the basic statistics for this problem follow from the definitions of $R$ and $X_i$. Firstly: $$\Bbb E(T)=\Bbb E(\sum\limits_{i=1}^R X_i)$$.
It is known that: $$\Bbb E(R)=N\ \sum\limits_{i=1}^N E(Z_i)=NH_N$$
for $H_n=\sum\limits_{i=1}^N \frac 1i$ the harmonic number.
Since $X_i$ is uniform: $$\Bbb E(X_i)=\frac {N+1}2 \ \ \forall i$$
Putting it all together using Wald's identity we get: $$\Bbb E(T)=\Bbb E(R)\Bbb E(X_i)=N\sum\limits_{i=1}^N \frac 1i \frac {N+1}2$$ $$=\frac 12 N(N+1) \sum\limits_{i=1}^N \frac 1i$$
Next lets consider the variance: $$\Bbb V(T)=\Bbb V(\sum\limits_{i=1}^R X_i)$$
It is also known that (from properties of i.i.d. geometric random variables): $$\Bbb V(R)=N\sum\limits_{i=1}^N \frac {i-1}{(N-i+1)^2}$$
Since $X_i$ is uniform: $$\Bbb V(X_i)=\frac {N^2-1}{12} \ \ \forall i$$
Now using the identity for the variance of a random sum via law of iterated variance: $$\Bbb V(T)=\Bbb E(R) \Bbb V(X_i) + [\Bbb E(X_i)]^2 \Bbb V(R)$$ $$=N \sum\limits_{i=1}^N \frac 1i \frac {N^2-1}{12}+\frac {(N+1)^2}4 \sum\limits_{i=1}^N \frac {N(i-1)}{(N-i+1)^2}$$ $$=\frac {N(N+1)(N-1)}{12} \sum\limits_{i=1}^N \frac 1i+\frac {N(N+1)(N+1)}4 \sum\limits_{i=1}^N \frac {i-1}{(N-i+1)^2}$$ $$=\frac {N(N+1)}{12} \sum\limits_{i=1}^N \left[ \frac {N-1}i + \frac {3(N+1)(i-1)}{(N-i+1)^2} \right]$$
So for example when $N=6$, i.e. a die: $$\Bbb E(T)=\frac 12 6(6+1) \sum\limits_{i=1}^6 \frac 1i=51.45$$ $$\Bbb V(T)=\frac {6(7)}{12} \sum\limits_{i=1}^6 \left[ \frac {5}i + \frac {3(7)(i-1)}{(6-i+1)^2} \right]=\frac {42}{12} 148.715= 520.5025$$
This makes sense. On average we roll $14.7$ times and get $3.5$ on each roll. $14.7 \cdot 3.5=51.45$.
In fact, a simulation of $10$ million collections shows that these are the correct values:
> mean(Test$Roll_Total)
[1] 51.45751
> var(Test$Roll_Total)
[1] 520.5317
And the following diagram shows the distribution for the total $T$:
Aside: I have been trying to read up on generating functions for compositions and I cannot quite find what I am looking for in terms of restrictions. If someone finds one please let me know!
Help / hints appreciated :)
