Which pmf is appropriate for this set-based problem?

Consider a finite multiset whose elements are themselves multisets:

$S = \{ \{a, a\}, \{\}, \{ a, c, d \}, \ldots, \{ a, t, u \}\}$

where $\forall s \in S$, $0 \leqslant \|s\| \leqslant n$, and where each multiset $s$ contains only elements from a finite set of terms: $s \subseteq \Gamma$.

For a multiset $s$ drawn at random from $S$ and a term $\gamma \in \Gamma$, which pmf gives the probability that $\gamma$ occurs in $s$ exactly $k$ times, for each $k \in \{0, 1, \ldots, \|s\|\}$?
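To make the question concrete, here is a minimal sketch of the empirical pmf for one particular $S$. It assumes that $s$ is drawn uniformly from $S$ (the question does not fix the sampling distribution), represents each multiset as a Python list, and uses a hypothetical helper name `occurrence_pmf` and made-up example data.

```python
from collections import Counter
from fractions import Fraction


def occurrence_pmf(S, gamma):
    """Empirical pmf of the number of times gamma occurs in s,
    assuming s is drawn uniformly at random from S."""
    # For each multiset s, count how often gamma appears (0 if absent),
    # then tally how many multisets share each occurrence count.
    tally = Counter(Counter(s)[gamma] for s in S)
    total = len(S)
    return {k: Fraction(c, total) for k, c in sorted(tally.items())}


# Hypothetical example data: each inner list stands for one multiset s.
S = [["a", "a"], [], ["a", "c", "d"], ["a", "t", "u"]]
pmf_a = occurrence_pmf(S, "a")
```

This only describes the given $S$; it says nothing about which parametric family, if any, would model it.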


Update:

Because $S$ is arbitrary (may contain multisets $s$ of any size from $[0,n]$), @Raskolnikov has suggested a counting approach using the following:

$\frac{\# \text{ of multisets containing } k \text{ occurrences of } \gamma}{\# \text{ of multisets of cardinality at least } k}$

This gives the probability that a given $\gamma \in \Gamma$ occurs $k$ times in an $s \in S$ with $\|s\| \geqslant k$. Here $k$ is the occurrence count, not the size bound $n$ introduced above.
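The counting ratio above can be sketched as follows. The occurrence count is called `k` in the code to keep it apart from the size bound $n$. The sketch assumes that "containing" means exactly `k` occurrences and that every multiset in $S$ is equally likely; the function name `occurrence_ratio` and the example data are hypothetical.

```python
from fractions import Fraction


def occurrence_ratio(S, gamma, k):
    """Ratio of multisets with exactly k occurrences of gamma
    to multisets of cardinality at least k."""
    # Only multisets with at least k elements can hold k copies of gamma.
    eligible = [s for s in S if len(s) >= k]
    if not eligible:
        return None  # ratio is undefined when no multiset is large enough
    hits = sum(1 for s in eligible if s.count(gamma) == k)
    return Fraction(hits, len(eligible))


# Hypothetical example data: each inner list stands for one multiset s.
S = [["a", "a"], [], ["a", "c", "d"], ["a", "t", "u"]]
ratio_two = occurrence_ratio(S, "a", 2)
```

Because the denominator changes with `k`, these ratios are conditional probabilities and need not sum to 1 over `k`.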