Probability of sampling at least some threshold weighted score from $k$ people at a time out of a population of $N$ people?


There are $N$ people. Each person has a paper with a numbered score $\in [1, T]$ where $T \geq N$.

A fraction $Q$ of the people have papers with the same score, while the remaining fraction $1 - Q$ have papers with unique, uniformly distributed scores.

Ask $k < N$ people for their papers.

We then normalize the scores of the $k$ papers we receive by dividing each one by the sum of all $k$ scores.

More explicitly, if $X$ is a column vector whose $i$'th entry is paper $i$'s score, where $i \in [1, k]$:

$$X' = \frac{X}{\sum_{i=1}^{k}{X_i}}$$

... represents the normalized scores of the $k$ papers.

Let $W$ be a column vector of min-max normalized weights $\in [0, 1]$:

$$ W = \frac{X - \min_{X_i \in X}{X_i}}{\max_{X_i \in X}{X_i} - \min_{X_i \in X}{X_i}} $$
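To make the two normalizations concrete, here is a minimal Python sketch of $X'$ and $W$ for a plain list of scores. The function names are hypothetical, and returning all zeros when every score is equal is an assumption: the formula for $W$ is $0/0$ in that case and the question does not say how to handle it.

```python
def normalize_scores(x):
    """X' = X / sum(X): each score divided by the sum of all k scores."""
    total = sum(x)
    return [xi / total for xi in x]


def min_max_weights(x):
    """W = (X - min X) / (max X - min X), so every entry lies in [0, 1]."""
    lo, hi = min(x), max(x)
    if hi == lo:
        # Assumption: the formula is undefined (0/0) when all scores are
        # equal; this sketch returns zeros rather than raising an error.
        return [0.0] * len(x)
    return [(xi - lo) / (hi - lo) for xi in x]
```

For example, with scores `[1, 2, 3, 4]` the largest score gets weight 1, the smallest gets weight 0, and the entries of `normalize_scores` sum to 1.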

Now, the question is:

For any uniformly randomly sampled $k$ papers, what is the probability that:

$$\exists \text{score} \in X'W > \frac{2 \alpha}{\max{(2, C)}} $$

... where $C$ is the number of papers, among the $k$ received, that have unique scores, and $\alpha \in [0, 1]$?
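A Monte Carlo sketch can give a numerical estimate to check any closed-form answer against. It rests on several assumptions that are not stated in the question: $X'W$ is read as the elementwise product $X'_i W_i$; the shared group has `round(Q * N)` members; the unique scores are distinct integers drawn without replacement from $[1, T]$ and differ from the shared score; $C$ counts sampled papers whose score appears exactly once in the sample; a sample in which all scores are equal (so $W$ is undefined) counts as a failure; and the population is drawn once and then held fixed. All function names are hypothetical. Under the elementwise reading, both $X'_i$ and $W_i$ are largest at the maximum score, where $W_i = 1$, so the condition is equivalent to $\max_i X_i / \sum_i X_i > 2\alpha / \max(2, C)$.

```python
import random
from collections import Counter


def build_population(n, t, q, rng):
    """Scores for n people: round(q*n) share one score, the rest are unique."""
    n_shared = round(q * n)
    n_unique = n - n_shared
    # Draw distinct integers from [1, t]; one extra for the shared score.
    distinct = rng.sample(range(1, t + 1), n_unique + (1 if n_shared else 0))
    if n_shared:
        return [distinct[0]] * n_shared + distinct[1:]
    return distinct


def event_holds(x, alpha):
    """True if some entry of the elementwise product X' * W exceeds the threshold."""
    total = sum(x)
    lo, hi = min(x), max(x)
    if hi == lo:
        return False  # Assumption: W is undefined here, count as failure.
    counts = Counter(x)
    # Assumption: C = number of sampled papers whose score occurs exactly once.
    c = sum(1 for xi in x if counts[xi] == 1)
    threshold = 2 * alpha / max(2, c)
    return any((xi / total) * ((xi - lo) / (hi - lo)) > threshold for xi in x)


def estimate_probability(n, t, q, k, alpha, trials=10_000, seed=0):
    """Fraction of random k-subsets (without replacement) satisfying the event."""
    rng = random.Random(seed)
    population = build_population(n, t, q, rng)  # fixed for all trials
    hits = sum(event_holds(rng.sample(population, k), alpha) for _ in range(trials))
    return hits / trials
```

Calling, say, `estimate_probability(100, 200, 0.3, 10, 0.5)` returns an estimate for one fixed population; averaging over several seeds would approximate the probability over random populations as well.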