A question about close votes (please read it before migrating it to meta)

64 Views Asked by At

Suppose, on close.stackexchange.com there are $m$ active users with sufficient reputation to cast close votes and $n$ open (not closed) questions (the number of open questions never changes - whenever a question is closed, a new one is posted, no new questions are posted otherwise and no questions are reopened). Every day, each of the active users reviews (and choses, whether to cast a close vote or not) $r$ randomly chosen distinct open questions (with uniform distribution). All close votes are cast from review. Due to a bug on close.stackexchange.com, a user can cast a close vote any time they find that question in their review queue (which means the same user can cast multiple close votes on the same question on separate days). $t$ close votes are needed to actually close a question. Close votes on an open question age away in $k$ days.

Suppose, that a question was posted, such that the probability of it receiving a close vote during any review is $p$ (all reviews are independent and identically distributed, even if done by the same user). What is the probability, that this question will ever be closed?

It is not hard to see, that under those conditions the number of close votes received on a single day is independent and identically distributed for every day, and its distribution is binomial: $Bin(m, \frac{C_{n-1}^{r-1}}{C_n^r}p)$

Thus the problem is reduced to the following one:

Suppose $\{X_i\}_{i = 1}^\infty$ are i.i.d. random variables, distributed as $Bin(m, \frac{C_{n-1}^{r-1}}{C_n^r}p)$. What is the probability that $\exists N \in \mathbb{N}$, such that $\Sigma_{i=N}^{N+k-1} X_i \geq t$?

And here I am stuck.

1

There are 1 best solutions below

0
On BEST ANSWER

The probability is $1$ if $t \leq mk$ and $p > 0$, and $0$ otherwise.

If $mk < t$ then question will not ever be closed even if every user will vote to close it every day. If $p = 0$ then no close votes will ever be casted.

Otherwise let $Y_n = \sum\limits_{i=nk}^{nk + k - 1} X_i$. $Y_n$ are i.i.d., $P(Y_n < t) = a < 1$ and $P(\exists N\in \mathbb N\colon \sum_{i=N}^{N+k-1} X_i \geq t) \leq P(\exists N \in \mathbb N \colon Y_N \geq t)$.

$P(\exists N \in \mathbb N \colon Y_N \geq t) = 1 - P(\forall N \in \mathbb N\colon Y_N < t)$. For any $n$, $P(\forall N \in \mathbb N\colon Y_N < t) \leq P(Y_1 < t \wedge Y_2 < t \wedge \ldots \wedge Y_n < t) = P(Y_1 < t) \cdot \ldots \cdot P(Y_n < t) = a^n$.

As $a < 1$, $\lim_{n \to \infty} a^n = 0$, so $P(\forall N \in \mathbb N\colon Y_N < t) = 0$.