Expected number of rejected null hypotheses using FDR

Problem:

Let $X_1, X_2, \dots, X_{500}$ be independent and identically distributed. For a constant $a$, suppose we know the probabilities $$P(X_i\leq ka) \quad \text{for all } k = 1,\dots , 500.$$

We now sort the $X$'s so that $X_{(1)}\leq X_{(2)}\leq \dots\leq X_{(500)}$.

Find the expectation $E(K)$, where $K = \max\{k : X_{(k)}\leq ka\}$ is the largest index $k$ for which $X_{(k)}\leq ka$ (with $K = 0$ if no such $k$ exists).
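To make the definition of $K$ concrete, here is a minimal Python sketch that computes it for one sample. The function name `largest_qualifying_index` is hypothetical, and returning 0 when no index qualifies is an assumed convention, not something stated in the problem.

```python
def largest_qualifying_index(xs, a):
    """Return K = max{k : X_(k) <= k*a}, or 0 if no index qualifies.

    xs is one sample X_1, ..., X_n (any order); a is the constant
    from the problem statement.
    """
    K = 0
    # enumerate(..., start=1) pairs each order statistic X_(k) with its rank k.
    for k, x in enumerate(sorted(xs), start=1):
        if x <= k * a:
            K = k  # keep overwriting, so the largest qualifying k wins
    return K
```

Note that the loop does not stop at the first failure: a later index can qualify even when an earlier one does not.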

My attempt:

I don't have a rigorous proof, but I think of this as an "average" problem, so my guess is that the answer would be $$E(K) = \sum_{k=1}^{500}P(X_i\leq ka).$$

Note: You may notice that this problem relates to the false discovery rate (FDR). I have been banging my head against the wall for a couple of days and can't seem to get anywhere. Any help, suggestions, or ideas are much appreciated!

Best answer:

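Write $n = 500$ and $$ \mathbb{E}[K] = \sum_{k= 1}^n \mathbb{P}(K\ge k), $$ with the convention $K = 0$ when no index qualifies. Since $K$ is the largest qualifying index, $\{K\ge k\}$ is the event that $X_{(j)}\le ja$ for some $j\ge k$. It contains the event $\{X_{(k)}\le ka\}$ that at least $k$ variables do not exceed $ka$, but it can be strictly larger (for example, when $X_{(k)} > ka$ and $X_{(k+1)}\le (k+1)a$). Hence $$ \mathbb{E}[K] \ge \sum_{k= 1}^n \mathbb{P}(\text{at least $k$ variables do not exceed $ka$}). $$

Note that the number of variables not exceeding $ka$ has a binomial distribution with parameters $n$ and $p_{k} = \mathbb{P}(X_1\le ka)$. Therefore, $$ \mathbb{E}[K] \ge \sum_{k= 1}^n \sum_{j=k}^n \binom{n}{j} p_k^j (1-p_k)^{n-j}, $$ with equality only if a later index can never qualify without index $k$ qualifying as well.

It is unlikely that there is a simple closed-form expression for $\mathbb{E}[K]$. For some particular distributions it might be possible to derive asymptotics. Say, if $a > 0$ and $\mathbb{P}(|X_1|>x) = o(x^{-1}\log^{-1} x)$ as $x\to\infty$, then $K=n$ with probability tending to $1$, because $\mathbb{P}(K < n) \le n\,\mathbb{P}(X_1 > na) \to 0$. Since $K\le n$, it follows that $\mathbb{E}[K] \sim n$ as $n\to\infty$.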
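As a rough numerical check, the binomial double sum can be evaluated directly and compared with a Monte Carlo estimate of $\mathbb{E}[K]$. The sketch below is illustrative only: the function names are hypothetical, and the Uniform(0, 1) distribution and the values of $n$ and $a$ are assumptions chosen for the demo, not part of the question. The double sum equals $\sum_k \mathbb{P}(X_{(k)}\le ka)$, while $\{K\ge k\}$ also contains outcomes where only a later index qualifies, so the simulated mean can sit above the sum.

```python
import math
import random


def binom_tail(n, k, p):
    """P(Bin(n, p) >= k), with each term computed in log space to avoid overflow."""
    if p <= 0.0:
        return 1.0 if k <= 0 else 0.0
    if p >= 1.0:
        return 1.0
    log_p, log_q = math.log(p), math.log1p(-p)
    total = 0.0
    for j in range(k, n + 1):
        log_binom = math.lgamma(n + 1) - math.lgamma(j + 1) - math.lgamma(n - j + 1)
        total += math.exp(log_binom + j * log_p + (n - j) * log_q)
    return min(total, 1.0)


def binomial_double_sum(n, cdf, a):
    """sum_{k=1}^n P(at least k of the n variables are <= k*a), with p_k = cdf(k*a)."""
    return sum(binom_tail(n, k, cdf(k * a)) for k in range(1, n + 1))


def simulate_mean_K(n, sampler, a, trials, rng):
    """Monte Carlo estimate of E[K], K = max{k : X_(k) <= k*a} (0 if none)."""
    total = 0
    for _ in range(trials):
        xs = sorted(sampler(rng) for _ in range(n))
        K = 0
        for k, x in enumerate(xs, start=1):
            if x <= k * a:
                K = k
        total += K
    return total / trials


def uniform_cdf(x):
    """CDF of Uniform(0, 1); an assumed distribution for this demo."""
    return min(max(x, 0.0), 1.0)


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so reruns are reproducible
    n, a = 500, 0.05 / 500  # assumed demo values
    print("double sum :", binomial_double_sum(n, uniform_cdf, a))
    print("simulation :", simulate_mean_K(n, lambda r: r.random(), a, 2000, rng))
```

For a small case that can be checked by hand, take $n = 2$, $a = 0.3$ and uniform variables: the double sum is $0.51 + 0.36 = 0.87$, whereas $\mathbb{P}(K=1) = 0.24$ and $\mathbb{P}(K=2) = 0.36$ give $\mathbb{E}[K] = 0.96$.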