Order of Growth of a Sum

343 Views Asked by At

Let $n>k$ and $r$ be arbitrary positive integers. Define $q=k/n$. I want to show that

$$ \sum_{i=0}^{rk} \binom{rn}{i}q^i(1-q)^{rn-i}(rk-i)=\Theta(\sqrt{r}) $$

as $r\rightarrow \infty$. I've plotted stuff in Mathematica and am fairly certain the claim is true for all $n$ and $k$.

The obvious approach, of dividing the LHS by $\sqrt{r}$ and trying to show the limit is a positive constant, hasn't gotten me far. Stirling's approximation might be helpful as well. Not sure. One fact (admittedly only Mathematica-verified) which might be helpful is that

$$ rk\sum_{i=0}^{rk} \binom{rn}{i}q^i(1-q)^{rn-i}=\Theta(r). $$ In other words, the "$-i$" in $(rk-i)$ is important.

Any ideas?

[In case you're curious, the hand-wavy motivation: the LHS is the loss from an algorithm which takes $rn$ samples and "accepts" the highest $rk$ of them. We're comparing our performance to a "first-best" algorithm which takes $rn$ samples and only accepts samples if they're in the top $q$-fraction of the distribution. This isn't important at all for the actual question, though.]

1

There are 1 best solutions below

2
On

You can write $$ \sum_{i=0}^{rk} \binom{rn}{i} q^i (1-q)^{rn-i} (rk-i) = \\ rk \sum_{i=0}^{rk} \binom{rn}{i} q^i (1-q)^{rn-i} - rn \sum_{i=1}^{rk} \binom{rn-1}{i-1} q^i (1-q)^{rn-i} = \\ rk \sum_{i=0}^{rk} \binom{rn}{i} q^i (1-q)^{rn-i} - rn \sum_{i=0}^{rk-1} \binom{rn-1}{i} q^{i+1} (1-q)^{rn-i-1} = \\ rk \sum_{i=0}^{rk} \binom{rn}{i} q^i (1-q)^{rn-i} - rk \sum_{i=0}^{rk-1} \binom{rn-1}{i} q^i (1-q)^{rn-1-i} = \\ rk (\Pr[\mathrm{Bin}(rn,k/n) \leq rk] - \Pr[\mathrm{Bin}(rn-1,k/n) \leq rk-1]) \approx \\ rk (\Pr[\mathrm{N}(rk,rk(1-k/n)) \leq rk] - \Pr[\mathrm{N}(rk-k/n,(rk-k/n)(1-k/n)) \leq rk-1]) \\ \approx rk \left(\frac{1}{2} - \left(\frac{1}{2}-\frac{1}{\sqrt{2\pi}}\frac{1-k/n}{\sqrt{(rk-k/n)(1-k/n)}}\right) \right) \\ = \frac{1}{\sqrt{2\pi}} \frac{rk\sqrt{1-k/n}}{\sqrt{rk-k/n}} \\ \approx \frac{1}{\sqrt{2\pi}} \sqrt{\frac{k(n-k)}{n}} \sqrt{r}. $$ Here the first approximation is the normal approximation, the second approximation is that the normal density around 0 is roughly constant, and the third is neglecting $k/n$ when $k$ is large. If all the steps can be made rigorous, then you will have proved your result (with an explicit first order approximation).