Total variation/$\ell_1$ distance between $\operatorname{Binomial}(n,p)$ and $\operatorname{Binomial}(n+m, p)$.


Let $p\in(0,1)$ be any constant (if that helps or is necessary, $p=1/2$ is what I care the most about), and fix $C > 0$.

What can we say about the total variation distance between $\operatorname{Binomial}(n,p)$ and $\operatorname{Binomial}(n+C\lfloor\sqrt{n}\rfloor, p)$? That is, how does the quantity $$ \delta_{p,C}(n)\stackrel{\rm def}{=}\frac{1}{2}\sum_{k=0}^{n+C\lfloor\sqrt{n}\rfloor}\left\lvert \binom{n}{k} p^k(1-p)^{n-k}-\binom{n+C\lfloor\sqrt{n}\rfloor}{k} p^k(1-p)^{n+C\lfloor\sqrt{n}\rfloor-k} \right\rvert $$ behaves, when $n\to\infty$?

From simulations, it seems to converge to some value $\ell=\ell(p,C)\in(0,1)$; which would make sense "in hindsight," as the mode is shifted by roughly $Cp\sqrt{n}$, i.e., a constant number of standard deviations, while a constant fraction of the probability weight is roughly uniformly spread over an interval of size $\approx\sqrt{n}$ centered at the mode.
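For what it is worth, the quantity $\delta_{p,C}(n)$ can be computed exactly (no sampling involved) by direct summation of the two pmfs; a minimal sketch in pure-stdlib Python, working in log-space so the binomial coefficients do not overflow (the function name `tv_binomials` is mine, not from any library):

```python
from math import exp, lgamma, log, isqrt

def binom_pmf(n: int, k: int, p: float) -> float:
    """Binomial(n, p) pmf at k, computed in log-space to avoid overflow."""
    return exp(lgamma(n + 1) - lgamma(k + 1) - lgamma(n - k + 1)
               + k * log(p) + (n - k) * log(1 - p))

def tv_binomials(n: int, p: float, C: int = 1) -> float:
    """Exact total variation distance between Binomial(n, p) and
    Binomial(n + C*floor(sqrt(n)), p), by direct summation."""
    m = n + C * isqrt(n)
    return 0.5 * sum(abs((binom_pmf(n, k, p) if k <= n else 0.0)
                         - binom_pmf(m, k, p))
                     for k in range(m + 1))

# Taking n = j^2 to avoid messy floors; the values appear to stabilize:
for j in (10, 20, 40):
    print(j * j, tv_binomials(j * j, 0.5))
```

This is still only numerical evidence, of course, but it removes the sampling noise from the picture.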

However, I do not trust my coding skills enough to be convinced by simulations alone, and in any case experiments are no proof. Trying to expand and analyze $\delta_{p,C}$ by massaging the expression above has so far led only to cumbersome and ugly-looking expressions.

Is there a simple (hopefully not brute-force-computation) argument to study the asymptotic behavior of $\delta_{p,C}$ when $n\to\infty$?

(If it helps as a first step, $C=1$ is fine, and restricting to $n=k^2$ for some integer $k$ to avoid messy floors is definitely OK.)

Accepted answer:

Closing this question: I will not give a full answer, but there is at least a sort-of-straightforward approach to getting rather good bounds on this quantity. Namely:

  • first, replace the two Binomials by Gaussians with the corresponding "good" parameters (invoking either the Central Limit Theorem or, for a quantitative bound on the loss incurred, the Berry–Esseen theorem);
  • then, use known bounds on the distance between two Gaussians (either directly, or via tight-ish bounds from the Hellinger or Kullback–Leibler divergences, using Pinsker's inequality for the latter).
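For the last step of this plan, there is even a closed form when the two Gaussians are given a common variance: $\mathrm{TV}(\mathcal{N}(\mu_1,\sigma^2),\mathcal{N}(\mu_2,\sigma^2)) = \operatorname{erf}\bigl(\lvert\mu_1-\mu_2\rvert/(2\sqrt{2}\,\sigma)\bigr)$. A minimal sketch of the resulting proxy (matching each Binomial by mean, and using a single standard deviation $\sqrt{np(1-p)}$ for both, which is only correct up to a $1+o(1)$ factor; the function names are mine):

```python
from math import erf, sqrt, isqrt

def tv_equal_var_gaussians(mu1: float, mu2: float, sigma: float) -> float:
    """TV(N(mu1, s^2), N(mu2, s^2)) = erf(|mu1 - mu2| / (2*sqrt(2)*s))."""
    return erf(abs(mu1 - mu2) / (2 * sqrt(2) * sigma))

def gaussian_proxy(n: int, p: float, C: int = 1) -> float:
    # Heuristic: match each Binomial by a Gaussian with the same mean,
    # and use the common (up to 1 + o(1)) standard deviation sqrt(np(1-p)).
    m = n + C * isqrt(n)
    return tv_equal_var_gaussians(n * p, m * p, sqrt(n * p * (1 - p)))

# For p = 1/2, C = 1 this predicts a limit of erf(1/(2*sqrt(2))) ~ 0.3829:
print(gaussian_proxy(10**6, 0.5))
```

The prediction is consistent with a limit $\ell(p,C)\in(0,1)$; turning it into a bound requires accounting for the Berry–Esseen loss and the variance mismatch.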

Another option is to use known approximations of the tails of the Binomials (for their CDFs $F_1, F_2$), and simply see which value of $x$ maximizes (the approximation of) $\lvert F_1(x)-F_2(x)\rvert$.
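As a sanity check on this option, the maximum CDF gap $\sup_x\lvert F_1(x)-F_2(x)\rvert$ can also be computed exactly: it is always a lower bound on the total variation distance, since it is the discrepancy on the single event $\{X\le x\}$. A sketch using exact CDFs rather than tail approximations (function names are mine):

```python
from math import exp, lgamma, log, isqrt

def binom_pmf(n: int, k: int, p: float) -> float:
    """Binomial(n, p) pmf at k, computed in log-space to avoid overflow."""
    return exp(lgamma(n + 1) - lgamma(k + 1) - lgamma(n - k + 1)
               + k * log(p) + (n - k) * log(1 - p))

def max_cdf_gap(n: int, p: float, C: int = 1) -> float:
    """sup_x |F1(x) - F2(x)| for F1 the CDF of Binomial(n, p) and F2 that
    of Binomial(n + C*floor(sqrt(n)), p). This Kolmogorov distance is a
    lower bound on the total variation distance."""
    m = n + C * isqrt(n)
    f1 = f2 = 0.0
    gap = 0.0
    for k in range(m + 1):
        if k <= n:
            f1 += binom_pmf(n, k, p)
        f2 += binom_pmf(m, k, p)
        gap = max(gap, abs(f1 - f2))
    return gap
```

If the set $\{k : \mathrm{pmf}_1(k) > \mathrm{pmf}_2(k)\}$ is an interval (which one would expect here, the second Binomial being a right-shifted, slightly wider bump), this lower bound is in fact tight.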

Finally, a third (messier) option to get a lower bound is to restrict the sum above to the "indices that matter," that is, those within a constant number of standard deviations of the means (the Binomials being really tightly concentrated there); and then to lower bound the remaining summands by some rather horrendous Taylor approximation of the binomial coefficients (writing $k = np + \alpha\sqrt{n}$ for, say, $\alpha\in[-1/2,1/2]$, since that is the range of indices left).
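Carrying this substitution through heuristically (not a proof: the error terms of the local CLT would need to be controlled uniformly), each rescaled pmf converges to a Gaussian density with variance $\sigma^2 = p(1-p)$, with means $0$ and $Cp$ respectively, which suggests the candidate limit

$$\delta_{p,C}(n)\;\xrightarrow[n\to\infty]{}\;\frac{1}{2}\int_{-\infty}^{\infty}\left\lvert \varphi_{0,\sigma^2}(x)-\varphi_{Cp,\sigma^2}(x)\right\rvert\,dx \;=\;\operatorname{erf}\!\left(\frac{C}{2\sqrt{2}}\sqrt{\frac{p}{1-p}}\right),\qquad \sigma^2 = p(1-p),$$

where $\varphi_{\mu,\sigma^2}$ denotes the $\mathcal{N}(\mu,\sigma^2)$ density; for $p=1/2$ and $C=1$ this gives $\operatorname{erf}(1/(2\sqrt{2}))\approx 0.383$, in line with the simulations.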