I have the following lemma, but there seem to be one (or two) mistakes in the proof given in this paper (Lemma 3). The lemma states that for $(X_1,\ldots,X_k) \sim Multinomial(n,p_1,\ldots,p_k)$ the following inequality holds:
$P(\sum_{i=1}^k|X_i-np_i|\geq n\epsilon) \leq 3\exp(-n\epsilon^2/25)$
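(For what it's worth, here is a quick Monte Carlo sanity check of the statement itself; the parameters are my own arbitrary choices.)

```python
import numpy as np

rng = np.random.default_rng(0)
n, p, eps = 200, np.array([0.3, 0.3, 0.4]), 0.4
reps = 100_000

X = rng.multinomial(n, p, size=reps)        # (reps, k) multinomial samples
dev = np.abs(X - n * p).sum(axis=1)         # sum_i |X_i - n p_i|
print("empirical tail:", (dev >= n * eps).mean())
print("claimed bound: ", 3 * np.exp(-n * eps**2 / 25))
```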
However, in the proof he defines some new variables, in particular a $Poisson(n)$ variable $N$, independent of the iid sequence $U_1,U_2,\ldots$ with $P(U_1=i)=p_i$ for $i=1,\ldots,k$.
He then defines $X_1',\ldots,X_k'$ with $X_i'$ being the number of occurrences of $i$ in $U_1,\ldots,U_N$, that is $X_i'=\sum_{j=1}^N 1_{U_j=i}$. Of course each is $Poisson(np_i)$ distributed, but then he states that they are independent as well, and that cannot be, can it? For example, take $k=2$; then $X_1'=N-X_2'$. Sadly he uses this independence about 10 lines later.
My question is: am I missing something obvious? If not, is there another way to prove this fact or a similar one (by which I mean, with a different right-hand side $R(n,\epsilon)$ of the inequality, for which at least $\sum_{n\geq 1} R(n,\epsilon) < \infty$)?
(The first step of equation (4) of the proof seems to contain a mistake as well, but maybe I have not given it enough thought yet.)
There is actually no error. I proved this to my computer science professor. First, the major components:
[1] Switching between the Poisson distribution and the multinomial. This is the standard "Poissonization" fact, and it is exactly what resolves your worry. If $N \sim Poisson(n)$ and, given $N$, the values $U_1,\ldots,U_N$ are iid with $P(U_1=i)=p_i$, then the counts $X_i'=\sum_{j=1}^N 1_{U_j=i}$ are *independent* with $X_i' \sim Poisson(np_i)$. Conversely, conditionally on the total $N=m$ being fixed, $(X_1',\ldots,X_k')$ is $Multinomial(m,p_1,\ldots,p_k)$, which is of course dependent. Your $k=2$ counterexample does not apply: $X_1'=N-X_2'$ is true, but $N$ is itself random (it is $X_1'+X_2'$, a sum of two independent Poissons), so knowing $X_2'$ pins down nothing about $X_1'$. The deterministic relation would only force dependence if $N$ were fixed.
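Here is a minimal simulation sketch (NumPy; the variable names are mine) for your $k=2$ case: the two counts come out uncorrelated with Poisson marginals, even though they always sum to $N$.

```python
import numpy as np

rng = np.random.default_rng(0)
n, p = 50, np.array([0.2, 0.8])                 # the k = 2 case from the question
reps = 100_000

N = rng.poisson(n, size=reps)                   # N ~ Poisson(n)
X = np.array([rng.multinomial(m, p) for m in N])  # counts given N (slow but transparent)

print("means:", X.mean(axis=0), "vs", n * p)    # each close to n*p_i
print("vars :", X.var(axis=0), "vs", n * p)     # Poisson: variance = mean
print("corr :", np.corrcoef(X[:, 0], X[:, 1])[0, 1])  # close to 0, not -1
```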
[2] Equation (2): the triangle inequality. Plop $X_i'$ right in the middle like this: $$\sum_{i=1}^{k}(1/n)|X_i-np_i|$$ becomes $$\sum_{i=1}^{k}(1/n)|X_i-X_i'+X_i'-np_i|,$$ then apply the triangle inequality to get the right-hand side.
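Written out in full (just restating [2], nothing new), the chain is
$$\frac1n\sum_{i=1}^{k}|X_i-np_i| = \frac1n\sum_{i=1}^{k}|X_i-X_i'+X_i'-np_i| \le \frac1n\sum_{i=1}^{k}|X_i-X_i'| + \frac1n\sum_{i=1}^{k}|X_i'-np_i|.$$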
[3] Between (2) and (3). Let's break open the expectation; write $\pi_\lambda(u)=e^{-\lambda}\lambda^u/u!$ for the $Poisson(\lambda)$ pmf: $$E\left(e^{t|U-\lambda|}\right) = \sum_{u\ge 0}e^{t|u-\lambda|}\pi_\lambda(u) = \sum_{u=0}^{\lfloor\lambda\rfloor}e^{t|u-\lambda|}\pi_\lambda(u) + \sum_{u>\lambda}e^{t|u-\lambda|}\pi_\lambda(u).$$ We can now get rid of the absolute values: if $u$ is greater than $\lambda$, drop them; otherwise swap the two terms inside. Doing this gives $$\sum_{u=0}^{\lfloor\lambda\rfloor}e^{t(\lambda-u)}\pi_\lambda(u) + \sum_{u>\lambda}e^{t(u-\lambda)}\pi_\lambda(u).$$ Now compare with the two full expectations $E(e^{t(\lambda-U)})$ and $E(e^{t(U-\lambda)})$ on the right-hand side in the paper: our first sum only goes up to $\lambda$ (and misses the remaining infinitely many terms), our second sum misses the first terms, and every missing term is strictly positive (an exponential times a probability). So our expression is strictly smaller than $E(e^{t(\lambda-U)}) + E(e^{t(U-\lambda)})$; the author writes $\le$, which is all he needs.
The next equality is just the moment generating function of the Poisson distribution, $E(e^{sU})=e^{\lambda(e^{s}-1)}$, applied with $s=t$ and $s=-t$.
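A quick numeric check of [3] combined with that moment generating function (my own choices of $\lambda$ and $t$; SciPy just supplies the pmf):

```python
import numpy as np
from scipy.stats import poisson

lam, t = 30.0, 0.1
u = np.arange(400)                                   # truncation; the tail is negligible
pmf = poisson.pmf(u, lam)

lhs = np.sum(np.exp(t * np.abs(u - lam)) * pmf)      # E exp(t|U - lam|)
rhs = np.exp(lam * (np.exp(t) - 1) - lam * t) \
    + np.exp(lam * (np.exp(-t) - 1) + lam * t)       # E e^{t(U-lam)} + E e^{t(lam-U)}
print(lhs, "<=", rhs)                                 # the bound holds
```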
[4] The first move is the Chernoff bound, rewritten to fit into one line. Then, use [2]. Next, the author uses the bound $\ln(1+x)\le x$ (for $x$ close to $0$ this is the familiar approximation $\ln(1+x)\approx x$, but as an inequality it holds for all $x>-1$, so nothing is lost). In the last part he simply bounds the factor $(1+x)$ above by $2$, purely out of laziness.
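For reference, the generic form of that first move, for any random variable $Z$ and any $t>0$ (Markov's inequality applied to $e^{tZ}$):
$$P(Z \ge \epsilon) = P\left(e^{tZ} \ge e^{t\epsilon}\right) \le e^{-t\epsilon}\, E\left(e^{tZ}\right).$$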
[5] The final step (4). He uses (2), but suddenly $N-n$ appears out of nowhere. This comes from the coupling the paper sets up: $X_i$ and $X_i'$ are built from the same sequence, $X_i = \sum_{j=1}^n 1_{U_j=i}$ and $X_i' = \sum_{j=1}^N 1_{U_j=i}$, so they differ only in the draws between $\min(n,N)$ and $\max(n,N)$. Summing over $i$, every one of those extra draws is counted exactly once, hence $\sum_{i=1}^k |X_i - X_i'| = |N-n|$. (Note also that $\sum_i X_i = n$ deterministically, while $\sum_i X_i' = N$ is random.)
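And a tiny sketch of that coupling identity (again my own variable names), which holds draw by draw, not just on average:

```python
import numpy as np

rng = np.random.default_rng(1)
n, k, p = 40, 3, np.array([0.1, 0.3, 0.6])

for _ in range(5):
    N = rng.poisson(n)
    U = rng.choice(k, size=max(n, N), p=p)       # one shared iid sequence U_1, U_2, ...
    X  = np.bincount(U[:n], minlength=k)         # counts in U_1..U_n  (multinomial)
    Xp = np.bincount(U[:N], minlength=k)         # counts in U_1..U_N  (Poissonized)
    print(np.abs(X - Xp).sum(), "==", abs(N - n))
```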
Everything else is then straightforward.
Actually, he also uses a Taylor series somewhere (I forget exactly where), namely the expansion of $\ln z$ around $z=1$: $$\ln z = \frac{(z-1)^1}{1} - \frac{(z-1)^2}{2} + \frac{(z-1)^3}{3} - \cdots$$
Good luck!