I have the following lemma, but there seem to be one (or two) mistakes in the proof given in this paper (Lemma 3). The lemma states that for $(X_1,\ldots,X_k) \sim Multinomial(n,p_1,\ldots,p_k)$ the following inequality holds:
$P(\sum_{i=1}^k|X_i-np_i|\geq n\epsilon) \leq 3\exp(-n\epsilon^2/25)$
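(For what it's worth, here is a quick Monte Carlo sanity check of the statement itself; the parameters are my own arbitrary choices.)

```python
import numpy as np

rng = np.random.default_rng(0)
n, p, eps = 200, np.array([0.3, 0.3, 0.4]), 0.4
reps = 100_000

X = rng.multinomial(n, p, size=reps)        # (reps, k) multinomial samples
dev = np.abs(X - n * p).sum(axis=1)         # sum_i |X_i - n p_i|
print("empirical tail:", (dev >= n * eps).mean())
print("claimed bound: ", 3 * np.exp(-n * eps**2 / 25))
```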
However, in the proof he defines some new variables, in particular a $Poisson(n)$ variable $N$, independent of the iid sequence $U_1,U_2,\ldots$ with $P(U_1=i)=p_i$ for $i=1,\ldots,k$.
He then defines $X_1',\ldots,X_k'$ with $X_i'$ being the number of occurrences of $i$ in $U_1,\ldots,U_N$, that is $X_i'=\sum_{j=1}^N 1_{U_j=i}$. Of course each is $Poisson(np_i)$ distributed, but then he states that they are independent as well, and that cannot be, can it? For example, take $k=2$; then $X_1'=N-X_2'$. Sadly he uses this independence about 10 lines later.
My question is: am I missing something obvious? If not, is there another way to prove this fact or a similar one (by which I mean, with a different right-hand side $R(n,\epsilon)$ of the inequality, for which at least $\sum_{n\geq 1} R(n,\epsilon) < \infty$)?
(The first step of equation (4) of the proof seems to contain a mistake as well, but maybe I have not given it enough thought yet.)
There is actually no error. I proved this to my computer science professor. First, the major components:
[1] Switching between the Poisson distribution and the multinomial. This is the standard "Poissonization" fact, and it is exactly what resolves your worry. If $N \sim Poisson(n)$ and, given $N$, the values $U_1,\ldots,U_N$ are iid with $P(U_1=i)=p_i$, then the counts $X_i'=\sum_{j=1}^N 1_{U_j=i}$ are *independent* with $X_i' \sim Poisson(np_i)$. Conversely, conditionally on the total $N=m$ being fixed, $(X_1',\ldots,X_k')$ is $Multinomial(m,p_1,\ldots,p_k)$, which is of course dependent. Your $k=2$ counterexample does not apply: $X_1'=N-X_2'$ is true, but $N$ is itself random (it is $X_1'+X_2'$, a sum of two independent Poissons), so knowing $X_2'$ pins down nothing about $X_1'$. The deterministic relation would only force dependence if $N$ were fixed.
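Here is a minimal simulation sketch (NumPy; the variable names are mine) for your $k=2$ case: the two counts come out uncorrelated with Poisson marginals, even though they always sum to $N$.

```python
import numpy as np

rng = np.random.default_rng(0)
n, p = 50, np.array([0.2, 0.8])                 # the k = 2 case from the question
reps = 100_000

N = rng.poisson(n, size=reps)                   # N ~ Poisson(n)
X = np.array([rng.multinomial(m, p) for m in N])  # counts given N (slow but transparent)

print("means:", X.mean(axis=0), "vs", n * p)    # each close to n*p_i
print("vars :", X.var(axis=0), "vs", n * p)     # Poisson: variance = mean
print("corr :", np.corrcoef(X[:, 0], X[:, 1])[0, 1])  # close to 0, not -1
```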
[2] Equation (2): the triangle inequality. Plop $X_i'$ right in the middle like this: $$\sum_{i=1}^{k}(1/n)|X_i-np_i|$$ becomes $$\sum_{i=1}^{k}(1/n)|X_i-X_i'+X_i'-np_i|,$$ then apply the triangle inequality to get the right-hand side.
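Written out in full (just restating [2], nothing new), the chain is
$$\frac1n\sum_{i=1}^{k}|X_i-np_i| = \frac1n\sum_{i=1}^{k}|X_i-X_i'+X_i'-np_i| \le \frac1n\sum_{i=1}^{k}|X_i-X_i'| + \frac1n\sum_{i=1}^{k}|X_i'-np_i|.$$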
[3] Between (2) and (3). Let's break open the expectation; write $\pi_\lambda(u)=e^{-\lambda}\lambda^u/u!$ for the $Poisson(\lambda)$ pmf: $$E\left(e^{t|U-\lambda|}\right) = \sum_{u\ge 0}e^{t|u-\lambda|}\pi_\lambda(u) = \sum_{u=0}^{\lfloor\lambda\rfloor}e^{t|u-\lambda|}\pi_\lambda(u) + \sum_{u>\lambda}e^{t|u-\lambda|}\pi_\lambda(u).$$ We can now get rid of the absolute values: if $u$ is greater than $\lambda$, drop them; otherwise swap the two terms inside. Doing this gives $$\sum_{u=0}^{\lfloor\lambda\rfloor}e^{t(\lambda-u)}\pi_\lambda(u) + \sum_{u>\lambda}e^{t(u-\lambda)}\pi_\lambda(u).$$ Now compare with the two full expectations $E(e^{t(\lambda-U)})$ and $E(e^{t(U-\lambda)})$ on the right-hand side in the paper: our first sum only goes up to $\lambda$ (and misses the remaining infinitely many terms), our second sum misses the first terms, and every missing term is strictly positive (an exponential times a probability). So our expression is strictly smaller than $E(e^{t(\lambda-U)}) + E(e^{t(U-\lambda)})$; the author writes $\le$, which is all he needs.
The next equality is just the moment generating function of the Poisson distribution, $E(e^{sU})=e^{\lambda(e^{s}-1)}$, applied with $s=t$ and $s=-t$.
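A quick numeric check of [3] combined with that moment generating function (my own choices of $\lambda$ and $t$; SciPy just supplies the pmf):

```python
import numpy as np
from scipy.stats import poisson

lam, t = 30.0, 0.1
u = np.arange(400)                                   # truncation; the tail is negligible
pmf = poisson.pmf(u, lam)

lhs = np.sum(np.exp(t * np.abs(u - lam)) * pmf)      # E exp(t|U - lam|)
rhs = np.exp(lam * (np.exp(t) - 1) - lam * t) \
    + np.exp(lam * (np.exp(-t) - 1) + lam * t)       # E e^{t(U-lam)} + E e^{t(lam-U)}
print(lhs, "<=", rhs)                                 # the bound holds
```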
[4] The first move is the Chernoff bound, rewritten to fit into one line. Then, use [2]. Next, the author uses the bound $\ln(1+x)\le x$ (for $x$ close to $0$ this is the familiar approximation $\ln(1+x)\approx x$, but as an inequality it holds for all $x>-1$, so nothing is lost). In the last part he simply bounds the factor $(1+x)$ above by $2$, purely out of laziness.
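For reference, the generic form of that first move, for any random variable $Z$ and any $t>0$ (Markov's inequality applied to $e^{tZ}$):
$$P(Z \ge \epsilon) = P\left(e^{tZ} \ge e^{t\epsilon}\right) \le e^{-t\epsilon}\, E\left(e^{tZ}\right).$$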
[5] The final step (4). He uses (2), but suddenly $N-n$ appears out of nowhere. This comes from the coupling the paper sets up: $X_i$ and $X_i'$ are built from the same sequence, $X_i = \sum_{j=1}^n 1_{U_j=i}$ and $X_i' = \sum_{j=1}^N 1_{U_j=i}$, so they differ only in the draws between $\min(n,N)$ and $\max(n,N)$. Summing over $i$, every one of those extra draws is counted exactly once, hence $\sum_{i=1}^k |X_i - X_i'| = |N-n|$. (Note also that $\sum_i X_i = n$ deterministically, while $\sum_i X_i' = N$ is random.)
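And a tiny sketch of that coupling identity (again my own variable names), which holds draw by draw, not just on average:

```python
import numpy as np

rng = np.random.default_rng(1)
n, k, p = 40, 3, np.array([0.1, 0.3, 0.6])

for _ in range(5):
    N = rng.poisson(n)
    U = rng.choice(k, size=max(n, N), p=p)       # one shared iid sequence U_1, U_2, ...
    X  = np.bincount(U[:n], minlength=k)         # counts in U_1..U_n  (multinomial)
    Xp = np.bincount(U[:N], minlength=k)         # counts in U_1..U_N  (Poissonized)
    print(np.abs(X - Xp).sum(), "==", abs(N - n))
```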
Everything else is then straightforward.
Actually, he also uses a Taylor series somewhere (I forget exactly where), namely the expansion of $\ln z$ around $z=1$: $$\ln z = \frac{(z-1)^1}{1} - \frac{(z-1)^2}{2} + \frac{(z-1)^3}{3} - \cdots$$
Good luck!