Upper bound of difference of two discrete probability distributions


In the context of proving the Poisson approximation of the Binomial distribution, we have a lemma that derives a particular upper bound needed later on.

Let $\Omega$ be a discrete sample space and let $X,Y$ be two $\mathbb{N}$-valued random variables on it, with probability mass functions $f:\mathbb{N}\to[0,1]$ and $g:\mathbb{N}\to[0,1]$, respectively. In other words, $P(X=k)=f(k)$ and $P(Y=k)=g(k)$.

Let $A$ be a set such that $A\subseteq\{\omega\in\Omega\mid X(\omega)=Y(\omega)\}$. Then we prove $$ \sum\limits_{k=0}^{\infty}|f(k)-g(k)|\leq 2P(A^c), $$ where $A^c:=\Omega\setminus A$.
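As a quick numerical sanity check (not a proof), the inequality can be verified on a toy coupling. The setup below is entirely illustrative: $Y$ is obtained from $X$ by re-drawing a small fraction of outcomes, the empirical distributions play the roles of $f$ and $g$, and $A$ is taken to be $\{X=Y\}$ itself (so $A^c=\{X\neq Y\}$).

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical coupling: N equally likely outcomes omega; Y agrees with X
# except on a ~5% fraction of outcomes that are re-drawn from a Poisson.
N = 10_000
X = rng.binomial(n=20, p=0.1, size=N)
Y = X.copy()
flip = rng.random(N) < 0.05
Y[flip] = rng.poisson(lam=2.0, size=flip.sum())

# Empirical pmfs f, g over {0, ..., K-1}.
K = max(X.max(), Y.max()) + 1
f = np.bincount(X, minlength=K) / N
g = np.bincount(Y, minlength=K) / N

lhs = np.abs(f - g).sum()      # sum_k |f(k) - g(k)|
rhs = 2 * np.mean(X != Y)      # 2 P(A^c) with A = {X = Y}

assert lhs <= rhs + 1e-12      # the lemma's bound holds
```

Since the empirical distributions are exactly the pmfs of $X$ and $Y$ under the uniform measure on the samples, the lemma applies verbatim and the assertion must pass.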


The proof from the lecture is as follows:

Let $M:=\{k\in\mathbb{N}\mid f(k)>g(k)\}$; then

\begin{align*} \sum\limits_{k=0}^{\infty}|f(k)-g(k)| &=\sum\limits_{k\in M}\bigl(f(k)-g(k)\bigr)-\sum\limits_{k\notin M}\bigl(f(k)-g(k)\bigr)\\ &=2\sum\limits_{k\in M}\bigl(f(k)-g(k)\bigr)-\sum\limits_{k=0}^{\infty}\bigl(f(k)-g(k)\bigr)\\ &=2\bigl(P(X\in M)-P(Y\in M)\bigr)-(1-1)\\ &=2\bigl(P(X\in M,A)+P(X\in M,A^c)-P(Y\in M)\bigr)\\ &\leq 2\bigl(P(X\in M,A)+P(A^c)-P(Y\in M)\bigr)\\ &=2\bigl(P(Y\in M,A)+P(A^c)-P(Y\in M)\bigr)\\ &\leq 2P(A^c), \end{align*}

where the final inequality uses $P(Y\in M,A)\leq P(Y\in M)$.


I don't understand why $P(X\in M,A)=P(Y\in M,A)$ holds. To me it seems wrong.

For all $\omega \in\{Y\in M\}$ we know that $g(Y(\omega))<f(Y(\omega))$, so this also holds on the subset $\{Y\in M\}\cap A$.

Best answer:

First, a remark on the fourth equality (an error there propagates through the rest of the derivation): it must read

$$P(X\in M) = P(X\in M, A) \color{green}{+} P(X\in M, A^c),$$

with a plus sign, not a minus sign.

Now for your question, first note that the event inside $P(X\in M, A)$ is in fact $\{\omega\in\Omega: X(\omega)\in M\text{ and } \omega\in A\}.$ But since $\omega\in A$ implies that $X(\omega)=Y(\omega)$, then $X(\omega)\in M$ also means $Y(\omega)\in M$. Hence, you have $\{\omega\in\Omega: X(\omega)\in M\text{ and } \omega\in A\}=\{\omega\in\Omega: Y(\omega)\in M\text{ and } \omega\in A\}$, so that $P(X\in M, A)=P(Y\in M, A)$.
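The pointwise identity of the two events can also be checked numerically. The snippet below is an illustrative sketch with a hypothetical finite coupling: it builds $M=\{k : f(k)>g(k)\}$ from the empirical pmfs, takes $A=\{X=Y\}$, and verifies that $\{X\in M\}\cap A$ and $\{Y\in M\}\cap A$ are the same set of outcomes.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical finite setup: N outcomes; Y agrees with X off a ~20% fraction.
N = 1_000
X = rng.integers(0, 10, size=N)
Y = X.copy()
flip = rng.random(N) < 0.2
Y[flip] = rng.integers(0, 10, size=flip.sum())

f = np.bincount(X, minlength=10) / N
g = np.bincount(Y, minlength=10) / N
M = np.flatnonzero(f > g)            # M = {k : f(k) > g(k)}

A = (X == Y)                         # take A = {X = Y} itself
left = np.isin(X, M) & A             # {X in M} ∩ A
right = np.isin(Y, M) & A            # {Y in M} ∩ A

assert np.array_equal(left, right)   # the two events coincide outcome by outcome
```

On $A$ we have $X(\omega)=Y(\omega)$, so the two indicator arrays agree there; off $A$ both are masked to `False`. Hence the assertion must pass, mirroring the set equality in the answer.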

The idea is that you purposely make the set $A$ appear (restricting to the $\omega$'s for which $X(\omega)=Y(\omega)$), so that you can cancel the $P(Y\in M)$ term and be left with only $P(A^c)$.