"Shady" witness for total variation distance

66 Views Asked by At

Intro: Hello! So I am wondering about the following problem. I have three discrete probability distributions (think $d$-dimensional $\ell_1$-normalized vectors with only non-negative entries), namely $P,P'$ and $Q$. I know $P'$ and $Q$ (i.e. have an exact description) and know about the existence of $P$ such that $$TV(P,P')=\frac12\|P-P'\|_1=\epsilon\quad\text{and}\quad TV(P,Q)=\frac12\|P-Q\|_1=\epsilon+\tau$$ for some arbitrary but given $\epsilon,\tau>0$. Clearly, by triangle also $P'$ and $Q$ are at least $\tau$ apart in TV distance. Now I wonder, whether given the above, in particular the knowledge about $P'$ and $Q$, can I construct an event $\mathcal E$ that "witnesses" the TV distance between $P$ and $Q$, in the sense that $$\left\lvert P(\mathcal E) - Q(\mathcal E) \right\rvert \geq \tau\quad ?$$ I believe that $\tau$ should appear on the RHS (rather than $\epsilon+\tau$) as I am only given info about $P'$ and $Q$ of which I only know "for sure" that they are at least $\tau$ separated. Hence, the remaining $\epsilon$ could be distributed in the plane orthogonal to $\vec\xi=Q-P'$ about which I know nothing.

The intuition that I should still be able to whitness atleast the $\tau$ is explained at the bottom. Below I formalize the problem a bit along the lines I was thinking so far. In particular, I was trying to think about the differences between rather than the actual distributions themselves ($\vec\epsilon'=P'-P$ and $\vec\epsilon+\vec\tau=Q-P$).

Setting: Assume $\epsilon,\tau>0$ some integer $d$ and (abusing notation) $d$-dimensional vectors $\vec\epsilon, \vec\epsilon', \vec\tau$ such that $$\|\vec\epsilon\|_1=\|\vec\epsilon'\|_1=2\epsilon \quad\text{and}\quad \|\vec\tau\|_1=2\tau$$ together with $$\sum_i\vec\epsilon_i=\sum_i\vec\epsilon_i'=\sum_i\vec\tau_i=0$$ Further define $$\vec\xi=(\vec\epsilon+\vec\tau)-\vec\epsilon'$$ where the braces should emphasize the interpretation atop.

Question(s): So here are my two concrete approaches to construct such a whitness. So far I could not disporove them but also did not yet succeed in proving the following claims:

  1. Let $\Phi$ be the vector such that ${\Phi}_i=1$ if $\xi_i>0$ and $\Phi_i=-1$ if $\xi_i\leq0$. Then, does it hold that $\Phi^T\cdot(\vec\epsilon+\vec\tau)\geq2\tau$?
  2. Let $\Phi'$ be the vector such that ${\Phi}_i=1$ if $\xi_i>0$ and $0$ elsewhere. Then, does it hold that $\Phi'^T\cdot(\vec\epsilon+\vec\tau)\geq\tau$?

Intuition: While trying to disproof my gut feeling I tried to construct an explicit counter example I observed the following: If for a moment we forget that we are dealing with distributions and focus only on the $\ell_1$ distance, we would find that $P=(1/2,1/2)$, $Q=(1/2+\epsilon+\tau, 1/2-\epsilon-\tau)$ and $P'=(1/2+2\epsilon,1/2)$ implies $\vec\xi=(\tau-\epsilon,-\epsilon-\tau)$ and thus, for $\tau<\epsilon$ this gives $\Phi=(-1,-1)$ and $\Phi'=(0,0)$. This gives $$(Q-P)[\Phi]=0 \quad\text{and}\quad (Q-P)[\Phi']=0$$ where $P[\Phi]=\mathbb E_{x\sim P}[\Phi(x)]$ is the expectation value. A counter example. However, adding in the normalization constraint makes things interesting. Clearly, $P'$ in the example is not normalized. Normalizing it by subtracting the $2\epsilon$ in the other dimension would blow up the TV distance between $P$ and $P'$ to $2\epsilon$ breaking the counter example.

More generally, look at the example with $\Phi$. Then (denoting $\|\vec\xi\|_1=2\xi$): $$\begin{equation} \begin{split} &(P-Q)[\Phi]=(P-P'+P'-Q)[\Phi]\\[10pt] &=\sum_{i:\xi_i>0}P_i-P_i'+\sum_{i:\xi_i>0}P_i'-Q_i - \sum_{i:\xi_i<0}P_i-P_i'-\sum_{i:\xi_i<0}P_i'-Q_i\\[10pt] &=\sum_{i:\xi_i>0}\vec\epsilon_i'-\sum_{i:\xi_i>0}\vec\xi_i -\sum_{i:\xi_i\leq0} \vec\epsilon_i'+ \sum_{i:\xi_i\leq0} \vec\xi_i = \underbrace{\sum_{i:\xi_i>0} \vec\epsilon_i' -\sum_{i:\xi_i\leq0} \vec\epsilon_i'}_{=:\vartheta} -2\xi \end{split} \end{equation}$$ This makes clearer: If $\vartheta$ is positive, this implies that $\vec\epsilon'$ and $\vec\xi$ "dominantly" share sign along individual directions. This in turn implies that along theach such dimension either $P_i'<Q_i<P_i$, $P_i'<P_i<Q_i$, $Q_i<P_i<P_i'$, or $P_i<Q_i<P_i'$. And since all three are distributions those cases must be somewhat "evenly distributed". In particularly, half the time it must hold that $P_i$ is between $P_i'$ and $Q_i$. Now, by standard $\ell_1$ distance arguments, it should hold that $\xi$ equals $\tau$ plus the $|\vec\epsilon_i'|$ of those $i$ where $P_i$ is between $Q_i$ and $P_i'$.

Maybe someone can help me make this intuition rigorous, or, similarly helpful, point me out where my mistake is at?

Best wishes and many thanks!!

1

There are 1 best solutions below

0
On

After all I realized that it is quite simple to come up with a counter example. In particular, its quite straight forward to generalize the counter example of the non-normalized case:

For a distribution $P$ and a function $\Phi$ denote by $P[\Phi]=\mathbb E_{x\sim P}[\Phi(x)]$ the expectation value. Then, lettin

  • $P=(\frac14,\frac14,\frac14,\frac14)$
  • $P'=P+(x,-x,0,0)$
  • $Q=P+(x-y,-x+y,2z,-2z)$

we obtain

  • $\Phi=(-1,1,1,-1)$

and

  • $P[\Phi]-Q[\Phi]=2(3z-x)$

Thus: Setting $x=3z$ we find $P[\Phi]-Q[\Phi]=0$ disproving the claim.