Relation between total variation and KS distance between measures on $[0,1]^d$


Let $P$ and $Q$ be two probability measures on the space $[0,1]^d$, $d \in \{1, 2, \ldots \}$, endowed with the $L_\infty$ norm and the corresponding Borel $\sigma$-field, $\mathcal{B}$. Let $$F_P(\mathbf{u})=P([\mathbf{0},\mathbf{u}]), \, \quad F_Q(\mathbf{u})=Q([\mathbf{0},\mathbf{u}]),$$ denote the distribution functions associated to $P$ and $Q$, respectively. Then, we have that $$ d_{KS}(F_P,F_Q):=\sup_{\mathbf{u}\in[0,1]^d} |F_P(\mathbf{u})-F_Q(\mathbf{u})| \leq \sup_{B \in \mathcal{B}}|P(B)-Q(B)|=:d_{TV}(P,Q). $$ My question is the following: assume $F_P$ and $F_Q$ are Lipschitz continuous, then does (some form of) converse inequality also hold true?

I was reasoning in this way: since $P$ and $Q$ are regular, for every $B \in \mathcal{B}$ and $\epsilon>0$ there exist closed sets $C_{B,\epsilon}^{(P)},C_{B,\epsilon}^{(Q)}$ and open sets $O_{B,\epsilon}^{(P)},O_{B,\epsilon}^{(Q)}$ such that $C_{B,\epsilon}^{(\bullet)} \subset B \subset O_{B,\epsilon}^{(\bullet)}$ and $$ P(O_{B,\epsilon}^{(P)}\setminus C_{B,\epsilon}^{(P)})\leq \epsilon, \quad Q(O_{B,\epsilon}^{(Q)}\setminus C_{B,\epsilon}^{(Q)})\leq \epsilon. $$ Whence, $|P(B)-Q(B)| \leq 2 \epsilon + |P(O_{B,\epsilon}^{(P)})-Q(O_{B,\epsilon}^{(Q)})|.$ Yet, from here it is not clear how to proceed. Maybe cover each open set with balls $\{B_1^\bullet,\ldots,B_{m_\bullet}^\bullet\}$ of radius $\delta$ in the uniform metric? Here we could perhaps exploit the covering number inequality $m_\bullet \leq (3d/\delta)^d$. Observe that each ball is of the form $$ B_i^\bullet=\times_{j=1}^d(u_{i,j}^\bullet-\delta,u_{i,j}^\bullet+\delta), $$ where $\mathbf{u}_i^\bullet=(u_{i,1}^\bullet, \ldots, u_{i,d}^\bullet) \in [0,1]^d$. In particular, by absolute continuity, we could choose $\delta$ such that $$ |F_P(\mathbf{u}_i^Q+\delta \mathbf{1})-F_Q(\mathbf{u}_i^Q-\delta \mathbf{1})|\leq \epsilon', \quad |F_Q(\mathbf{u}_i^P+\delta \mathbf{1})-F_P(\mathbf{u}_i^P-\delta \mathbf{1})|\leq \epsilon' $$ for some arbitrarily small $\epsilon'>0$. But it is still not evident to me that this could lead to a suitable upper bound involving $d_{KS}(F_P,F_Q)$. Do you have any clue?

Best answer

The reverse inequality $d_{TV}(P,Q)\leq d_{KS}(F_P,F_Q)$ does not hold, even for $d=1$ and even when the CDFs are Lipschitz.

A) Infinite family of counterexamples:

Let $U_\epsilon$ denote the uniform measure supported on $A_\epsilon=[\tfrac{1-\epsilon}{2},\tfrac{1+\epsilon}{2}]\subseteq[0,1]$ for $0<\epsilon\leq 1$. Its CDF is continuous and even Lipschitz, since the density of $U_\epsilon$ is bounded by $\tfrac{1}{\epsilon}$. The CDF is given by $$ F_{\epsilon}(x) = \begin{cases} 0 & x \in \left(-\infty,\tfrac{1-\epsilon}{2}\right],\\ \tfrac{1}{\epsilon}(x-\tfrac{1}{2})+\tfrac{1}{2} & x\in \left[\tfrac{1-\epsilon}{2},\tfrac{1+\epsilon}{2}\right],\\ 1 & x\in \left[\tfrac{1+\epsilon}{2},\infty\right). \end{cases} $$

First note that, $$ d_{TV}(U_\epsilon,U_1)\!=\!\sup_{B\in\mathcal B} |U_\epsilon(B)-U_1(B)|\!\geq\! |U_\epsilon(A_\epsilon)-U_1(A_\epsilon)| \!=\!1-\epsilon. $$
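As a side remark that the argument does not need, this lower bound appears to be an equality. Assuming the standard identity $d_{TV}(P,Q)=\tfrac{1}{2}\int|p-q|$ for measures with densities $p$ and $q$ (here $p=\tfrac{1}{\epsilon}\mathbf{1}_{A_\epsilon}$ and $q=\mathbf{1}_{[0,1]}$), one gets $$ d_{TV}(U_\epsilon,U_1) = \frac{1}{2}\left[\int_{A_\epsilon}\left(\tfrac{1}{\epsilon}-1\right)dx + \int_{[0,1]\setminus A_\epsilon} 1\,dx\right] = \frac{1}{2}\left[(1-\epsilon)+(1-\epsilon)\right] = 1-\epsilon. $$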

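Next, note that $|F_1-F_\epsilon|$ attains its maximum at $x^*_{\pm}=\tfrac{1\pm\epsilon}{2}$: the difference $F_1-F_\epsilon$ is piecewise linear, increasing on $[0,x^*_-]$, non-increasing on $A_\epsilon$ (slope $1-\tfrac{1}{\epsilon}\leq 0$), and increasing again on $[x^*_+,1]$. In particular, since $F_\epsilon(x^*_-)=0$ and $F_1(x)=x$ on $[0,1]$, we have $|F_1(x^*_-)-F_\epsilon(x^*_-)|=F_1(x^*_-)=x^*_-$ and so $$ d_{KS}(F_\epsilon,F_1) = \sup_{t\in[0,1]} |F_\epsilon(t)-F_1(t)| = F_1(x^*_-) = \frac{1-\epsilon}{2}. $$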

The strict inequality $d_{KS}(F_\epsilon,F_1)<d_{TV}(U_\epsilon,U_1)$ holds as soon as $\tfrac{1-\epsilon}{2}<1-\epsilon$, or equivalently, $\epsilon<1$. Thus, if we choose $P=U_\epsilon$ and $Q=U_1$, where $0<\epsilon<1$, then the CDFs are Lipschitz and $d_{KS}(F_P,F_Q)<d_{TV}(P,Q)$.
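The values in A) can be sanity-checked numerically. The following is only a minimal sketch in plain Python: the helper names (`cdf_uniform`, `pdf_uniform`, `ks_distance`, `tv_distance`) and the grid size are illustrative choices, not part of the argument, and the total variation distance is approximated through the identity $d_{TV}=\tfrac{1}{2}\int|p-q|$, which assumes both measures have densities.

```python
def cdf_uniform(x, eps):
    """CDF of the uniform measure on [(1 - eps) / 2, (1 + eps) / 2]."""
    return min(1.0, max(0.0, (x - 0.5) / eps + 0.5))


def pdf_uniform(x, eps):
    """Lebesgue density of the same uniform measure."""
    lo, hi = (1 - eps) / 2, (1 + eps) / 2
    return 1.0 / eps if lo <= x <= hi else 0.0


def ks_distance(eps, n=100_000):
    """Grid approximation of sup over [0, 1] of |F_eps - F_1|."""
    return max(
        abs(cdf_uniform(k / n, eps) - cdf_uniform(k / n, 1.0))
        for k in range(n + 1)
    )


def tv_distance(eps, n=100_000):
    """Midpoint-rule approximation of 0.5 * integral of |p_eps - p_1| on [0, 1]."""
    total = sum(
        abs(pdf_uniform((k + 0.5) / n, eps) - pdf_uniform((k + 0.5) / n, 1.0))
        for k in range(n)
    )
    return 0.5 * total / n


if __name__ == "__main__":
    for eps in (0.1, 0.5, 0.9):
        # Compare against the closed forms (1 - eps) / 2 and 1 - eps.
        print(eps, ks_distance(eps), (1 - eps) / 2, tv_distance(eps), 1 - eps)
```

Up to discretization error, the two approximations should agree with $\tfrac{1-\epsilon}{2}$ and $1-\epsilon$ respectively, so the KS distance is half the total variation distance in this family.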

B) Counterexample with both measures supported on [0,1]:

To produce such counterexamples, we can simply perturb the previous ones. Consider the mixture of uniform measures $W_\epsilon = \epsilon U_1 + (1-\epsilon) U_\epsilon$. Note that $\operatorname{supp} W_\epsilon = [0,1]$ for all $0<\epsilon\leq 1$. Moreover, its CDF is given by $$ G_\epsilon(x)=W_\epsilon([0,x])= \epsilon F_1(x) + (1-\epsilon) F_\epsilon(x). $$

Since $W_1=U_1$ and $G_1=F_1$, we have $W_\epsilon-W_1 = (1-\epsilon)(U_\epsilon-U_1)$, so $|W_\epsilon-W_1| = (1-\epsilon)|U_\epsilon-U_1|$. Similarly, $|G_\epsilon-G_1| = (1-\epsilon)|F_\epsilon-F_1|$. Thus, $$ \begin{align*} d_{TV}(W_\epsilon,W_1) &= (1-\epsilon)d_{TV}(U_\epsilon,U_1), \\ d_{KS}(G_\epsilon,G_1) &= (1-\epsilon)d_{KS}(F_\epsilon,F_1). \end{align*} $$

By using our results in A), we obtain $$ \begin{align*} d_{TV}(W_\epsilon,W_1) &= (1-\epsilon)d_{TV}(U_\epsilon,U_1)\geq (1-\epsilon)^2, \\ d_{KS}(G_\epsilon,G_1) &= (1-\epsilon)d_{KS}(F_\epsilon,F_1)= \frac{(1-\epsilon)^2}{2}. \end{align*} $$

Thus, $d_{KS}(G_\epsilon,G_1)<d_{TV}(W_\epsilon,W_1)$ since $\tfrac{1}{2}(1-\epsilon)^2<(1-\epsilon)^2$ holds for all $\epsilon\neq 1$.
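The same kind of numerical check can be run for the mixtures in B). Again this is only a sketch under stated assumptions: the function names (`cdf_mixture`, `pdf_mixture`, `ks_mixture`, `tv_mixture`) are hypothetical helpers introduced for illustration, and the total variation distance is approximated via $\tfrac{1}{2}\int|p-q|$, which relies on both mixtures having densities.

```python
def cdf_uniform(x, eps):
    """CDF of the uniform measure on [(1 - eps) / 2, (1 + eps) / 2]."""
    return min(1.0, max(0.0, (x - 0.5) / eps + 0.5))


def pdf_uniform(x, eps):
    """Lebesgue density of the same uniform measure."""
    lo, hi = (1 - eps) / 2, (1 + eps) / 2
    return 1.0 / eps if lo <= x <= hi else 0.0


def cdf_mixture(x, eps):
    """CDF of the mixture eps * U_1 + (1 - eps) * U_eps."""
    return eps * cdf_uniform(x, 1.0) + (1 - eps) * cdf_uniform(x, eps)


def pdf_mixture(x, eps):
    """Density of the mixture eps * U_1 + (1 - eps) * U_eps."""
    return eps * pdf_uniform(x, 1.0) + (1 - eps) * pdf_uniform(x, eps)


def ks_mixture(eps, n=100_000):
    """Grid approximation of sup over [0, 1] of |G_eps - G_1|."""
    return max(
        abs(cdf_mixture(k / n, eps) - cdf_mixture(k / n, 1.0))
        for k in range(n + 1)
    )


def tv_mixture(eps, n=100_000):
    """Midpoint-rule approximation of 0.5 * integral of |w_eps - w_1| on [0, 1]."""
    total = sum(
        abs(pdf_mixture((k + 0.5) / n, eps) - pdf_mixture((k + 0.5) / n, 1.0))
        for k in range(n)
    )
    return 0.5 * total / n


if __name__ == "__main__":
    for eps in (0.1, 0.5, 0.9):
        # Compare against (1 - eps)^2 / 2 and (1 - eps)^2.
        print(eps, ks_mixture(eps), (1 - eps) ** 2 / 2, tv_mixture(eps), (1 - eps) ** 2)
```

Up to discretization error, the output should be consistent with $d_{KS}(G_\epsilon,G_1)=\tfrac{1}{2}(1-\epsilon)^2$ and with the lower bound $d_{TV}(W_\epsilon,W_1)\geq(1-\epsilon)^2$.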