I am reading "Information Theory and Statistics: A Tutorial" by Csiszár and Shields. In Chapter 3, in the proof of Theorem 3, it is assumed that there is some $t<0$ such that $P_t=tP+(1-t)P^*\in \mathcal{L}$, where $\mathcal{L}$ is a linear family of probability distributions. I am not able to show for which values of $t<0$ the mixture $P_t$ is a probability distribution. Does anyone have an idea how to find such a $t$, or how to prove that $P_t$ is a probability distribution for some $t<0$?
For what values of $t<0$ is $P_t=tP+(1-t)P^*$ a probability distribution if $P$ and $P^*$ are probability distributions?
The crucial ingredient is Theorem 3.1, which shows that for an $I$-projection onto a convex family, the support of the projection equals the support of the family (implicitly working with finite alphabets throughout). That is, if the family $\Pi$ is convex and $P^* \in \Pi$ satisfies $D(P^*\|Q) = \min_{P \in \Pi} D(P\|Q)$ for some distribution $Q$, then for any $P \in \Pi$ and any alphabet symbol $a$, $P(a) > 0$ implies $P^*(a) > 0$.
Now, linear families are convex, and, as geetha290krm notes, $P_t$ is a valid distribution if $P_t(a) \ge 0$ for every symbol $a$ (the other necessary properties hold for every real $t$), or equivalently, if for every letter $a$, $ t (P^*(a) - P(a)) \le P^*(a).$ If $P^*(a) = 0,$ then $P(a) = 0,$ and this is trivially satisfied for any value of $t$. On the other hand, if $P(a) > 0,$ then $P^*(a) > 0,$ and so as long as $|t|$ is small enough, the condition will hold. In particular, the condition becomes $$ \min_{a : P^*(a) > P(a)} \frac{P^*(a)}{P^*(a) - P(a)} \ge t \ge \max_{a : P(a) > P^*(a) > 0}-\frac{P^*(a)}{P(a) - P^*(a)},$$ and for any $t$ in this range, $P_t$ is a distribution. Since the alphabet is finite and every term in the maximum is strictly negative, the lower bound is strictly negative, so the range contains some $t < 0$.
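To make the range above concrete, here is a minimal Python sketch for a finite alphabet. The function names `admissible_t_range` and `mix` and the example distributions are hypothetical, chosen only for illustration; they do not come from the tutorial. Exact rational arithmetic is used so that the endpoints are not blurred by rounding.

```python
from fractions import Fraction
import math


def admissible_t_range(p, p_star):
    """Return (t_lo, t_hi): the t for which t*p + (1-t)*p_star has no negative entry.

    p and p_star are equal-length sequences of probabilities on a finite alphabet.
    Hypothetical helper, written only to illustrate the inequality
    t * (P*(a) - P(a)) <= P*(a) for every symbol a.
    """
    t_lo, t_hi = -math.inf, math.inf
    for pa, qa in zip(p, p_star):
        diff = qa - pa  # P*(a) - P(a)
        if diff > 0:
            # Upper constraint: t <= P*(a) / (P*(a) - P(a))
            t_hi = min(t_hi, qa / diff)
        elif diff < 0:
            # Lower constraint: t >= -P*(a) / (P(a) - P*(a))
            t_lo = max(t_lo, qa / diff)
        # diff == 0: P_t(a) = P*(a) >= 0 for every t, so no constraint.
    return t_lo, t_hi


def mix(t, p, p_star):
    """Compute P_t = t*P + (1-t)*P* entry by entry."""
    return [t * pa + (1 - t) * qa for pa, qa in zip(p, p_star)]


# Illustrative distributions (an assumption, not from the source).
P = [Fraction(1, 2), Fraction(1, 4), Fraction(1, 4)]
P_STAR = [Fraction(1, 4), Fraction(1, 4), Fraction(1, 2)]

t_lo, t_hi = admissible_t_range(P, P_STAR)
print(t_lo, t_hi)
print(mix(t_lo, P, P_STAR))
```

For these illustrative inputs the range works out to $-1 \le t \le 2$, and at $t = -1$ the mixture is $(0, \frac14, \frac34)$, which sits on the boundary of the simplex. If $P^*(a) = 0$ while $P(a) > 0$ for some $a$, the same function returns a lower bound of $0$, so no negative $t$ is admissible.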
$P_t$ is countably additive and $P_t(\Omega)=1$ for every real number $t$. What is needed is $P_t(A) \geq 0$ for all $A$. This translates to $t (P^{*}(A)-P(A))\leq P^{*}(A)$ (or, for $t<0$, $|t| (P(A)-P^{*}(A)) \leq P^{*}(A)$) for every $A$. In general there is no guarantee that such a negative number $t$ exists.
For example, we might have $P^{*}(A)=0$ but $P(A) \neq 0$ (which means $P$ is not absolutely continuous w.r.t. $P^{*}$). In that case no such $t$ exists.
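A concrete instance of this failure, with numbers chosen purely for illustration: on the two-point space $\Omega=\{1,2\}$ take $P^{*}=(1,0)$ and $P=(\frac 1 2,\frac 1 2)$. Then $$P_t(\{2\}) = t\cdot \tfrac 1 2 + (1-t)\cdot 0 = \tfrac t 2 < 0 \quad \text{for every } t<0,$$ so $P_t$ is not a probability distribution for any negative $t$.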
Here is one example to show that $t$ can exist in some cases: let $f(x)=1$ for $0<x<1$ and $0$ for all other $x$. Let $g(x)=\frac 1 3$ for $0 <x<1$, $\frac 2 3$ for $1 <x<2$, and $0$ for all other $x$. Let $P$ have density $f$ and $P^{*}$ have density $g$. Take $t=-\frac 1 2$. You can check that $P_t$ is non-negative in this case by noting that $3g \geq f$.
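Spelling out that check as a worked computation, using only the densities $f$ and $g$ defined in the example: with $t=-\frac 1 2$ the density of $P_t$ is $$ t f + (1-t) g = \frac{3g - f}{2} = \begin{cases} \frac{1}{2}\left(3\cdot\frac 1 3 - 1\right) = 0, & 0<x<1, \\ \frac{1}{2}\left(3\cdot\frac 2 3 - 0\right) = 1, & 1<x<2, \end{cases}$$ and $0$ elsewhere, so $P_t$ is the uniform distribution on $(1,2)$. For general $t$ the density on $(0,1)$ is $t + \frac{1-t}{3} = \frac{1+2t}{3}$, which is non-negative exactly when $t \geq -\frac 1 2$, so $t=-\frac 1 2$ is the most negative value that works in this example.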