Need a proof of an assumption about a conditional probability density function, based on probability theory


While reading the book "Elements of Information Theory", I came across an assumption used in a proof on page 33. The assumption is as follows.

Let $(X,Y)\sim p(x,y)=p(x)p(y|x)$. "If $p(y|x)$ is fixed, then $p(y)$ is a linear function of $p(x)$."

I can't understand the exact meaning of this "obvious" assumption. Could anyone familiar with probability theory give a correct, concise and detailed proof of it? Please make clear in the proof: 1) What exactly does a fixed conditional density function mean? Personally, I understand that "fixed" does not mean a fixed constant but a fixed mapping from one set/space to another, but how can this be expressed concisely? 2) What does "a marginal probability density function is a linear function of another marginal probability density function" mean? Similarly, I think "linear" here does not simply mean $p(y)=c\cdot p(x)$, but how can this linearity be expressed in terms of probability theory?

P.S. One of my friends understood this assumption via Bayes' rule,

$p(y)=\frac{p(y|x)p(x)}{p(x|y)}$.

He thinks that if $p(y|x)$ is fixed, then $p(x|y)$ is fixed, and so is $\frac{p(y|x)}{p(x|y)}$. Thus, $p(y)$ is a linear function of $p(x)$. However, I don't agree with him. Any comments on this point?


There is 1 best solution below


Some words might be missing, and the sloppy notation does not help, but the linearity in question is the following.

Assume that the bivariate distribution $p$ of $(X,Y)$ is such that $p(x,y)=r(x)q(y\mid x)$ for every $(x,y)$ in the state space of $(X,Y)$, where $r$ is the distribution of $X$. (Then $q$ is the conditional distribution of $Y$ given $X$.) Let $s$ denote the distribution of $Y$. Fix some $q$. Then:

The mapping $r\mapsto s$ (distribution of $X$ maps to distribution of $Y$) is linear.

To wit, for every $y$, $s(y)=\sum\limits_xp(x,y)=\sum\limits_xr(x)q(y\mid x)$ hence, if $q$ is fixed, then $s=(s(y))_y$ depends linearly on $r=(r(x))_x$.
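Concretely, "linear" here means that for any two distributions $r_1$, $r_2$ of $X$ and any weight $a\in[0,1]$, the mixture $ar_1+(1-a)r_2$ is mapped to $as_1+(1-a)s_2$, where $s_i(y)=\sum\limits_x r_i(x)q(y\mid x)$. The following is a minimal numerical sketch of that identity in the discrete case. The 2×2 table `q`, the distributions `r1`, `r2`, the weight `a`, and the helper names `marginal_of_y` and `mix` are all made up for illustration; exact fractions are used so the comparison is not affected by rounding.

```python
from fractions import Fraction as F


def marginal_of_y(r, q):
    """Return s with s[y] = sum_x r[x] * q[x][y], for a fixed table q."""
    n_y = len(q[0])
    return [sum(r[x] * q[x][y] for x in range(len(r))) for y in range(n_y)]


def mix(a, u, v):
    """Return the combination a*u + (1 - a)*v, taken entry by entry."""
    return [a * ui + (1 - a) * vi for ui, vi in zip(u, v)]


# Hypothetical fixed conditional distribution q(y | x): row x, column y.
# Each row sums to 1.
q = [[F(9, 10), F(1, 10)],
     [F(2, 10), F(8, 10)]]

# Two hypothetical distributions of X and a mixing weight.
r1 = [F(1, 2), F(1, 2)]
r2 = [F(1, 4), F(3, 4)]
a = F(1, 3)

# Distribution of Y obtained from the mixed distribution of X ...
lhs = marginal_of_y(mix(a, r1, r2), q)
# ... versus the same mixture of the two distributions of Y.
rhs = mix(a, marginal_of_y(r1, q), marginal_of_y(r2, q))

print(lhs == rhs)  # True
```

In matrix terms, with $r$ and $s$ written as row vectors and $Q$ the matrix with entries $q(y\mid x)$, the function `marginal_of_y` computes $s=rQ$; "fixed $q$" means that $Q$ does not change while $r$ varies.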