Probability density confusion in $p(z|x)$ when $z = x + y$: is $p(z|x) = p_{y|x}(z-x) = p(y|x)$?


I am reading on conditional pdf and encountered this example below.

For random variables $x$, $y$, and $z$,

"If $z = x + y$, then $p(z|x) = p_{y|x}(z - x)$."

I tried to make sense of this with the following reasoning:

\begin{align} p(z|x) &= \frac{p(z, x)}{p(x)} \\ &= \frac{p(x+y, x)}{p(x)} \\ &= \frac{p(y, x)}{p(x)} \qquad\qquad\qquad \text{since p(x+y, x) = p(y,x)?}\\ &= p(y|x). \end{align}

My first confusion is for the notation:

(1) From above, it seems that $p_{y|x}(z-x) = p(y|x)$. But I don't know what the argument $z-x$ is there for. For example, is $p_{y|x}(z-x) = p_{y|x}(y)$, and $p(y|x) = p(y|x)(y)$? I do not understand why $z-x$ is used instead of $y$ and feel like I am missing some important concept here.

Now, the book says if $x$ and $y$ are independent, it becomes:

$$ p(z|x) = p_y(z-x). $$

(2) Again, I do not understand why the particular argument $z-x$ is used to denote this. Isn't that basically just $p(y)$? Are they different somehow?

Next, the example shows the same thing. When $x$ and $y$ are independent Gaussian random variables with means $\mu_x$, $\mu_y$ and variances $\sigma_x^2$, $\sigma_y^2$,

$$ p(z|x) = p_y(z-x) = \mathcal{N}(z-x; \mu_y, \sigma_y^2), $$

and it says that $p(z|x)$ is the pdf of $z-x$ with mean $\mu_y$ and variance $\sigma_y^2$. I don't understand why the author takes care to distinguish between $p(y)$ and $p_y(z-x)$, unless there is some subtle difference I am not seeing at the moment.

Perhaps it's just a style? But since I am a novice, I cannot escape the feeling of missing something. Could anyone please clarify it or tell me what I am missing here?


There are 2 best solutions below

2
On BEST ANSWER

The author is abusing notation: using what looks like the same notation to mean two different things. Specifically, the author writes $p(z\mid x)$ to mean $p_{Z\mid X}(z\mid x)$ and writes $p(y\mid x)$ to mean $p_{Y\mid X}(y\mid x)$. (Note -- I'm using capital letters to denote the random variable and lower case to denote the value the random variable takes; I dunno if the author makes this distinction.) In essence, the author is being lazy and omitting subscripts when "the interpretation is clear". This can lead to confusion when you are not aware that this convention is being adopted.

If you are aware that this is what is going on, the derivations should make more sense. So read the first argument this way: If $Z=X+Y$, then $$\begin{aligned} p_{Z\mid X}(z\mid x)&=p(Z=z\mid X=x)\\& = p(X+Y=z\mid X=x)\\&=p(Y=z-x\mid X=x)\\&=p_{Y\mid X}(z-x\mid x)\end{aligned}$$ If $X$ and $Y$ are independent, then $p_{Y\mid X}(y\mid x) = p_Y(y)$ for any $x$, which explains the second conclusion: $$p_{Y\mid X}(z-x\mid x) = p_Y(z-x) \qquad \text{when $X$ and $Y$ are independent.}$$
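As a numerical sanity check of the independent Gaussian case from the question, here is a minimal Python sketch. It is only an illustration under stated assumptions: the function names and the parameter values are made up, and it relies on the standard fact that for independent Gaussian $X$ and $Y$, the pair $(Z, X)$ with $Z = X + Y$ is jointly Gaussian with $\operatorname{Var}(Z) = \sigma_x^2 + \sigma_y^2$ and $\operatorname{Cov}(Z, X) = \sigma_x^2$. It computes $p_{Z\mid X}(z\mid x)$ directly from the definition $p_{Z,X}(z,x)/p_X(x)$ and compares it with $p_Y(z-x)$.

```python
import math


def normal_pdf(v, mean, var):
    """Density of a univariate normal N(mean, var) evaluated at v."""
    return math.exp(-(v - mean) ** 2 / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)


def joint_pdf_zx(z, x, mu_x, mu_y, var_x, var_y):
    """Joint density p_{Z,X}(z, x) for Z = X + Y with X, Y independent Gaussians.

    Assumption: (Z, X) is bivariate normal with covariance [[a, b], [b, c]].
    """
    a = var_x + var_y  # Var(Z)
    b = var_x          # Cov(Z, X)
    c = var_x          # Var(X)
    det = a * c - b * b
    dz = z - (mu_x + mu_y)
    dx = x - mu_x
    # Quadratic form [dz, dx] * inverse(covariance) * [dz, dx]^T
    quad = (c * dz * dz - 2.0 * b * dz * dx + a * dx * dx) / det
    return math.exp(-0.5 * quad) / (2.0 * math.pi * math.sqrt(det))


def conditional_pdf_z_given_x(z, x, mu_x, mu_y, var_x, var_y):
    """p_{Z|X}(z | x) from the definition: joint density over marginal of X."""
    return joint_pdf_zx(z, x, mu_x, mu_y, var_x, var_y) / normal_pdf(x, mu_x, var_x)


# Hypothetical parameter values, chosen only for illustration.
MU_X, MU_Y, VAR_X, VAR_Y = 1.0, -2.0, 0.5, 2.0

for z, x in [(0.0, 1.0), (-1.5, 0.3), (2.0, 2.5)]:
    lhs = conditional_pdf_z_given_x(z, x, MU_X, MU_Y, VAR_X, VAR_Y)
    rhs = normal_pdf(z - x, MU_Y, VAR_Y)  # p_Y evaluated at the point z - x
    print(f"z={z:5.2f} x={x:5.2f}  p(z|x)={lhs:.6f}  p_Y(z-x)={rhs:.6f}")
```

The two printed columns should agree up to floating-point error. The point of the check is that $p_Y$ is a fixed function, and $z - x$ is simply the point at which it is evaluated.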

0
On

$$\begin{align} p(z|x) &= \frac{p(z, x)}{p(x)} \\ &= \frac{p(x+y, x)}{p(x)} \\ &= \frac{p(y, x)}{p(x)} \qquad\qquad\qquad \text{since p(x+y, x) = p(y,x)?}\\ &= p(y|x). \end{align}$$

Your discussion involves random variables $X, Y, Z$ and values $x, z$. You introduced the symbol $y$ without explaining it (e.g., $y = z - x$). Always explain new symbols, and only introduce them if doing so makes things clearer. Here, it does not.

You also confuse the joint distributions. Indeed the notation is a source of confusion. I recommend subscripting all such functions - don't leave it up to the value symbols.

$$\begin{align} p_{\small Z\mid X}(z\mid x) &= \frac{p_{\small Z,X}(z, x)}{p_{\small X}(x)} &&\text{definition of conditional probability} \\ &= \frac{p_{\small X+Y,X}(z, x)}{p_{\small X}(x)} &&\text{because }Z=X+Y \\ &= \frac{p_{\small Y,X}(z-x, x)}{p_{\small X}(x)} &&\text{change of variable} \\ &= p_{\small Y\mid X}(z-x\mid x). \end{align}$$
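For completeness, the "change of variable" step can be spelled out. This is a sketch assuming $(Y, X)$ has a joint density. The map $(y, x) \mapsto (z, x) = (x + y,\, x)$ is linear and invertible, with inverse $(z, x) \mapsto (z - x,\, x)$, and its Jacobian determinant is

$$\left|\det \frac{\partial(z, x)}{\partial(y, x)}\right| = \left|\det \begin{pmatrix} 1 & 1 \\ 0 & 1 \end{pmatrix}\right| = 1,$$

so the usual transformation formula gives

$$p_{\small Z,X}(z, x) = p_{\small Y,X}(z - x,\, x)\cdot 1 = p_{\small Y,X}(z - x,\, x).$$

No extra scaling factor appears only because this determinant equals $1$. For a relation such as $Z = 2Y + X$ the same argument would introduce a factor of $\tfrac{1}{2}$.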