Change of variable for conditional probability


Suppose I have random variables $X$, $Y$, and $Z$, with $Z \sim N(0, \sigma^2)$ and $Y = kX + Z$. I am looking for a proof of the fact that $$f_{Y\mid X}(y\mid X = x) = \frac{1}{\sqrt{2 \pi} \sigma} \exp\left(\frac{-(y-kx)^2}{2\sigma^2}\right).$$

The only definition that I am aware of for the conditional distribution is $$f_{Y\mid X}(y\mid X = x) = \frac{f_{X,Y}(x,y)}{f_X(x)},$$ and it is not obvious at all how the conclusion should follow from this definition. I suppose one should use some sort of change-of-variable formula.

Some updates:

I am asking this only because the formula is constantly used in statistics; for instance, it is used in Section 3 (probabilistic interpretation) of these notes. But I feel quite confused about the way it is used and have no idea how it follows naturally from the definition.


Best answer

Pretty different approach but same result.

It is understood that independence between $X$ and $Z$ is to be assumed.

$$Z=Y-kX \sim N(0;\sigma^2)$$

$$\frac{Z}{\sigma}=\frac{Y-kX}{\sigma} \sim N(0,1)$$

Given a fixed $X=x$, we have

$$\frac{Y-kx}{\sigma} \sim N(0,1).$$

This is enough to show that $Y\mid X=x \sim N(kx,\sigma^2)$, which means

$$f_{Y\mid X}(y\mid x)=\frac{1}{\sigma\sqrt{2\pi}}e^{-\frac{1}{2\sigma^2}(y-kx)^2}.$$
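The claim above is easy to sanity-check numerically. Below is an illustrative Monte Carlo sketch (not from the original post; the values of $k$, $\sigma$, and $x$ are chosen arbitrarily): fixing $X = x$ and sampling $Z \sim N(0, \sigma^2)$, the resulting $Y = kx + Z$ should have mean $kx$ and standard deviation $\sigma$.

```python
import numpy as np

# Arbitrary illustrative parameters (not from the original post)
k, sigma, x = 2.0, 0.5, 1.3

rng = np.random.default_rng(0)
z = rng.normal(0.0, sigma, size=1_000_000)  # Z ~ N(0, sigma^2)
y = k * x + z                               # Y conditioned on X = x

print(y.mean())  # should be close to k*x = 2.6
print(y.std())   # should be close to sigma = 0.5
```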

Another answer

We assume that $X$ and $Z$ are independent. This is a common assumption in machine learning, where the noise $Z$ is taken to be independent of the data $X$.

If $X$ is known to take constant value $x$, then

\begin{align} Pr(Y\le y|X=x) &= Pr(kX+Z \le y|X=x) \\ &= Pr(Z \le y-kx|X=x)\\ &= Pr(Z \le y-kx) \text{, by independence}\\ &= Pr\left( \frac{Z}{\sigma} \le \frac{y-kx}{\sigma}\right)\\ &= \Phi\left(\frac{y-kx}{\sigma} \right) \end{align}

Hence $Y\mid X=x$ follows a normal distribution with mean $kx$ and standard deviation $\sigma$.
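To connect this back to the definition $f_{Y\mid X} = f_{X,Y}/f_X$ from the question, the change-of-variable step can be made explicit. This is a sketch, assuming $X$ has a density $f_X$ and is independent of $Z$ (as both answers assume): the map $(x,z) \mapsto (x, y) = (x, kx + z)$ has Jacobian determinant $1$, so

```latex
\begin{align}
f_{X,Y}(x,y) &= f_{X,Z}(x,\, y - kx)
  && \text{change of variables, Jacobian } 1 \\
&= f_X(x)\, f_Z(y - kx)
  && \text{independence of } X \text{ and } Z \\
\Rightarrow\quad
f_{Y\mid X}(y \mid x) &= \frac{f_{X,Y}(x,y)}{f_X(x)}
  = f_Z(y - kx)
  = \frac{1}{\sqrt{2\pi}\,\sigma}
    \exp\!\left(\frac{-(y-kx)^2}{2\sigma^2}\right)
\end{align}
```

which is exactly the formula asked about, obtained directly from the definition.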