To find the distribution of Y=T(X), why do we need to compute the CDF of Y first?


The standard way for finding $f_Y$ is like this:

  • find the CDF of $Y$ from the distribution of $X$, which gives $F_Y(y)$

  • $f_Y(y)=F^\prime_Y(y)$

What confuses me is why we cannot compute $f_Y(y)$ directly.

Example:

$f_X(x)=x$ for $x\in[0,\sqrt2]$

$Y=2X$

$f_Y(y)=f(Y=y)=f(2X=y)=f(X=\frac{y}{2})=\frac{y}{2}$ for $y\in[0,2\sqrt2]$

This is wrong because $\int_{0}^{2\sqrt2} \frac{y}{2}\, dy = 2 \neq 1$.

Why is computing $f_Y(y)$ directly like this wrong?


There are 2 best solutions below

0
On

The misconception here is treating the probability density function $f_Y(y)$ as if it were a probability mass function $\mathbb{P}(Y=y)$. In fact, whenever $Y$ is a continuous RV with density $f_Y(y)$, we have $\mathbb{P}(Y=y)=0$ at every single point $y$ (can you intuit why?). Densities are different objects and do not represent probabilities: there are valid density functions $f(x)$ that integrate to $1$ yet have $f(x)>1$ for some $x$ (can you think of any familiar ones or make up an example?). Thus manipulations like $$f(Y=y)=f(2X=y) \text{ [...]}$$ are total nonsense.

To be sure: $\{Y=y\}$ is a set, shorthand for $\{\omega \in \Omega: Y(\omega)=y\}$ where $\Omega$ is the sample space, whereas $f_Y$ is a function defined on the reals, not on sets, so the expression $f(Y=y)$ is meaningless. If you swap it for the mass function instead, every such expression vanishes, since $\mathbb{P}(Y=y)=0$ for all $y\in \mathbb{R}$ whenever $Y$ is a continuous RV with density $f_Y(y)$, as noted above. Manipulations of this kind only work with the cumulative distribution function $$F_Y(y):=\mathbb{P}(Y\leq y)=\int_{-\infty}^y f_Y(t)\,\mathrm{d}t.$$
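To make the distinction between a density and a probability concrete, here is a small sketch using the $X$ from the question, whose CDF is $F_X(x)=x^2/2$ on $[0,\sqrt2]$. The helper names `F_X` and `interval_prob` and the point `x0 = 1.2` are my own choices for illustration. The probability of a shrinking interval goes to $0$, while the ratio probability/width settles near the density value $f_X(1.2)=1.2$, which is not itself a probability.

```python
import math


def F_X(x):
    """CDF of X for the density f_X(x) = x on [0, sqrt(2)]."""
    if x <= 0.0:
        return 0.0
    if x >= math.sqrt(2):
        return 1.0
    return x * x / 2.0


def interval_prob(x, h):
    """P(x < X <= x + h), computed from the CDF."""
    return F_X(x + h) - F_X(x)


x0 = 1.2  # arbitrary point inside the support (illustrative choice)
for h in (1e-1, 1e-2, 1e-3, 1e-4):
    p = interval_prob(x0, h)
    # p shrinks with h, while p / h approaches the density value at x0
    print(f"h={h:g}  P={p:.6g}  P/h={p / h:.6g}")
```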

There is, however, a transformation method that avoids computing CDFs and differentiating. I will outline it for this example. Let $W=g(Y)$ where $Y=2X$ and $g$ is any bounded (and measurable) function on the real line. Consider the expectation of $W$, $$\mathbb{E}(W)=\mathbb{E}(g(Y))=\int_{\mathbb{R}} g(y) f_Y(y)\mathrm{d}y=\int_{\mathbb{R}} g(2x) f_X(x)\mathrm{d}x.$$ Now perform a substitution. If $u=2x$, so that $\mathrm{d}x=\tfrac12\mathrm{d}u$, then $$\int_{\mathbb{R}} g(y) f_Y(y)\mathrm{d}y=\int_{\mathbb{R}} g(u) \frac12 f_X(u/2)\mathrm{d}u$$ and since this holds for all bounded measurable functions $g$, we have that $$f_Y(y)=\frac12 f_X(y/2).$$ [I have ignored the bounds in your example, but the principle is the same, with some extra care to change the limits of integration during the substitution.]
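If you want to see the identity $\mathbb{E}(g(Y))=\int g(y) f_Y(y)\,\mathrm{d}y$ at work numerically, the following is a rough sketch using only the Python standard library. It assumes the $X$ from the question (sampled by inverting $F_X(x)=x^2/2$), uses $g=\cos$ as an arbitrary bounded test function, and compares a Monte Carlo average of $g(2X)$ with a midpoint-rule integral against the candidate density $y/4$. All function names are hypothetical, chosen just for this illustration.

```python
import math
import random


def sample_X(rng):
    """Inverse-CDF sampling: F_X(x) = x^2 / 2 on [0, sqrt(2)], so X = sqrt(2U)."""
    return math.sqrt(2.0 * rng.random())


def f_Y(y):
    """Candidate density (1/2) f_X(y/2) = y/4 on [0, 2*sqrt(2)], zero elsewhere."""
    return y / 4.0 if 0.0 <= y <= 2.0 * math.sqrt(2) else 0.0


def g(y):
    """An arbitrary bounded test function (illustrative choice)."""
    return math.cos(y)


def monte_carlo_mean(n, seed=0):
    """Estimate E[g(2X)] by averaging over n simulated draws of X."""
    rng = random.Random(seed)
    return sum(g(2.0 * sample_X(rng)) for _ in range(n)) / n


def integral_g_fY(n_cells=100_000):
    """Midpoint rule for the integral of g(y) f_Y(y) over [0, 2*sqrt(2)]."""
    b = 2.0 * math.sqrt(2)
    w = b / n_cells
    return sum(g((i + 0.5) * w) * f_Y((i + 0.5) * w) for i in range(n_cells)) * w


mc = monte_carlo_mean(200_000)
exact = integral_g_fY()
print(mc, exact)  # the two values should agree up to Monte Carlo noise
```

Agreement for one choice of $g$ is of course not a proof; the argument above needs it for all bounded measurable $g$. Swapping in other bounded functions for `g` is an easy way to build confidence.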

One can quickly check this coincides with the answer obtained by the CDF-derivative method, and in fact you should do so. Please comment for further clarifications or if you spot any errors/typos!
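For reference, here is how that check might go for this example, with the bounds put back in. For $0\le y\le 2\sqrt2$, $$F_Y(y)=\mathbb{P}(2X\leq y)=\mathbb{P}\left(X\leq \tfrac y2\right)=\int_0^{y/2} x\,\mathrm{d}x=\frac{y^2}{8},$$ so that $$f_Y(y)=F_Y^\prime(y)=\frac y4=\frac12 f_X\left(\frac y2\right),$$ with $f_Y(y)=0$ outside $[0,2\sqrt2]$.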

0
On

What confuses me is why we cannot compute $f_Y(y)$ directly.

We can, and often do, but a little more care is needed. The pdf is the derivative of the CDF, so a change of variable brings in the derivative of $T^{-1}$, taken in absolute value so that the density stays non-negative.

Assuming $T$ is an invertible (and differentiable) function, we have:

$$\begin{align}f_Y(y) &=\dfrac{\mathrm d~~}{\mathrm d y}F_Y(y) \\&= f_X(T^{-1}(y))\left\lvert\dfrac{\mathrm d~T^{-1}(y)}{\mathrm d y}\right\rvert\end{align}$$

(It is a little more complicated when $T$ is not invertible, but the same basic idea is involved.)


Take your example: $Y=T(X)$ where $T(x)=2x$ and $f_X(x)=x ~\mathbf 1_{x\in[0;\surd 2]}$

$$\begin{align}f_Y(y) &= \left\lvert \tfrac{\mathrm d (y/2)}{\mathrm d y}\right\rvert\cdot \tfrac y2\,\mathbf 1_{y/2\in[0;\surd 2]} \\&= \tfrac y4\,\mathbf 1_{y\in[0;2\surd 2]}\end{align}$$

And we see that indeed $\int_0^{2\surd 2} \tfrac y4\,\mathrm d y = 1$.
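As a quick numerical sanity check, the formula above can be sketched in a few lines of Python. The helpers `transformed_pdf` and `midpoint_integral` are hypothetical names of my own, and the sketch assumes $T$ is invertible with a known inverse and inverse derivative. It integrates both the corrected density and the naive $f_X(y/2)$ over $[0, 2\surd 2]$; the first should come out close to $1$ and the second close to $2$.

```python
import math

SQRT2 = math.sqrt(2.0)


def f_X(x):
    """Density from the question: f_X(x) = x on [0, sqrt(2)], zero elsewhere."""
    return x if 0.0 <= x <= SQRT2 else 0.0


def transformed_pdf(f, T_inv, dT_inv):
    """Density of Y = T(X) for invertible T: f(T_inv(y)) * |dT_inv(y)|."""
    return lambda y: f(T_inv(y)) * abs(dT_inv(y))


# T(x) = 2x, so T_inv(y) = y / 2 and its derivative is the constant 1/2.
f_Y = transformed_pdf(f_X, lambda y: y / 2.0, lambda y: 0.5)


def midpoint_integral(func, a, b, n=100_000):
    """Midpoint-rule approximation of the integral of func over [a, b]."""
    w = (b - a) / n
    return sum(func(a + (i + 0.5) * w) for i in range(n)) * w


total_correct = midpoint_integral(f_Y, 0.0, 2.0 * SQRT2)
total_naive = midpoint_integral(lambda y: f_X(y / 2.0), 0.0, 2.0 * SQRT2)
print(total_correct, total_naive)
```

The missing factor $\lvert \mathrm d\,T^{-1}(y)/\mathrm d y\rvert = \tfrac12$ is exactly what separates the two totals.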