I'm working through an example where $X$ and $Y$ are exponential random variables. We are computing $P(Y>X)$, and the first step is $$ P(Y>X) = \int_{0}^{\infty} f_{X}(x)\,P(Y>X \mid X=x)\, dx $$
How does this step make sense rigorously? I think we are conditioning on each value of $X$ and 'summing over all values of $X$', with $f_{X}(x)$ representing the probability of each of those $x$'s?
It is the Law of Total Probability for the continuous case: $$\mathbb P(A) = \int_{-\infty}^{\infty} \mathbb P(A \mid X = x)\, f_X(x)\,dx $$ To understand this I like to think about it as an extension of the discrete case first:
$$\mathbb P(A) = \sum_{x} \mathbb P(A \mid X = x)\,\mathbb P(X=x) $$
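If a probability calculation can be broken into cases, the final answer is the weighted average of the answers for each case, weighting each case by the probability that it holds. In the continuous case $f_X(x)$ is a density, not a probability: it is $f_X(x)\,dx$ that plays the role of the weight $\mathbb P(X=x)$, and the sum becomes an integral.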
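As a sanity check on the formula, here is a minimal sketch for the exponential example. It rests on assumptions that the question does not state: $X$ and $Y$ are independent, with rates $\lambda$ and $\mu$ (names chosen here for illustration). Under independence, $P(Y>X \mid X=x) = P(Y>x) = e^{-\mu x}$, so the integral should work out to
$$ P(Y>X) = \int_{0}^{\infty} \lambda e^{-\lambda x}\, e^{-\mu x}\, dx = \frac{\lambda}{\lambda+\mu} $$
The code below approximates the same integral numerically and compares it with a direct simulation. The function names and default parameters are hypothetical.

```python
import math
import random


def p_y_greater_x_integral(lam, mu, upper=50.0, steps=200_000):
    """Midpoint-rule approximation of the integral of f_X(x) * P(Y > x) on [0, upper].

    Assumes X ~ Exp(lam) and Y ~ Exp(mu) are independent (an assumption,
    not stated in the original question).
    """
    h = upper / steps
    total = 0.0
    for i in range(steps):
        x = (i + 0.5) * h
        density = lam * math.exp(-lam * x)  # f_X(x)
        tail = math.exp(-mu * x)            # P(Y > x), equal to P(Y > X | X = x) under independence
        total += density * tail * h         # weight each "case" x by f_X(x) dx
    return total


def p_y_greater_x_simulated(lam, mu, n=200_000, seed=0):
    """Monte Carlo estimate of P(Y > X) from independent exponential draws."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n):
        x = rng.expovariate(lam)
        y = rng.expovariate(mu)
        if y > x:
            hits += 1
    return hits / n


if __name__ == "__main__":
    lam, mu = 2.0, 3.0  # illustrative rates
    print("closed form:", lam / (lam + mu))
    print("integral   :", p_y_greater_x_integral(lam, mu))
    print("simulation :", p_y_greater_x_simulated(lam, mu))
```

The three numbers should agree up to discretization and sampling error, which illustrates that integrating the conditional probability against the density really is the continuous analogue of the weighted average over cases.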