Joint CDF of two random variables - rectangular region equation


I just started learning about joint CDFs and I think I understand the concept, but I don't understand this equation in my textbook regarding a rectangular region. Why are the top right and bottom left corners summed together and the other corners subtracted? If anyone could provide some insight on why this equation makes sense I would appreciate it. Thank you in advance! [link to graphic in my textbook]


It's a variant of the inclusion-exclusion principle. If we think of the CDF $F(u, v)$ as the total "probability mass" to the left of and below the point $(u, v)$, then the mass within the rectangle $R = (a, b] \times (c, d]$ is the mass in the region $(-\infty, b] \times (-\infty, d]$, minus the masses in the regions $(-\infty, a] \times (-\infty, d]$ and $(a, b] \times (-\infty, c]$ (i.e. everything at or to the left of $u = a$, and everything with $a < u \leq b$ that lies at or below $v = c$). The mass in that last region is in turn the mass in $(-\infty, b] \times (-\infty, c]$ minus the mass in $(-\infty, a] \times (-\infty, c]$. Putting that all together, we get $P((X, Y) \in R) = F(b, d) - F(a, d) - (F(b, c) - F(a, c)) = F(b, d) - F(a, d) - F(b, c) + F(a, c)$, which is the same as the textbook answer once you rearrange it.
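If it helps to see numbers, here is a minimal sketch that checks the four-corner formula against a case where the answer is known directly. The distribution (two independent Exp(1) variables), the corner values, and the names `joint_cdf` and `rect_prob` are all assumptions made for illustration; none of them come from the textbook.

```python
import math

def joint_cdf(x, y):
    """Joint CDF of two independent Exp(1) variables (an assumed example)."""
    fx = 1.0 - math.exp(-x) if x > 0 else 0.0
    fy = 1.0 - math.exp(-y) if y > 0 else 0.0
    return fx * fy

def rect_prob(F, a, b, c, d):
    """P(a < X <= b, c < Y <= d) from a joint CDF F via the four-corner formula."""
    return F(b, d) - F(a, d) - F(b, c) + F(a, c)

# Hypothetical rectangle corners, chosen only for the demonstration.
a, b, c, d = 0.5, 2.0, 1.0, 3.0

via_cdf = rect_prob(joint_cdf, a, b, c, d)

# For independent variables the rectangle probability factorises:
# P(a < X <= b) * P(c < Y <= d).
direct = (math.exp(-a) - math.exp(-b)) * (math.exp(-c) - math.exp(-d))

print(via_cdf, direct)
```

The two printed values should agree up to floating-point rounding, because expanding the product $(F_X(b) - F_X(a))(F_Y(d) - F_Y(c))$ gives exactly the four corner terms.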

If you'll forgive the crappy diagram, it's something like this:

[Image: sketch of the rectangle with red, blue, and green regions extending to the left and downward]

Here the total "value" of the rectangular area is equal to the value of the whole red area (stretching to infinity toward the bottom-left), minus the values of the blue and green areas, plus the value of the area where they overlap, because that overlap would otherwise be subtracted twice.
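The same bookkeeping can be sketched by counting sample points instead of measuring probability mass. In the snippet below, the uniform sample on the unit square, the corner values, and the names `whole`, `left`, `below`, and `overlap` are assumptions made for illustration; `left` and `below` stand for the two subtracted strips, whichever colour each has in the diagram.

```python
import random

random.seed(0)  # fixed seed so the run is repeatable

# Hypothetical rectangle corners inside the unit square.
a, b, c, d = 0.2, 0.7, 0.3, 0.9

# Assumed example data: points drawn uniformly from the unit square.
points = [(random.random(), random.random()) for _ in range(10_000)]

def count_below(u, v):
    """Number of points with x <= u and y <= v (an empirical stand-in for F(u, v))."""
    return sum(1 for x, y in points if x <= u and y <= v)

whole = count_below(b, d)    # everything down-left of the top-right corner
left = count_below(a, d)     # region to the left of the rectangle
below = count_below(b, c)    # region below the rectangle
overlap = count_below(a, c)  # part shared by the two subtracted regions

# Direct count of points that land inside the rectangle (a, b] x (c, d].
in_rect = sum(1 for x, y in points if a < x <= b and c < y <= d)

print(in_rect, whole - left - below + overlap)
```

The two printed counts are equal for any sample, not just approximately: each point in the overlap is removed once by `left` and once by `below`, so it has to be added back once.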