Conditional probability where the conditioning variable is continuous


Consider

  • a random variable $Y$ with finite support $\mathcal{Y}$

  • a random variable $X$ whose cdf $G$ is absolutely continuous, with probability density function $g$ (i.e., $X$ is a continuous random variable with support $\mathcal{X}$)

  • all random variables are defined on the probability space $(\Omega, \mathcal{F}, \mathbb{P})$

For some $y\in \mathcal{Y}$, let $$ \mathbb{P}(Y=y| X)\equiv h_y(X) $$ where $h_y: \mathcal{X}\rightarrow [0,1]$

What is the function $$ (y,x)\in\mathcal{Y} \times \mathcal{X} \mapsto h_y(x)* g(x)\in \mathbb{R} $$ ? ($*$ denotes scalar multiplication)

Is it the joint probability density function of $(Y,X)$?


There are 2 best solutions below

1
On BEST ANSWER

There is no (traditional, i.e., non-impulsive) joint density for $(X,Y)$ because $Y$ is a discrete random variable. Nevertheless, the function $d(x,y) = P[Y=y|X=x]f_X(x)$, where $f_X$ is the density of $X$ (called $g$ in the question), can be viewed "operationally" as having the desirable features of a density, with the understanding that we "sum" over the $y$ variable rather than integrate.

  • For example, we "sum out $y$" to get the marginal density for $X$: $$\sum_{y \in \mathcal{Y}} P[Y=y|X=x] f_X(x) = f_X(x)$$ This is distinct from "integrating out" the $y$ variable if there were a traditional density $f_{XY}(x,y)$, i.e., the standard formula $f_X(x) = \int_{y=-\infty}^{\infty} f_{XY}(x,y)dy$.

  • Also note that we can "switch the conditioning" as desired: $$ P[Y=y|X=x]f_X(x) = f_{X|Y}(x|y)P[Y=y]$$

  • Finally, for any measurable set $A \subseteq \mathbb{R}^2$, if we let $1_{\{(x,y) \in A\}}$ be an indicator function that is 1 if $(x,y) \in A$ and 0 otherwise, then $$\boxed{P[(X,Y) \in A] = \sum_{y\in \mathcal{Y}} \int_{x=-\infty}^{\infty} P[Y=y|X=x]f_X(x) 1_{\{(x,y) \in A\}} dx}$$ Indeed, \begin{align} P[(X,Y)\in A] &= \int_{x=-\infty}^{\infty} P[(X,Y)\in A| X=x] f_X(x)dx\\ &=\int_{x=-\infty}^{\infty} P[(x,Y) \in A|X=x]f_X(x)dx\\ &=\int_{x=-\infty}^{\infty} \left(\sum_{y \in \mathcal{Y}} P[Y=y|X=x]1_{\{(x,y)\in A\}}\right) f_X(x)dx \end{align} and we can formally switch the sum and the integral by Fubini-Tonelli, since the integrand is non-negative.
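As a numerical sanity check of the boxed formula, here is a minimal Python sketch. Everything concrete in it is an assumption chosen purely for illustration and is not part of the question: $X$ is standard normal, $Y$ takes values in $\{0,1\}$ with $P[Y=1|X=x]$ given by a logistic curve, and $A = \{(x,y) : x+y>1\}$. The helper names (`g`, `h`, `in_A`, `integrate`) are hypothetical. The sketch compares "sum over $y$, integrate over $x$" against a plain Monte Carlo estimate of $P[(X,Y)\in A]$.

```python
import math
import random

SUPPORT_Y = (0, 1)  # assumed finite support of Y


def g(x):
    """Density of X. Assumption: X is standard normal."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)


def h(y, x):
    """P[Y=y | X=x]. Assumption: a logistic model for Y in {0, 1}."""
    p1 = 1.0 / (1.0 + math.exp(-x))
    return p1 if y == 1 else 1.0 - p1


def in_A(x, y):
    """Indicator of the example set A = {(x, y): x + y > 1}."""
    return x + y > 1.0


def integrate(fn, lo=-10.0, hi=10.0, n=20_000):
    """Midpoint rule on [lo, hi]; the normal tails outside are negligible."""
    step = (hi - lo) / n
    return step * sum(fn(lo + (i + 0.5) * step) for i in range(n))


def prob_A_formula():
    """Boxed formula: sum over y of the integral of h_y(x) g(x) 1_A(x, y) dx."""
    return sum(
        integrate(lambda x, y=y: h(y, x) * g(x) * in_A(x, y))
        for y in SUPPORT_Y
    )


def prob_A_monte_carlo(n=200_000, seed=0):
    """Draw X from g, then Y from P[Y=. | X], and count hits in A."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n):
        x = rng.gauss(0.0, 1.0)
        y = 1 if rng.random() < h(1, x) else 0
        hits += in_A(x, y)
    return hits / n


def marginal_from_sum(x):
    """'Sum out y': should reproduce the marginal density g(x)."""
    return sum(h(y, x) * g(x) for y in SUPPORT_Y)


if __name__ == "__main__":
    print("formula    :", prob_A_formula())
    print("monte carlo:", prob_A_monte_carlo())
    print("marginal check at x=0.3:", marginal_from_sum(0.3), g(0.3))
```

The two estimates should agree up to Monte Carlo noise, and summing out $y$ should return the density of $X$ up to rounding. Note that the code never needs a two-dimensional density: the discrete coordinate is handled by a sum and the continuous one by an integral.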

0
On

This relates to a concept I have thought about a lot; what follows includes some opinions.

Let $f(x,y)=h_y(x)\cdot g(x)$. You would not call $f$ a joint density for $(X,Y)$. That term is reserved for the situation where $(X,Y)$ is absolutely continuous with respect to Lebesgue measure $\lambda$ on the plane. That is, we cannot say that $$ \mathbb P((X,Y)\in A)=\int_A f(x,y)\,d\lambda \tag{1} $$ However, $f$ certainly looks like a density in the following sense. Let $H^1$ denote $1$-dimensional Hausdorff measure on $\mathbb R^2$. Then for any measurable $A$, $$ \mathbb P((X,Y)\in A)=\int_A f(x,y)\,dH^1\tag{2} $$ In the situation (1), the support of $(X,Y)$ is a two-dimensional set, whereas in your situation (2), the support is a one-dimensional set: a finite union of lines $\mathcal X\times\{y\}$ for $y\in \mathcal Y$. In other words, if you generalize the notion of "density" to allow for supports of other dimensions by integrating with respect to Hausdorff measure $H^d$, then you can call your function a (generalized) density. I think that we should adopt this convention, but as far as I know no one does.
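To see why (2) reduces to something familiar, here is a short derivation sketch. It assumes the usual normalization of $H^1$, under which it coincides with length (one-dimensional Lebesgue measure) on each line, and it treats $f$ as defined only on the horizontal lines at heights $y\in\mathcal Y$, so that only the part of $A$ on those lines contributes: \begin{align} \int_A f\,dH^1 &= \sum_{y\in\mathcal Y}\int_{A\cap(\mathbb R\times\{y\})} f\,dH^1\\ &= \sum_{y\in\mathcal Y}\int_{-\infty}^{\infty} h_y(x)\,g(x)\,{\bf 1}((x,y)\in A)\,dx\\ &= \int_{-\infty}^{\infty} \mathbb P((x,Y)\in A\mid X=x)\,g(x)\,dx = \mathbb P((X,Y)\in A) \end{align} The first step splits $A$ along the finitely many lines, the second uses that $H^1$ is length on each line, and the last is the law of total probability after swapping the finite sum with the integral.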

Another example: if $X$ has the Cantor distribution, then the function $f(x)={\bf 1}(x\in C)$, where $C$ is the Cantor set, can be viewed as a $(\log_3 2)$-dimensional density function$^*$ for $X$. We then have $$ P(X\in A)=\int_A f(x)\,dH^{\log_3 2} $$ This illustrates that $X$ is uniformly distributed over the Cantor set, since the density function is constant.

$^*$ Perhaps you need a normalizing constant to make this integrate to $1$.