definition of expectation of $f(X,Y)$ with $X$ continuous and $Y$ discrete (no independence)


Let $X$ and $Y$ be continuous and discrete random variables in $\mathbb{R}$ and $D$, respectively. ($D$ is a finite set.) Let $f: \mathbb{R}\times \mathbb{R} \to \mathbb{R}$. Then what is the definition of $$ \mathbb{E}[f(X,Y)]? $$

First of all, can we define a function $p = p_{X,Y}$ such that $$ \int_{A} \sum_{y \in B} p(x,y) \, dx = \mathbb{P}(X \in A, Y \in B)? $$

If such a $p$ indeed exists, may we define $$ \mathbb{E}[f(X,Y)] = \int_{\mathbb{R}} \sum_{y \in D} f(x,y) p(x,y) \, dx? $$

Best answer:

An unnecessarily long preliminary. The mathematical definition of $\mathbb{E}[Z]$ for a random variable $Z$ on the probability space $(\Omega, \mathcal{F}, \mathbb{P})$ is the integral

$$ \mathbb{E}[Z] := \int_{\Omega} Z(\omega) \, \mathbb{P}(\mathrm{d}\omega) \tag{1}$$

whenever the RHS exists. That is, $\mathbb{E}Z$ is nothing but the integral of the function $Z : \Omega \to \mathbb{R}$ w.r.t. the probability measure $\mathbb{P}$. Contrary to its intimidating appearance, upon ignoring technical details, the RHS is intuitively the Riemann sum

$$ \int_{\Omega} Z(\omega) \, \mathbb{P}(\mathrm{d}\omega) \quad {``}\approx\text{''} \quad \sum_i Z(\omega_i) \, \mathbb{P}(A_i)$$

where the $A_i$ are ''infinitesimal events'' which partition $\Omega$ and $\omega_i \in A_i$ is any sample point in $A_i$. In this way, $\mathbb{E}[Z]$ is just the sum of values of $Z$ weighted by their probabilities, serving as a mathematical notion of average. We also remark that there is a systematic way of turning this intuition into a mathematically rigorous definition, though the price is that you need to be prepared for a certain level of abstraction.
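The Riemann-sum intuition above can be checked numerically. Here is a minimal sketch (the sample space $\Omega = [0,1]$ with the uniform measure and $Z(\omega) = \omega^2$ are my own toy choices, not from the answer), which approximates $\mathbb{E}[Z]$ by $\sum_i Z(\omega_i)\,\mathbb{P}(A_i)$:

```python
# Approximate E[Z] = ∫_Ω Z(ω) P(dω) by the Riemann-type sum Σ_i Z(ω_i) P(A_i).
# Toy setup (assumption for illustration): Ω = [0, 1] with P = uniform
# (Lebesgue) measure and Z(ω) = ω², so the exact value is E[Z] = 1/3.

def expectation_riemann(Z, n=100_000):
    """Partition Ω = [0,1] into n 'infinitesimal events' A_i = ((i-1)/n, i/n]
    and weight a sample value Z(ω_i) by P(A_i) = 1/n."""
    total = 0.0
    for i in range(n):
        omega_i = (i + 0.5) / n          # any sample point in A_i
        total += Z(omega_i) * (1.0 / n)  # Z(ω_i) · P(A_i)
    return total

approx = expectation_riemann(lambda w: w * w)
print(approx)  # close to 1/3
```

As the partition is refined (larger `n`), the weighted sum converges to the integral, which is exactly what the display above expresses.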

The fact that this definition does not assume any specific structure of $Z$ beyond the existence of the RHS is a double-edged sword. On the one hand, it serves as a good starting point for building a general theory. On the other hand, it is not always easy to extract specific information about $\mathbb{E}[Z]$ directly from this definition. So it is often useful to have a concrete expression for $\mathbb{E}[Z]$ when we are given some information about the law of $Z$. Indeed, the change of variables shows that

$$ \mathbb{E}[Z] = \int_{\mathbb{R}} z \, \mathbb{P}(Z \in \mathrm{d}z) \tag{2}$$

where $\mathbb{P}(Z \in \cdot)$ is the law of $Z$. Again, you need not be intimidated by this mysterious formula. It is simply saying that you can choose your partition for the Riemann sum according to the value of $Z$:

$$ \int_{\Omega} Z(\omega) \, \mathbb{P}(\mathrm{d}\omega) \quad \approx \quad \sum_{i} z_i \, \mathbb{P}(Z \in (z_{i-1}, z_i]) \quad \approx \quad \int_{\mathbb{R}} z \, \mathbb{P}(Z \in \mathrm{d}z).$$

The RHS $\text{(2)}$ often reduces to a simpler expression when the law of $Z$ is nice. Here are two major examples:

  • If $Z$ is discrete, then $\mathbb{P}(Z \in \cdot) = \sum_{r} p_r\delta_{r}(\cdot)$, where the sum runs over the (countable) range of $Z$, $p_r = \mathbb{P}(Z = r)$ is the probability that $Z$ takes the value $r$, and $\delta_r$ is the unit mass at $r$. Consequently, $$ \mathbb{E}[Z] = \int_{\mathbb{R}} z\, \left(\sum_{r} p_r \delta_r(\mathrm{d}z) \right) = \sum_r r p_r $$

  • If $Z$ is continuous, then $\mathbb{P}(Z \in \mathrm{d}z) = f(z)\, \mathrm{d}z$ for some function $f : \mathbb{R} \to [0,\infty)$ satisfying the condition $\int_{\mathbb{R}} f(z) \, \mathrm{d}z = 1$. Then $$ \mathbb{E}[Z] = \int_{\mathbb{R}} zf(z) \, \mathrm{d}z. $$
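Both specializations of formula $\text{(2)}$ can be verified on concrete laws. A minimal sketch (the pmf $\{0{:}\,0.2,\ 1{:}\,0.5,\ 2{:}\,0.3\}$ and the $\mathrm{Exp}(1)$ density are my own toy choices):

```python
import math

# Discrete case: Z takes values r with probabilities p_r, so E[Z] = Σ_r r·p_r.
pmf = {0: 0.2, 1: 0.5, 2: 0.3}                    # a made-up law with Σ p_r = 1
e_discrete = sum(r * p for r, p in pmf.items())    # = 0·0.2 + 1·0.5 + 2·0.3 = 1.1

# Continuous case: Z ~ Exp(1) with density f(z) = e^{-z} on [0, ∞),
# so E[Z] = ∫ z f(z) dz = 1.  Midpoint rule on a truncated domain:
def e_continuous(f, lo=0.0, hi=50.0, n=200_000):
    h = (hi - lo) / n
    return sum((lo + (i + 0.5) * h) * f(lo + (i + 0.5) * h) * h
               for i in range(n))

e_exp = e_continuous(lambda z: math.exp(-z))
print(e_discrete, e_exp)  # 1.1 and approximately 1.0
```

The discrete sum is exact; the continuous integral is approximated numerically, with the truncation at $z = 50$ contributing only an exponentially small error.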

All these observations extend to more complicated cases without much hassle. In our case, $Z = f(X, Y)$, and the change of variables gives

$$ \mathbb{E}[f(X, Y)] = \int_{\mathbb{R}^2} f(x, y) \, \mathbb{P}(X \in \mathrm{d}x, Y \in \mathrm{d}y) $$

where $\mathbb{P}(X \in \mathrm{d}x, Y \in \mathrm{d}y)$ is the joint law of $X$ and $Y$. (And we note that this is true for any pair of random variables $(X, Y)$ provided $\mathbb{E}[f(X, Y)]$ exists.)

Answer to the question. Now let us focus on the case where $X$ is continuous and $Y$ is discrete and takes values in a countable set $D$. That is, $\mathbb{P}(Y \in D) = 1$.

Under the setting above, it easily follows that the joint law $\mathbb{P}(X \in \mathrm{d}x, Y \in \mathrm{d}y)$ on $\mathbb{R}^2$ is absolutely continuous w.r.t. the product measure $\mathrm{d}x\otimes\#_{|D}(\mathrm{d}y)$, where $\#_{|D}$ is the counting measure on $D$. Then by the Radon–Nikodym theorem, the joint law has a density $p(x, y)$:

$$\mathbb{P}(X \in \mathrm{d}x, Y \in \mathrm{d}y) = p(x, y) \, \mathrm{d}x\otimes\#_{|D}(\mathrm{d}y) $$

and hence

$$ \mathbb{E}[f(X, Y)] = \int_{\mathbb{R}} \sum_{y \in D} f(x, y) p(x, y) \, \mathrm{d}x. $$

This confirms OP's ansatz on the representation of $\mathbb{E}[f(X, Y)]$.
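To see the final formula in action, here is a worked numeric instance (the specific model is my own assumption, not part of the question): $D = \{0, 1\}$ with $\mathbb{P}(Y=0)=0.6$, $\mathbb{P}(Y=1)=0.4$, and $X \mid Y = y \sim \mathrm{Exp}(1+y)$, so that $p(x,y) = \mathbb{P}(Y=y)\,(1+y)e^{-(1+y)x}$ is the density w.r.t. $\mathrm{d}x \otimes \#_{|D}(\mathrm{d}y)$.

```python
import math

# A worked instance of  E[f(X,Y)] = ∫_R Σ_{y∈D} f(x,y) p(x,y) dx.
# Toy model (an assumption for illustration): D = {0, 1},
# P(Y=0) = 0.6, P(Y=1) = 0.4, and X | Y = y ~ Exp(rate 1 + y), giving
#     p(x, y) = P(Y = y) · (1 + y) · e^{-(1+y)x},   x ≥ 0.

P_Y = {0: 0.6, 1: 0.4}

def p(x, y):
    return P_Y[y] * (1 + y) * math.exp(-(1 + y) * x)

def f(x, y):
    # E[f(X,Y)] = E[X] + E[Y] = (0.6·1 + 0.4·0.5) + 0.4 = 1.2
    return x + y

def expectation(f, lo=0.0, hi=60.0, n=200_000):
    """Sum over y ∈ D, midpoint-rule integral over x on a truncated domain."""
    h = (hi - lo) / n
    total = 0.0
    for i in range(n):
        x = lo + (i + 0.5) * h
        total += sum(f(x, y) * p(x, y) for y in P_Y) * h
    return total

result = expectation(f)
print(result)  # approximately 1.2
```

The inner `sum` over `P_Y` is exactly the $\sum_{y \in D}$ in the formula, while the loop over `i` carries out the $\int_{\mathbb{R}}\,\cdots\,\mathrm{d}x$ numerically; the agreement with the exact value $1.2$ confirms the representation.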