Why can this joint distribution function be written as this integral of a conditional distribution function?


In Section 3.8 of Dependence Modeling with Copulas (Joe), the author starts with the following.

In this section, we show how Sklar's theorem applies to a set of univariate conditional distributions, all conditioned on variables in an index set $S$. Sequential mixtures of conditional distributions lead to the vine pair-copula construction in Section 3.9. Shorthand notation used here includes: $\boldsymbol{x}_J=\left(x_j: j \in J\right)$ where $J$ is a subset of $\{1, \ldots, d\} ;\left(-\infty, \boldsymbol{x}_J\right)=\prod_{j \in J}\left(-\infty, x_j\right) ;\left(-\infty, \boldsymbol{x}_J\right]=\prod_{j \in J}\left(-\infty, x_j\right]$.

Consider $d$ random variables $X_1, \ldots, X_d$ with multivariate distribution $F$. Let $S$ be a non-empty subset of $\{1, \ldots, d\}$, which will be the conditioning set of variables. Let $T$ be a subset of $S^c$ with cardinality of at least two, which will be the conditioned set of variables. With $M=S \cup T$, we can write $$ F_M\left(\boldsymbol{x}_M\right)=\int_{\left(-\infty, \boldsymbol{x}_S\right]} F_{T \mid S}\left(\boldsymbol{x}_T \mid \boldsymbol{y}_S\right) \mathrm{d} F_S\left(\boldsymbol{y}_S\right) $$

I cannot figure out why this equality holds, and any help would be appreciated. Thank you for taking the time to consider my question!

On BEST ANSWER

Since $F_M$ is the joint cumulative distribution function of $\boldsymbol{X}_M$, and $M = S \cup T$ with $S$ and $T$ disjoint, we have

$$\begin{align} F_M\left(\boldsymbol{x}_M\right) & = \mathbb{P}\left(\boldsymbol{X}_M \leq \boldsymbol{x}_M\right) \\ & = \mathbb{P}\left(\boldsymbol{X}_T \leq \boldsymbol{x}_T \mbox{ and } \boldsymbol{X}_S \leq \boldsymbol{x}_S\right) \\ & = \mathbb{P}\left(\boldsymbol{X}_T \leq \boldsymbol{x}_T | \boldsymbol{X}_S \leq \boldsymbol{x}_S\right)\mathbb{P}\left( \boldsymbol{X}_S \leq \boldsymbol{x}_S\right) \\ & = \int_{\left(-\infty, \boldsymbol{x}_S\right]} \mathbb{P}\left(\boldsymbol{X}_T \leq \boldsymbol{x}_T | \boldsymbol{X}_S = \boldsymbol{y}_S\right) \mathrm{d} F_S\left(\boldsymbol{y}_S\right) \\ & = \int_{\left(-\infty, \boldsymbol{x}_S\right]} F_{T \mid S}\left(\boldsymbol{x}_T \mid \boldsymbol{y}_S\right) \mathrm{d} F_S\left(\boldsymbol{y}_S\right) \end{align}$$

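The passage to the integral is the law of total probability (the tower property) applied to the joint probability on the second line. Conditioning on the value of $\boldsymbol{X}_S$ gives $$\mathbb{P}\left(\boldsymbol{X}_T \leq \boldsymbol{x}_T \mbox{ and } \boldsymbol{X}_S \leq \boldsymbol{x}_S\right) = \mathbb{E}\left[\mathbf{1}\{\boldsymbol{X}_S \leq \boldsymbol{x}_S\}\, \mathbb{P}\left(\boldsymbol{X}_T \leq \boldsymbol{x}_T \mid \boldsymbol{X}_S\right)\right] = \int_{\left(-\infty, \boldsymbol{x}_S\right]} \mathbb{P}\left(\boldsymbol{X}_T \leq \boldsymbol{x}_T \mid \boldsymbol{X}_S = \boldsymbol{y}_S\right) \mathrm{d} F_S\left(\boldsymbol{y}_S\right).$$ Note that the conditioning changes from the event $\{\boldsymbol{X}_S \leq \boldsymbol{x}_S\}$ to the value $\boldsymbol{X}_S = \boldsymbol{y}_S$: the integrand is the conditional distribution function $F_{T \mid S}$, not $\mathbb{P}\left(\boldsymbol{X}_T \leq \boldsymbol{x}_T \mid \boldsymbol{X}_S \leq \boldsymbol{x}_S\right)$. The indicator $\mathbf{1}\{\boldsymbol{X}_S \leq \boldsymbol{x}_S\}$ is what restricts the integration to $\left(-\infty, \boldsymbol{x}_S\right]$, so the integration bound follows naturally from the condition $\boldsymbol{X}_S \leq \boldsymbol{x}_S$. To convince yourself, compare with the discrete case, where the integral becomes the sum $\sum_{\boldsymbol{y}_S \leq \boldsymbol{x}_S} \mathbb{P}\left(\boldsymbol{X}_T \leq \boldsymbol{x}_T \mid \boldsymbol{X}_S = \boldsymbol{y}_S\right) \mathbb{P}\left(\boldsymbol{X}_S = \boldsymbol{y}_S\right)$.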
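If it helps, the discrete case can also be checked numerically. The sketch below is only an illustration, not anything from Joe's book: it assumes a made-up joint pmf on three variables with $S=\{1\}$ and $T=\{2,3\}$ (the weights and all function names are hypothetical), and verifies with exact rational arithmetic that the joint CDF equals the sum of $F_{T\mid S}(\boldsymbol{x}_T \mid y)\,\mathbb{P}(X_1 = y)$ over $y \leq x_1$.

```python
from fractions import Fraction
from itertools import product

# Hypothetical toy example: X_1 is the conditioning variable (S = {1}),
# (X_2, X_3) are the conditioned variables (T = {2, 3}); each takes values 0, 1, 2.
support = list(product(range(3), repeat=3))

# Arbitrary strictly positive weights, normalised into a joint pmf.
weights = {pt: 1 + pt[0] + 2 * pt[1] * pt[2] + pt[0] * pt[2] for pt in support}
total = sum(weights.values())
pmf = {pt: Fraction(w, total) for pt, w in weights.items()}


def joint_cdf(x1, x2, x3):
    """F_M(x) = P(X_1 <= x1, X_2 <= x2, X_3 <= x3), summed directly from the pmf."""
    return sum(p for (a, b, c), p in pmf.items() if a <= x1 and b <= x2 and c <= x3)


def marginal_pmf_s(y):
    """P(X_1 = y), the discrete counterpart of dF_S(y)."""
    return sum(p for (a, _, _), p in pmf.items() if a == y)


def conditional_cdf(x2, x3, y):
    """F_{T|S}(x_T | y) = P(X_2 <= x2, X_3 <= x3 | X_1 = y)."""
    numerator = sum(
        p for (a, b, c), p in pmf.items() if a == y and b <= x2 and c <= x3
    )
    return numerator / marginal_pmf_s(y)


def mixture_cdf(x1, x2, x3):
    """Discrete version of the integral: sum over y <= x1 of F_{T|S}(x_T | y) P(X_1 = y)."""
    return sum(
        conditional_cdf(x2, x3, y) * marginal_pmf_s(y) for y in range(3) if y <= x1
    )


def identity_holds():
    """Check F_M(x) == mixture_cdf(x) exactly at every point of the support."""
    return all(joint_cdf(*x) == mixture_cdf(*x) for x in support)


print(identity_holds())
```

Fractions are used instead of floats so that the comparison is exact rather than up to rounding. In the continuous case the sum over $y \leq x_1$ is replaced by the integral over $\left(-\infty, \boldsymbol{x}_S\right]$ with respect to $F_S$.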