Law of total probability in multivariable setting


I am trying to sanity check some calculations and get some clarification. Below I have coded up an example in R to give empirical support for the calculations. The example requires some understanding of the logit transformation and softmax.

Assume the following encodes the joint distribution of sex, smoking and vegetarianism, together with the distribution of some binary outcome $Y$ conditional on those three variables.

library(data.table)
set.seed(1)
d_tru <- CJ(sex = 0:1, smk = 0:1, veg = 0:1)
g <- function(sex, smk, veg){
    -1 + 2 * sex - 1 * smk + 3 * veg +
    -1 * sex * smk - 3 * sex * veg + 1 * smk * veg +
    0.5 * sex * smk * veg
}
# probability of group membership
q <- rnorm(nrow(d_tru))
d_tru[, p_grp := exp(q)/sum(exp(q))]
# probability of outcome
d_tru[, p_y := plogis(g(sex,smk,veg))]
d_tru

   sex smk veg      p_grp       p_y
1:   0   0   0 0.04225019 0.2689414
2:   0   0   1 0.09498377 0.8807971
3:   0   1   0 0.03427561 0.1192029
4:   0   1   1 0.38968687 0.8807971
5:   1   0   0 0.10989996 0.7310586
6:   1   0   1 0.03479920 0.7310586
7:   1   1   0 0.12870098 0.2689414
8:   1   1   1 0.16540341 0.6224593

The variables p_grp and p_y characterise the probability of group membership and the probability of the outcome within each group, i.e. $Pr(sex,smk,veg)$ and $Pr(y = 1 | sex,smk,veg)$ respectively.
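As a quick illustration of how the two columns combine (assuming p_y is read as the conditional probability $Pr(y = 1 | sex, smk, veg)$), the joint probability for the first row of the table, using rounded values, would be:

$$ Pr(y = 1, sex = 0, smk = 0, veg = 0) = Pr(y = 1 | sex = 0, smk = 0, veg = 0) \, Pr(sex = 0, smk = 0, veg = 0) \approx 0.2689 \times 0.0423 \approx 0.0114 $$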

Say we are interested in $\mathbb{E}[Y | sex]$, which for a binary outcome equals $Pr(y = 1 | sex)$. One way to get to this is via the law of total probability.

$$ \begin{aligned} Pr(y | sex = s) &= Pr(y | sex = s, smk = 0, veg = 0) Pr(smk = 0, veg = 0|sex = s) + \\ & \quad Pr(y | sex = s, smk = 0, veg = 1) Pr(smk = 0, veg = 1|sex = s) + \\ & \quad Pr(y | sex = s, smk = 1, veg = 0) Pr(smk = 1, veg = 0|sex = s) + \\ & \quad Pr(y | sex = s, smk = 1, veg = 1) Pr(smk = 1, veg = 1|sex = s) \end{aligned} $$

Operationalised, based on the known distributions, we get:

c("Pr(Y | sex = 0)" = 
    sum(d_tru[sex == 0, p_y] * d_tru[sex == 0, p_grp / sum(p_grp)] ), 
"Pr(Y | sex = 1)" = 
    sum(d_tru[sex == 1, p_y] * d_tru[sex == 1, p_grp / sum(p_grp)] ))

Pr(Y | sex = 0) Pr(Y | sex = 1) 
  0.7882179       0.5545841
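To make the weights explicit, here is the sex = 0 case worked by hand from the rounded table values. The term p_grp / sum(p_grp) within a sex stratum is the conditional probability of each (smk, veg) combination:

$$ \begin{aligned} Pr(smk, veg | sex = 0) &= \frac{Pr(sex = 0, smk, veg)}{\sum_{smk', veg'} Pr(sex = 0, smk', veg')} \\ Pr(sex = 0) &\approx 0.0423 + 0.0950 + 0.0343 + 0.3897 \approx 0.561 \\ Pr(y = 1 | sex = 0) &\approx \frac{0.2689 (0.0423) + 0.8808 (0.0950) + 0.1192 (0.0343) + 0.8808 (0.3897)}{0.561} \approx \frac{0.442}{0.561} \approx 0.788 \end{aligned} $$

Up to rounding, this agrees with the first value returned by the code.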

which can be cross checked empirically from simulated data and appears to be correct:

n <- 1e7
i <- sample(1:nrow(d_tru), n, replace = TRUE, prob = d_tru$p_grp)
d <- d_tru[i]
d[, y := rbinom(.N, 1, p_y)]

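# Empirical LOTP: stratum-specific means of y, weighted by the empirical
# Pr(smk, veg | sex). keyby keeps both vectors in the same (smk, veg) order.
c("Pr(Y | sex = 0)" = 
    sum(d[sex == 0, .(p_y = mean(y)), keyby = .(smk, veg)]$p_y * 
    d[sex == 0, .(p_grp = .N/nrow(d[sex == 0])), keyby = .(smk, veg)]$p_grp ), 
 "Pr(Y | sex = 1)" = 
    sum(d[sex == 1, .(p_y = mean(y)), keyby = .(smk, veg)]$p_y * 
    d[sex == 1, .(p_grp = .N/nrow(d[sex == 1])), keyby = .(smk, veg)]$p_grp)
)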
Pr(Y | sex = 0) Pr(Y | sex = 1) 
  0.7881758       0.5541419 

My understanding (possibly incorrect) is that we can restate the above approach in terms of expectations by multiplying both sides by $y$ and summing over $y$:

$$ \begin{aligned} \mathbb{E}(Y|sex=s) &= \mathbb{E}(Y|sex = s, smk = 0, veg = 0) Pr(smk = 0, veg = 0|sex = s) + \\ & \quad \mathbb{E}(Y|sex = s, smk = 0, veg = 1) Pr(smk = 0, veg = 1|sex = s) + \\ & \quad \mathbb{E}(Y|sex = s, smk = 1, veg = 0) Pr(smk = 1, veg = 0|sex = s) + \\ & \quad \mathbb{E}(Y|sex = s, smk = 1, veg = 1) Pr(smk = 1, veg = 1|sex = s) \\ &= \sum_{smk,veg} \mathbb{E}(Y|sex = s, smk, veg) Pr(smk, veg|sex = s) \\ &= \sum_{smk,veg} \mathbb{E}(Y|sex = s, smk, veg) Pr(smk|veg, sex = s) Pr(veg | sex = s) \end{aligned} $$

since $Pr(smk, veg|sex = s) = Pr(smk|veg, sex = s) Pr(veg | sex = s)$.
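As a rough numerical check of this factorisation, the sketch below rebuilds the same table and compares a two-step calculation (average over smk within each sex and veg stratum, then over veg within sex) against the one-step weighted sum. The names two_step, one_step, e_y and p_sv are my own and purely illustrative.

```r
library(data.table)
set.seed(1)

# Rebuild the true joint distribution exactly as defined earlier
d_tru <- CJ(sex = 0:1, smk = 0:1, veg = 0:1)
g <- function(sex, smk, veg){
    -1 + 2 * sex - 1 * smk + 3 * veg +
    -1 * sex * smk - 3 * sex * veg + 1 * smk * veg +
    0.5 * sex * smk * veg
}
q <- rnorm(nrow(d_tru))
d_tru[, p_grp := exp(q)/sum(exp(q))]
d_tru[, p_y := plogis(g(sex, smk, veg))]

# Inner step: E[Y | sex, veg] = sum over smk of E[Y | sex, smk, veg] Pr(smk | veg, sex)
# p_sv holds Pr(sex, veg), needed for the outer weights
inner <- d_tru[, .(e_y = sum(p_y * p_grp / sum(p_grp)), p_sv = sum(p_grp)),
               keyby = .(sex, veg)]

# Outer step: E[Y | sex] = sum over veg of E[Y | sex, veg] Pr(veg | sex)
two_step <- inner[, .(e_y = sum(e_y * p_sv / sum(p_sv))), keyby = sex]

# One-step version: sum over (smk, veg) of E[Y | sex, smk, veg] Pr(smk, veg | sex)
one_step <- d_tru[, .(e_y = sum(p_y * p_grp / sum(p_grp))), keyby = sex]

# TRUE if the two routes agree up to floating point tolerance
all.equal(two_step$e_y, one_step$e_y)
```

If the factorisation is right, both routes should agree with the values computed from the known distributions (roughly 0.788 and 0.555).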

I am aware that $\mathbb{E}[Y|A] = \mathbb{E}[\mathbb{E}[Y|A,B]|A]$ and also that $\mathbb{E}[Y|A,B] = \mathbb{E}[\mathbb{E}[Y|A,B,C]|A,B]$. I am unclear whether the expression that I have built using the law of total probability can be put into an analogous law of total expectation (LOTE) form. Any comments on errors in my understanding, and clarification of the restatement in terms of LOTE, would be greatly appreciated.
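For concreteness, my tentative guess at the analogous form, taking $A = sex$, $B = veg$ and $C = smk$ in the identities above, is the following (a sketch of what I think the restatement looks like, not something I have seen confirmed):

$$ \begin{aligned} \mathbb{E}[Y | sex = s] &= \mathbb{E}\big[ \mathbb{E}[Y | sex, smk, veg] \,\big|\, sex = s \big] \\ &= \mathbb{E}\Big[ \mathbb{E}\big[ \mathbb{E}[Y | sex, smk, veg] \,\big|\, sex, veg \big] \,\Big|\, sex = s \Big] \end{aligned} $$

Here the outer expectation in the first line would be taken over $Pr(smk, veg | sex = s)$. In the second line the middle expectation would be over $Pr(smk | veg, sex = s)$ and the outer one over $Pr(veg | sex = s)$.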