Conditional Second Moments of Multivariate Normal Variable on Binary Vectors


Suppose we observe a binary table $Y \in \{0,1\}^{N \times G}$, corresponding to $N$ observations of $G$-dimensional binary vectors $Y_1, \cdots, Y_N$. We imagine each vector $Y_i$ is generated from an unobserved multivariate normal vector $W_i \in \mathbb R^{K}$ through the following process:

  1. $W_i \sim N(0, I_K)$, where $I_K$ is the $K$-dimensional identity matrix.
  2. $U_i = BW_i$, where $B \in \mathbb R^{G \times K}$ is known, so $U_i$ is $G$-dimensional.
  3. $Y_i \sim \text{Bernoulli}(\sigma(U_i))$, where $\sigma(x)= \frac{1}{1+e^{-x}}$ is the sigmoid (logistic) function applied to $U_i$ component-wise.
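To make the setup concrete, here is a minimal simulation sketch of the three steps above in Python with NumPy. The function name `simulate`, the sizes, and the random loading matrix `B` are hypothetical and chosen only for illustration.

```python
import numpy as np

def simulate(B, N, seed=0):
    """Draw (W, Y) from the three-step process for a given G x K matrix B."""
    rng = np.random.default_rng(seed)
    G, K = B.shape
    W = rng.standard_normal((N, K))            # step 1: W_i ~ N(0, I_K), one row per observation
    U = W @ B.T                                # step 2: U_i = B W_i, stacked into an N x G matrix
    P = 1.0 / (1.0 + np.exp(-U))               # component-wise sigmoid
    Y = (rng.random((N, G)) < P).astype(int)   # step 3: Y_ig ~ Bernoulli(sigma(U_ig))
    return W, Y

# Hypothetical loadings and sizes (G = 6, K = 2, N = 500), for illustration only.
B = np.random.default_rng(1).normal(size=(6, 2))
W, Y = simulate(B, N=500)
```

Only `Y` and `B` would be available in practice; `W` is returned here so that any estimate of its conditional moments can be checked against the simulated truth.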

Now denote $W = [W_1, \cdots, W_N]^{T} \in \mathbb R^{N \times K}$, the matrix whose $i$-th row is $W_i^{T}$. I am interested in computing the conditional second moments of $W$ given $Y$, in particular: $$\mathbb E[W^{T}W \mid Y]$$

Here is my current approach:

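To carry out the computation, I think it would be easier to write $$\mathbb E[W^{T}W \mid Y] = \mathbb E[W \mid Y]^{T}\, \mathbb E[W \mid Y] + \sum_{i=1}^{N} \text{Cov}(W_i \mid Y_i),$$ which follows from $W^{T}W = \sum_{i} W_i W_i^{T}$ and the fact that, given $Y$, each $W_i$ depends only on $Y_i$.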

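  • To compute $E[W|Y]$, we just need to run a series of penalized (ridge) logistic regressions, one per observation, as $B$ is known. Strictly speaking this gives the posterior mode of each $W_i$, which I would use as an approximation to its posterior mean.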
  • However, I find it difficult to compute the conditional covariance of $W_i$ given $Y_i$. If $Y_i$ were also normal, $W_i$ and $Y_i$ would be jointly Gaussian, and we could use the conditional covariance formula for Gaussians. Is there an easy way to compute this quantity when $Y_i$ is discrete?
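For reference, one approach I have considered is a Laplace approximation. It is not exact: it replaces the posterior of each $W_i$ with a Gaussian centred at the posterior mode, with covariance $(I_K + B^{T} D_i B)^{-1}$, where $D_i = \text{diag}\big(p_i(1-p_i)\big)$ and $p_i = \sigma(B\hat w_i)$ at the mode $\hat w_i$. The sketch below is only an illustration under that assumption. The function names are hypothetical, and the mode is found with plain Newton steps without a line search, which may need safeguarding for large entries of $B$.

```python
import numpy as np

def laplace_posterior(B, y, n_iter=50, tol=1e-10):
    """Laplace approximation to p(W_i | Y_i = y) under the prior W_i ~ N(0, I_K).

    Returns the posterior mode (an approximate mean) and the inverse of the
    negative Hessian at the mode (an approximate covariance).
    """
    G, K = B.shape
    w = np.zeros(K)
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-(B @ w)))
        grad = B.T @ (y - p) - w                              # gradient of the log posterior
        H = np.eye(K) + B.T @ (B * (p * (1 - p))[:, None])    # negative Hessian: I + B^T D B
        step = np.linalg.solve(H, grad)                       # plain Newton step, no line search
        w = w + step
        if np.max(np.abs(step)) < tol:
            break
    p = 1.0 / (1.0 + np.exp(-(B @ w)))
    H = np.eye(K) + B.T @ (B * (p * (1 - p))[:, None])
    return w, np.linalg.inv(H)

def approx_second_moment(B, Y):
    """Approximate E[W^T W | Y] as sum_i (m_i m_i^T + C_i) over the rows of Y."""
    K = B.shape[1]
    total = np.zeros((K, K))
    for y in Y:
        m, C = laplace_posterior(B, y)
        total += np.outer(m, m) + C
    return total
```

As a sanity check, when $B = 0$ the data carry no information about $W_i$, so the approximation returns the prior (mode $0$, covariance $I_K$) and the result is $N \cdot I_K$. What I do not know is how accurate this is for general $B$, or whether something closer to exact is feasible.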

Thank you so much and happy holidays!