Computing the expectation of a random variable by marginalizing over the conditional expectation, and proving convergence with the LLN


I am trying to estimate the expectation of $X$ by marginalizing the conditional expectation $E(X|Y)$ over $Y$, and to prove, using the law of large numbers (LLN), that this estimate converges to the true expectation of $X$ as the sample size approaches infinity. Notation:

  • $X_1, X_2, \ldots, X_n$ are the observed values of $X$.
  • $I(Y_i = y)$ is an indicator variable, equal to 1 if $Y_i = y$ and 0 otherwise, so that $Count(Y=y) = \sum_{i=1}^{n} I(Y_i = y)$.
  • $E(X | Y = y)$ is the true conditional expectation which is constant.
  • $\hat E(X | Y = y)$ and $\hat E(X)$ are the estimated expectations from the sample.

Here is the proof: $\hat{E}(X) = \sum_{y} P(Y=y) \cdot \hat{E}(X | Y = y)$
$= \sum_{y} \frac{Count(Y=y)}{n} \cdot \sum_{x} x \, P(X=x | Y = y)$
$= \sum_{y} \frac{Count(Y=y)}{n} \cdot \frac{\sum_{i=1}^{n} X_i \cdot I(Y_i = y)}{\sum_{i=1}^{n} I(Y_i = y)}$
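As a sanity check on the last expression (not part of the proof), here is a minimal Python sketch that evaluates it on simulated data. The toy model ($Y$ uniform on $\{0, 1, 2\}$, $X = Y + $ Gaussian noise) and the function name `marginalized_mean` are my own assumptions for illustration only.

```python
import random
from collections import defaultdict

def marginalized_mean(xs, ys):
    """Sum over y of (Count(Y=y)/n) * (average of the X_i with Y_i = y)."""
    n = len(xs)
    sums = defaultdict(float)   # sum_i X_i * I(Y_i = y)
    counts = defaultdict(int)   # sum_i I(Y_i = y), i.e. Count(Y=y)
    for x, y in zip(xs, ys):
        sums[y] += x
        counts[y] += 1
    return sum((counts[y] / n) * (sums[y] / counts[y]) for y in counts)

# Hypothetical toy data: Y uniform on {0, 1, 2}, X = Y + standard normal noise.
rng = random.Random(0)
ys = [rng.choice([0, 1, 2]) for _ in range(1000)]
xs = [y + rng.gauss(0.0, 1.0) for y in ys]

estimate = marginalized_mean(xs, ys)
sample_mean = sum(xs) / len(xs)
print(estimate, sample_mean)
```

In this sketch the two printed numbers agree up to floating-point rounding, which is what I would expect if the $Count(Y=y)$ factors cancel against the denominators.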
Apply the LLN to the marginalized estimator for $E(X)$:
$\lim_{n \to \infty} \hat{E}(X) = \sum_{y} \lim_{n \to \infty} P(Y=y) \cdot \hat{E}(X | Y = y)$
$\lim_{n \to \infty} \hat{E}(X) = \sum_{y} \lim_{n \to \infty} \frac{Count(Y=y)}{n} \cdot \frac{\sum_{i=1}^{n} X_i \cdot I(Y_i = y)}{\sum_{i=1}^{n} I(Y_i = y)}$

  • The LLN states that, for i.i.d. $X_i$ with finite mean, $\lim_{n \to \infty} \frac{1}{n} \sum_{i=1}^n X_i = E(X)$ (in probability, or almost surely for the strong version).

  • Apply this to our case:
    $ \lim_{n \to \infty} \hat{E}(X | Y = y) = \lim_{n \to \infty} \frac{\sum_{i=1}^n X_i \cdot I(Y_i = y)}{\sum_{i=1}^n I(Y_i = y)} $

  • We can rewrite the numerator and denominator as:
    $ \frac{\lim_{n \to \infty} \sum_{i=1}^n X_i \cdot I(Y_i = y)}{\lim_{n \to \infty} \sum_{i=1}^n I(Y_i = y)}$

  • Since $E(X | Y = y)$ is constant, we can pull it out of the sum (I am not sure whether this is mathematically correct):
    $ = E(X | Y = y) \cdot \frac{\lim_{n \to \infty} \sum_{i=1}^n I(Y_i = y)}{\lim_{n \to \infty} \sum_{i=1}^n I(Y_i = y)} $

  • The ratio of the sums becomes 1, and we're left with:
    $ = E(X | Y = y) $

Thus, the sample mean estimator for $E(X | Y = y)$ converges to the true conditional expectation $E(X | Y = y)$ as the sample size ($n$) approaches infinity.

Similarly for $ \lim_{n \to \infty} P(Y=y) = \lim_{n \to \infty} \frac{Count(Y=y)}{n}$
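To see numerically what the two limits above are claiming, here is a rough simulation sketch. It is only an empirical check under an assumed toy model ($Y$ uniform on $\{0, 1, 2\}$ and $X = Y + $ standard normal noise, so that $P(Y=y) = 1/3$, $E(X | Y = y) = y$ and $E(X) = 1$); the helper name `freqs_and_conditional_means` is hypothetical.

```python
import random
from collections import defaultdict

def freqs_and_conditional_means(xs, ys):
    """Return Count(Y=y)/n and the sample conditional mean of X for each y."""
    n = len(xs)
    sums, counts = defaultdict(float), defaultdict(int)
    for x, y in zip(xs, ys):
        sums[y] += x
        counts[y] += 1
    freqs = {y: counts[y] / n for y in counts}
    means = {y: sums[y] / counts[y] for y in counts}
    return freqs, means

rng = random.Random(1)
results = {}
for n in (100, 10_000, 200_000):
    # Assumed toy model: Y uniform on {0, 1, 2}, X = Y + N(0, 1) noise.
    ys = [rng.choice([0, 1, 2]) for _ in range(n)]
    xs = [y + rng.gauss(0.0, 1.0) for y in ys]
    freqs, means = freqs_and_conditional_means(xs, ys)
    # Marginalized estimate: sum over y of frequency times conditional mean.
    results[n] = sum(freqs[y] * means[y] for y in freqs)
    print(n, results[n], freqs, means)
```

Under these assumptions, the printed frequencies, conditional means, and marginalized estimate should drift toward $1/3$, $y$, and $1$ respectively as $n$ grows. This illustrates the claim but does not replace the proof.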

My question is: is the proof correct? Can I marginalize to obtain the expectation using the law of large numbers?