Given a bivariate Bernoulli with an integral as a parameter, prove that the components are marginally identically distributed and the correlation is positive


So this is a question from a past exam. The joint density function is

\begin{equation} P\left(X_1=x_1, X_2=x_2\right) = I_{\{0,1\}}\left(x_1\right) I_{\{0,1\}}\left(x_2\right) \int_{-\infty}^{\infty} \theta^{x_1+x_2}(1-\theta)^{2-x_1-x_2} f(\theta)\, d\theta \end{equation}

The exercise also states that $\int_{\mathbb{R}} f(\theta) d \theta = \int^{1}_{0} f(\theta) d \theta = 1$. I should mention that I don't know how to use this fact properly.

The first exercise is to show that $X_1$ and $X_2$ are identically distributed. I think I've managed to answer it correctly.

We can show this by marginalizing with respect to the other variable:

\begin{equation} \begin{aligned} P\left(X_1=1\right) & =P\left(X_1=1, X_2=0\right)+P\left(X_1=1, X_2=1\right)=\int_{-\infty}^{\infty} \theta(1-\theta) f(\theta) d \theta+\int_{-\infty}^{\infty} \theta^2 f(\theta) d \theta \\ & =\int_{\mathbb{R}}\left[\left(\theta-\theta^2\right) f(\theta)+\theta^2 f(\theta)\right] d \theta=\int_{\mathbb{R}} \theta f(\theta) d \theta \\ P\left(X_2=1\right) & =P\left(X_1=0, X_2=1\right)+P\left(X_1=1, X_2=1\right)=\int_{\mathbb{R}} \theta(1-\theta) f(\theta) d \theta+\int_{\mathbb{R}} \theta^2 f(\theta) d \theta \\ & =\int_{\mathbb{R}} \theta f(\theta) d \theta \end{aligned} \end{equation}

And since each $X_i$ takes values in $\{0,1\}$, we can conclude that both are $\operatorname{Ber}(p)$ with $p = \int_{\mathbb{R}} \theta f(\theta)\, d\theta$, hence identically distributed.
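As a numerical sanity check (not part of the exercise), we can pick a concrete mixing density supported on $(0,1)$ — here $f(\theta) = 12\,\theta(1-\theta)^2$, the Beta(2, 3) density, chosen purely for illustration — and verify that both marginals equal $\int_0^1 \theta f(\theta)\, d\theta$:

```python
def f(t):
    return 12 * t * (1 - t) ** 2  # Beta(2, 3) density; E[theta] = 2/5

N = 10_000
h = 1.0 / N
grid = [(i + 0.5) * h for i in range(N)]  # midpoint rule on (0, 1)

def joint(x1, x2):
    # P(X1=x1, X2=x2) = integral of theta^(x1+x2) (1-theta)^(2-x1-x2) f(theta)
    return h * sum(t ** (x1 + x2) * (1 - t) ** (2 - x1 - x2) * f(t) for t in grid)

p_x1 = joint(1, 0) + joint(1, 1)           # P(X1 = 1)
p_x2 = joint(0, 1) + joint(1, 1)           # P(X2 = 1)
mean_theta = h * sum(t * f(t) for t in grid)  # integral of theta * f(theta)

print(p_x1, p_x2, mean_theta)  # all three ~ 0.4
```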

The second exercise is to show that their correlation is greater than 0.

My attempt is as follows:

Since the standard deviations are positive (for non-degenerate marginals), the correlation has the same sign as the covariance, so it suffices to prove that the covariance is positive.

\begin{equation} \begin{aligned} \operatorname{Cov}\left(X_1, X_2\right) & =E\left(X_1 X_2\right)-E\left(X_1\right) E\left(X_2\right) =E\left(X_1 X_2\right)-p^2 \\ E\left(X_1 X_2\right) & =P\left(X_1=1, X_2=1\right)=\int_{-\infty}^{\infty} \theta^2 f(\theta)\, d\theta \\ \operatorname{Cov}\left(X_1, X_2\right) & =\int_{-\infty}^{\infty} \theta^2 f(\theta)\, d\theta-\left[\int_{-\infty}^{\infty} \theta f(\theta)\, d\theta\right]^2 > 0 \end{aligned} \end{equation}

This means I should prove this last inequality, presumably using the mentioned fact about $f(\theta)$, but I can't find a way of isolating it. I also think that Cauchy–Schwarz is an idea, but I couldn't make it boil down to something that makes sense for the exercise.
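Edit: I think the Cauchy–Schwarz idea can be made to work after all, and this is exactly where the stated fact $\int_0^1 f(\theta)\, d\theta = 1$ enters (a sketch of the step I was missing, splitting $\theta f = \theta\sqrt{f}\cdot\sqrt{f}$):

$$ \left[\int_0^1 \theta f(\theta)\, d\theta\right]^2=\left[\int_0^1 \theta \sqrt{f(\theta)} \cdot \sqrt{f(\theta)}\, d\theta\right]^2 \leq \int_0^1 \theta^2 f(\theta)\, d\theta \cdot \int_0^1 f(\theta)\, d\theta=\int_0^1 \theta^2 f(\theta)\, d\theta, $$

with equality only when $\theta \sqrt{f(\theta)}$ is proportional to $\sqrt{f(\theta)}$, i.e., only when $f$ concentrates at a single point.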

Accepted answer:

You should first notice that

$$ \theta^{x_1+x_2}(1-\theta)^{2-x_1-x_2} = \theta^{x_1}(1-\theta)^{1-x_1} \theta^{x_2}(1-\theta)^{1-x_2} $$

and, by the law of total probability (treating $\theta$ as a random variable with density $f$), we can write the joint mass function as

$$ P(X_1,X_2) = \int P(X_1,X_2 | \theta) f(\theta) d\theta = \int P(X_1 | \theta) P(X_2 | \theta) f(\theta) d\theta $$

where $P(X_i|\theta) = \theta^{x_i}(1-\theta)^{1-x_i}$, which (for $x_i \in \{0,1\}$) is a Bernoulli mass function. That means that $X_1$ and $X_2$ are Bernoulli when conditioned on $\theta$, and they are independent when conditioned on $\theta$.

The first property ($X_1$ and $X_2$ are identically distributed) is immediate from the symmetry of the joint mass function in $x_1$ and $x_2$. The second is not obvious. But (again, in a Bayesian setting, i.e., treating $\theta$ as a random variable with density $f(\theta)$) we can write

$$E[X_1 ] = E[E[X_1 | \theta]] = E[\theta] $$

and

$$E[X_1 X_2 ] = E[E[X_1 X_2| \theta]] = E[\theta^2] $$

We know that, in general, $E[Z^2] \ge E[Z]^2$, with equality only when $Z$ is almost surely constant.

Hence $E[X_1 X_2 ] > E[X_1] E[X_2]$ whenever $f(\theta)$ is non-degenerate, and the covariance (hence the correlation) is positive.
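A quick simulation of the hierarchical model makes this concrete (a sketch; the choice of Beta(2, 3) for $f$ is mine, the exercise leaves $f$ general):

```python
import random

# Simulate theta ~ f, then X1, X2 | theta independent Bernoulli(theta).
# f = Beta(2, 3) is an arbitrary illustrative choice: E[theta] = 0.4, Var(theta) = 0.04.
random.seed(0)
n = 200_000
x1s, x2s = [], []
for _ in range(n):
    theta = random.betavariate(2, 3)
    x1s.append(1 if random.random() < theta else 0)
    x2s.append(1 if random.random() < theta else 0)

m1 = sum(x1s) / n                                          # ~ E[theta] = 0.4
m2 = sum(x2s) / n                                          # ~ E[theta] = 0.4
cov = sum(a * b for a, b in zip(x1s, x2s)) / n - m1 * m2   # ~ Var(theta) = 0.04
print(m1, m2, cov)
```

The covariance estimate comes out close to $\operatorname{Var}(\theta)$, positive exactly because the mixing density is non-degenerate.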

The moral of the exercise is that two variables can be independent conditional on a third one, yet dependent unconditionally.

(It can also happen that the variables are negatively correlated conditionally but positively correlated marginally; cf. Simpson's paradox.)