Determining the covariance


Let $X$ be a random variable that is uniform on $[0, 1]$ and let $X_k$ for $k = 1, 2, \ldots, n$ be a random sample of $X$. Define the random variables $Y$ and $Z$ as follows: $Y$ is the number of $i$ such that $X_i \geq \frac{1}{2}$, and $Z$ is the number of $i$ such that $X_i \geq \frac{3}{4}$. Determine $\operatorname{cov}(Y, Z)$. Are $Y$ and $Z$ independent? Why or why not?

I know that the formula is $cov(Y, Z) = E(YZ) - E(Y)E(Z)$ but I don't know how to find these expected values.

I have no idea how to approach this problem. I have spent hours doing research but I cannot find anything useful. Any assistance is much appreciated.


There are 3 solutions below.


$Y\sim \mathrm{Bin}(n, 1/2)$ and $Z\sim \mathrm{Bin}(n, 1/4)$.

They are NOT independent, and this can easily be proved by observing, for example, that

$$P(Y=n)=\frac{1}{2^n}$$

but

$$P(Y=n|Z=n)=1$$

To calculate $E(YZ)$ you can use the law of total expectation.
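The non-independence claim above is easy to check numerically. The sketch below (a Monte Carlo check, not part of the original answer) estimates $P(Y=n)$ and $P(Y=n \mid Z=n)$ for a small $n$, where the variable names are my own:

```python
import numpy as np

rng = np.random.default_rng(0)
n, trials = 4, 200_000  # small n so the event Y = n is not vanishingly rare

# Each row is one realization of the sample X_1, ..., X_n
X = rng.uniform(size=(trials, n))
Y = (X >= 0.5).sum(axis=1)   # count of samples >= 1/2
Z = (X >= 0.75).sum(axis=1)  # count of samples >= 3/4

p_Y_n = (Y == n).mean()                    # should be near 1/2^n = 0.0625
p_Y_n_given_Z_n = (Y[Z == n] == n).mean()  # Z = n forces Y = n, so exactly 1
print(p_Y_n, p_Y_n_given_Z_n)
```

Since every sample counted in $Z$ is also counted in $Y$, conditioning on $Z = n$ makes $Y = n$ certain, while unconditionally $P(Y=n) = 1/2^n$.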


The first step is to compute the distributions of the random variables $Y$ and $Z$. Let's start with $Y$. Since $X_k \sim U[0,1]$, we have $P(X_k \geq 1/2) = 0.5$ and $P(X_k < 1/2) = 0.5$. The event $Y = m$ occurs when exactly $m$ of the $n$ samples $\{X_k\}$ are at least $0.5$. This is the classical binomial distribution.

$ P(Y=m) = {n \choose m} 0.5^m \; 0.5^{n-m} = {n \choose m} 0.5^n $

You can use standard formulae to conclude that $\mathbb{E}(Y) = 0.5n$. By the same token, $Z$ is also binomially distributed and $\mathbb{E}(Z) = 0.25n$.

Now, let's define a random variable $W = YZ$. To compute $\mathbb{E}(W)$, it helps to express:

$Y = \sum_{i=1}^n A_i$ and $Z = \sum_{i=1}^n B_i$, where $\{A_i\}$ and $\{B_i\}$ are indicator random variables. Specifically, $A_i = 1$ if $X_i \geq 1/2$ and $A_i = 0$ otherwise. Similarly, $B_i = 1$ if $X_i \geq 3/4$ and $B_i = 0$ otherwise.

Now, $W = \sum_{i=1}^n A_i \sum_{j=1}^n B_j = \sum_{i=1}^n A_iB_i + \sum_{i \neq j} A_iB_j$

Taking expectation: $\mathbb{E}(W) = \sum_{i=1}^n \mathbb{E}(A_i B_i) + \sum_{i \neq j}\mathbb{E}(A_i) \mathbb{E}(B_j)$.

Note that $A_i$ and $B_j$ are independent for $i \neq j$, but obviously $A_i$ and $B_i$ are not independent.

Now, $A_iB_i = 1$ if $A_i = 1$ and $B_i = 1$ (and $0$ otherwise), which happens only when $X_i \geq 3/4$. This happens with probability $1/4$. Thus, $\mathbb{E}(A_i B_i) = 1/4$.

Also, note that $\mathbb{E}(A_i) \mathbb{E}(B_j) = 1/2 \times 1/4 = 1/8$.

Putting all of this together, you can compute $\mathbb{E}(W)$, and eventually $Cov(Y, Z) = \mathbb{E}(W) - \mathbb{E}(Y) \mathbb{E}(Z)$.
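Carrying out that computation gives $\mathbb{E}(W) = n \cdot \frac{1}{4} + n(n-1) \cdot \frac{1}{8}$ and hence $\mathrm{Cov}(Y,Z) = n/8$. The sketch below (my own check, not part of the original answer) compares this closed form against a sample covariance:

```python
import numpy as np

rng = np.random.default_rng(1)
n, trials = 10, 500_000

X = rng.uniform(size=(trials, n))
Y = (X >= 0.5).sum(axis=1)   # count of samples >= 1/2
Z = (X >= 0.75).sum(axis=1)  # count of samples >= 3/4

# E(W) = sum of n "diagonal" terms E(A_i B_i) = 1/4 plus n(n-1) cross terms = 1/8
E_W = n * (1/4) + n * (n - 1) * (1/8)
cov_theory = E_W - (n / 2) * (n / 4)  # = n/8
cov_sim = np.cov(Y, Z)[0, 1]
print(cov_theory, cov_sim)
```

For $n = 10$ this gives $\mathrm{Cov}(Y,Z) = 10/8 = 1.25$, and the simulated covariance should agree to within sampling error.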


$Y$ is the count of samples that exceed $1/2$, and $Z$ the count that exceed $3/4$.

They are not independent because $Y \geq Z$ holds with certainty (any sample counted towards $Z$ also counts towards $Y$).

Indeed, $Y-Z$ is the count of samples that exceed $1/2$ but do not exceed $3/4$.

We shall use indicator random variables for the events of samples occurring in these mutually exclusive ranges. Let us define for all $i$: $U_i=\mathbf 1_{1/2\lt X_i\leq 3/4}$ and $V_i=\mathbf 1_{3/4\lt X_i}$. Since $(X_i)_{i=1}^n$ is a sequence of iid uniformly distributed random variables, $(U_i)_{i=1}^n$ is a sequence of iid Bernoulli random variables, and $(V_i)_{i=1}^n$ likewise.

$$U_i\sim\mathcal{Bern}(1/4)\\V_i\sim\mathcal{Bern}(1/4)$$

Now for all $i\neq j$ we have $V_i\perp V_j$, $U_i\perp U_j$, and $U_i\perp V_j$; because the samples are independent, so are these indicators. On the other hand, we are certain that $U_iV_i=0$, since the indicated events are disjoint for any particular sample.

Finally, the purpose of this is that: $Z=\sum_{i=1}^n V_i$ and $Y-Z=\sum_{i=1}^n U_i$.

$$\begin{align}\mathsf{Cov}(Y,Z)~&=~\mathsf{Cov}(Z,Z)+\mathsf{Cov}(Y-Z,Z)\\&=~\mathsf{Var}(\sum_{i=1}^n V_i)+\mathsf{Cov}(\sum_{i=1}^n U_i,\sum_{j=1}^nV_j)\\&~~\vdots\end{align}$$
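Filling in the elided steps (the cross terms with $i \neq j$ vanish by the independence noted above, and $U_iV_i = 0$):

$$\begin{align}\mathsf{Cov}(Y,Z)~&=~\mathsf{Var}\Big(\sum_{i=1}^n V_i\Big)+\sum_{i=1}^n\mathsf{Cov}(U_i,V_i)\\&=~n\cdot\frac14\cdot\frac34+n\big(\mathsf E(U_iV_i)-\mathsf E(U_i)\,\mathsf E(V_i)\big)\\&=~\frac{3n}{16}+n\Big(0-\frac14\cdot\frac14\Big)\\&=~\frac{3n}{16}-\frac{n}{16}~=~\frac{n}{8}\end{align}$$

This agrees with the binomial-indicator computation in the previous answer.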