Dependence of a r.v. $Z_{i,j} = X_{i} + Y_{j}$ with $X$ and $Y$ independent

Say we draw a random sample of size $N$ of iid r.v.s $X_1,\dots,X_N$ and a random sample of size $M$ of iid r.v.s $Y_1,\dots,Y_M$. What is the level of dependence (say, measured by Pearson's correlation) among the random variables $Z_{k} = Z_{i,j} = X_{i} + Y_{j}$ with $i=1,\dots,N$, $j=1,\dots,M$ and $k=(i-1)\cdot M+j \in \{1,\dots,M\cdot N\}$ if $X$ and $Y$ are independent? What would be the distribution of $Z$ if, say, $X$ and $Y$ were sampled from a normal distribution? (I assume that the $Z_k$ constructed in this way are not independent, right?)

My feeling is that $\mathsf{Corr}(Z_{i,j},Z_{k,l})= P(i = k) + P(j = l) = \frac{1}{N} +\frac{1}{M}$, but I am not sure whether this is correct or how to prove it.

Added after comments

In the above I assume that $$\mathsf{Corr}(Z_{i,j},Z_{k,l}) = \mathsf{Corr}(Z_{i,j},Z_{i,l})\cdot P(X_i=X_k) + \mathsf{Corr}(Z_{i,j},Z_{k,j})\cdot P(Y_j=Y_l)+ \mathsf{Corr}(Z_{i,j},Z_{k,l})\cdot P(Y_j \neq Y_l \text{ and } X_i \neq X_k)$$

with $\mathsf{Corr}(Z_{i,j},Z_{i,l}) = \mathsf{Corr}(Z_{i,j},Z_{k,j}) = 1$, but I am not sure if one can decompose correlation in that way. I might be missing the cross term in the above, i.e. the one corresponding to $P(Y_j = Y_l \text{ and } X_i = X_k)$.

There is 1 best solution below

When $i,j,k,l$ are mutually distinct indices, $X_i, Y_j, X_k, Y_l$ are mutually independent random variables and thus $Z_{i,j},Z_{k,l}$ are uncorrelated.†   So, as you said: $$\mathsf{Corr}(Z_{i,j}, Z_{k,l})=0$$

(† $\tiny\text{you can show this by the bilinearity of covariance: } \mathsf{Cov}(U+V,X+Y)=\mathsf{Cov}(U,X)+\mathsf{Cov}(U,Y)+\mathsf{Cov}(V,X)+\mathsf{Cov}(V,Y)$)

Because $X_h\perp Y_h$ (the two samples are independent of each other), you can also show that, for mutually distinct $i,j,k,l$: $$\begin{align}0 & = \mathsf {Corr}(Z_{i,j},Z_{k,i})\\ &=\mathsf {Corr}(Z_{i,j},Z_{j,l})\\&=\mathsf {Corr}(Z_{i,j},Z_{j,i})\end{align}$$

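Consider now $Z_{i,j},Z_{k,j}$ with $i\neq k$, using the bilinearity of covariance, the identical distribution within each of the two samples, and the independence between them. $$\begin{align}\mathsf {Corr}(Z_{i,j},Z_{k,j}) & =\dfrac{\mathsf {Cov}(X_i+Y_j,X_k+Y_j)}{\sqrt{\mathsf{Var}(X_i+Y_j)\mathsf{Var}(X_k+Y_j)}}\\[2ex]& = \dfrac{\mathsf{Cov}(X_i,X_k)+\mathsf{Cov}(X_i,Y_j)+\mathsf{Cov}(Y_j,X_k)+\mathsf{Var}(Y_j)}{\sqrt{\big(\mathsf{Var}(X_i)+\mathsf{Var}(Y_j)\big)\big(\mathsf{Var}(X_k)+\mathsf{Var}(Y_j)\big)}}\\[2ex]& = \dfrac{\mathsf{Var}(Y_j)}{\mathsf{Var}(X_i)+\mathsf{Var}(Y_j)}\end{align}$$ The three covariance terms vanish by independence, and $\mathsf{Var}(X_k)=\mathsf{Var}(X_i)$ by identical distribution.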

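And similarly, for $j\neq l$ the shared term is $X_i$, so you have $\mathsf {Corr}(Z_{i,j},Z_{i,l}) = \dfrac{\mathsf{Var}(X_i)}{\mathsf{Var}(X_i)+\mathsf{Var}(Y_j)}$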

Finally, is $\mathsf {Corr}(Z_{i,j},Z_{i,j})$ not obvious?
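
If you want to sanity-check the cases above numerically, here is a minimal Monte Carlo sketch. It assumes normal $X$ and $Y$ with illustrative standard deviations $\sigma_X=1$ and $\sigma_Y=2$ (my choice, not part of the question), and the name `simulate_corrs` is hypothetical. Under those assumptions bilinearity gives $\mathsf{Var}(Y)/(\mathsf{Var}(X)+\mathsf{Var}(Y)) = 0.8$ for a shared $Y_j$, $\mathsf{Var}(X)/(\mathsf{Var}(X)+\mathsf{Var}(Y)) = 0.2$ for a shared $X_i$, and $0$ when no index is shared.

```python
import numpy as np


def simulate_corrs(sigma_x=1.0, sigma_y=2.0, reps=200_000, seed=0):
    """Estimate Corr(Z_ij, Z_kj), Corr(Z_ij, Z_il) and Corr(Z_ij, Z_kl).

    Assumption: X and Y are zero-mean normal with the given standard
    deviations. Each replication draws two X's (X_i, X_k) and two Y's
    (Y_j, Y_l), all mutually independent.
    """
    rng = np.random.default_rng(seed)
    x = rng.normal(0.0, sigma_x, size=(reps, 2))  # columns: X_i, X_k
    y = rng.normal(0.0, sigma_y, size=(reps, 2))  # columns: Y_j, Y_l

    z_ij = x[:, 0] + y[:, 0]
    z_kj = x[:, 1] + y[:, 0]  # shares Y_j with Z_ij
    z_il = x[:, 0] + y[:, 1]  # shares X_i with Z_ij
    z_kl = x[:, 1] + y[:, 1]  # shares nothing with Z_ij

    def corr(a, b):
        # Sample Pearson correlation between two 1-D arrays
        return np.corrcoef(a, b)[0, 1]

    return {
        "shared_Y": corr(z_ij, z_kj),
        "shared_X": corr(z_ij, z_il),
        "disjoint": corr(z_ij, z_kl),
    }


result = simulate_corrs()
for name, value in result.items():
    print(f"{name}: {value:.3f}")
```

The estimates should land close to the three values above. Note that neither shared-index correlation equals $1$ unless the other sample has zero variance, which is why the decomposition in the question does not go through as written.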