Given a known Pearson correlation between $X$ and $Y$, and a perfect rank correlation between $Y$ and $Z$, what is cor(X,Z)?


This is a problem that has arisen in building a simulation model. I want to control the marginal distributions of two random variables $Z$ and $X$, and separately control their (positive) correlation.

A practical solution I've been using is to introduce a third variable $Y$ that has a known positive Pearson correlation with $X$, defined as follows:

$$Y = X + \epsilon$$

where $\epsilon$ is Gaussian noise with variance $\sigma^2_\epsilon$, chosen to give a pre-specified Pearson correlation between $Y$ and $X$.
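For reference, the required noise variance can be written out explicitly if one assumes that $\epsilon$ has zero mean and is independent of $X$ (an assumption, not stated above). Writing $\sigma_X^2$ for the variance of $X$ and $\rho$ for the target Pearson correlation (both symbols introduced here for illustration):

$$\operatorname{cor}(X,Y) = \frac{\operatorname{Cov}(X, X+\epsilon)}{\sigma_X\sqrt{\sigma_X^2+\sigma^2_\epsilon}} = \frac{\sigma_X}{\sqrt{\sigma_X^2+\sigma^2_\epsilon}} = \rho \quad\Longrightarrow\quad \sigma^2_\epsilon = \sigma_X^2\left(\frac{1}{\rho^2}-1\right)$$

For example, with $\sigma_X = 1$ and $\rho = 0.6$ this gives $\sigma^2_\epsilon \approx 1.78$.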

Then, I create $Z$ with a perfect positive rank correlation with $Y$ by defining $Z$ as:

$$Z = F_z^{-1}(F_y(Y))$$

where $F_z^{-1}$ is the inverse cumulative distribution function (quantile function) of the distribution chosen for $Z$, and $F_y$ is the cumulative distribution function of $Y$.
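The procedure described so far can be sketched in a few lines. This is only an illustration in Python (not NetLogo), under several assumptions that are not part of the question: $X$ is standard normal, the noise is independent of $X$, the target distribution for $Z$ is exponential with rate 1, and $F_y$ is replaced by the empirical CDF computed from ranks. The function names (`simulate`, `ranks`, `pearson`, `spearman`) are hypothetical helpers written for this sketch.

```python
import math
import random


def pearson(a, b):
    """Sample Pearson correlation of two equal-length sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((p - mean_a) * (q - mean_b) for p, q in zip(a, b))
    var_a = sum((p - mean_a) ** 2 for p in a)
    var_b = sum((q - mean_b) ** 2 for q in b)
    return cov / math.sqrt(var_a * var_b)


def ranks(values):
    """Ranks 1..n of the values (ties are not handled; continuous data assumed)."""
    order = sorted(range(len(values)), key=values.__getitem__)
    result = [0] * len(values)
    for rank, index in enumerate(order, start=1):
        result[index] = rank
    return result


def spearman(a, b):
    """Spearman correlation, computed as the Pearson correlation of the ranks."""
    return pearson(ranks(a), ranks(b))


def simulate(n=20000, rho=0.6, seed=1):
    """Draw X, build Y = X + eps, then map Y to Z through F_z^{-1}(F_y(Y))."""
    rng = random.Random(seed)
    # Assumption: X is standard normal, so Var(X) = 1.
    x = [rng.gauss(0.0, 1.0) for _ in range(n)]
    # Noise s.d. giving cor(X, Y) = rho when eps is independent of X and Var(X) = 1.
    sigma_eps = math.sqrt(1.0 / rho ** 2 - 1.0)
    y = [xi + rng.gauss(0.0, sigma_eps) for xi in x]
    # Empirical stand-in for F_y: rank / (n + 1) lies strictly inside (0, 1).
    u = [r / (n + 1) for r in ranks(y)]
    # Assumption: Z ~ Exponential(1), whose inverse CDF is -ln(1 - u).
    z = [-math.log(1.0 - ui) for ui in u]
    return x, y, z


if __name__ == "__main__":
    x, y, z = simulate()
    print("Pearson(X, Y): ", pearson(x, y))
    print("Spearman(Y, Z):", spearman(y, z))
    print("Spearman(X, Y):", spearman(x, y))
    print("Spearman(X, Z):", spearman(x, z))
```

Using ranks for $F_y$ avoids needing a closed form for the distribution of $Y$; if $F_y$ is known analytically it could be substituted directly.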

This seems to 'work', in that it allows me to control the marginal distributions of $X$ and $Z$ as needed in the simulation, and to manipulate the association between them. I can see this 'working' visually, in terms of the scatter plot of both variables in the simulation.

What I'd like to know is whether the rank correlation between $X$ and $Z$ has a straightforward analytical expression under this procedure.

I'm not a mathematician, so apologies if the notation is clumsy. I'm sure there are more elegant solutions, although I'm working in a language with limited mathematical primitives (NetLogo).

Thank you in advance for any advice about this.

Edit: If it is any help, numerically the Spearman correlation between $X$ and $Z$ appears to be very close to the Pearson correlation between $Y$ and $X$, although I don't know whether the two are equal in expectation. I'd like to be able to show this analytically.
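One observation that may help frame the question, offered as a sketch rather than a full answer: if $F_y$ is continuous and $F_z^{-1}$ is strictly increasing, then $Z$ is a strictly increasing function of $Y$. Spearman's correlation (written $\rho_S$ here, a symbol introduced for illustration) depends only on ranks, so it is unchanged by strictly increasing transformations of either variable:

$$\rho_S(X, Z) = \rho_S(X, Y)$$

In the special case where $X$ is itself Gaussian and $\epsilon$ is independent of $X$ (both assumptions; the question does not fix the distribution of $X$), the pair $(X, Y)$ is bivariate normal with Pearson correlation $\rho$, and the standard result for the bivariate normal is

$$\rho_S(X, Y) = \frac{6}{\pi}\arcsin\left(\frac{\rho}{2}\right)$$

which for $\rho = 0.6$ gives roughly $0.58$. That would be consistent with the two values being close but not identical. For a non-Gaussian $X$ this closed form need not apply.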