Distribution of the difference of correlated Gumbel


Consider two random variables $X$ and $Y$, both distributed as a Gumbel with location 0 and scale 1.

Let $Z\equiv X-Y$.

We know that if the two variables are independent, then $Z$ is Logistic with location 0 and scale 1. Hence,

$$ \Pr(Z\leq z)=\frac{1}{1+\exp(-z)} $$
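The independent case can be checked with a quick simulation. The following is only a minimal sketch using the Python standard library; the helper names (`sample_gumbel`, `empirical_cdf_of_difference`) are made up for illustration, and the sample size and seed are arbitrary choices.

```python
import math
import random


def sample_gumbel(rng):
    """Draw one Gumbel(0, 1) variate by inverting the CDF exp(-exp(-x))."""
    u = rng.random()
    while u == 0.0:  # random() can return 0.0, which would break log()
        u = rng.random()
    return -math.log(-math.log(u))


def logistic_cdf(z):
    """CDF of the Logistic(0, 1) distribution."""
    return 1.0 / (1.0 + math.exp(-z))


def empirical_cdf_of_difference(z, n=200_000, seed=0):
    """Estimate Pr(X - Y <= z) for independent Gumbel(0, 1) X and Y."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n):
        if sample_gumbel(rng) - sample_gumbel(rng) <= z:
            hits += 1
    return hits / n


if __name__ == "__main__":
    # Print the simulated and the closed-form value side by side.
    for z in (-2.0, 0.0, 1.0):
        print(z, empirical_cdf_of_difference(z), logistic_cdf(z))
```

The two printed columns should agree up to Monte Carlo noise, which is on the order of $1/\sqrt{n}$.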

Suppose now that $X$ and $Y$ are correlated with correlation parameter $\rho$. Can we still write down a closed-form expression for $\Pr(Z\leq z)$?


There are 2 best solutions below

4
On BEST ANSWER

Do you mean that we only know that $X$ and $Y$ are both Gumbel(0,1), possibly dependent, and that their correlation coefficient is $\rho$?

In that case the answer is no, because those things do not uniquely determine the joint distribution of $X$ and $Y$ (nor that of $X-Y$). You can have different dependence structures that have the same $\rho$.

The following figure shows simulations from two different joint distributions. The top row has $X$ and $Y$ independent. The bottom row has them dependent but still $\rho=0$. The marginal distributions of $X$ and $Y$ are Gumbel(0,1) in both cases (second and third panels from left). The distributions of $X-Y$ are quite different (rightmost panels).

[Figure: Two different Gumbel-Gumbel joint distributions]

The dependent distribution on the bottom row is a mixture of two distributions $F_1$ and $F_2$:

  • The first has $X \sim \text{Gumbel}(0,1)$ and $Y=X$, so its correlation is $+1$. (Visually, here $X,Y$ are on the diagonal.)
  • The second distribution takes independent $X,Y$, then rejects the points where $(X-m)(Y-m)$ is positive, where $m = -\ln\ln 2 \approx 0.3665$ is the median of Gumbel(0,1). This second distribution has (empirically) correlation $\approx -0.56965$. (Visually, here $X,Y$ are in the left-upper and right-lower quadrants around the median point $(m,m)$.)

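In both cases the marginals are still Gumbel(0,1): for $F_1$ this is immediate, and for $F_2$ each point is kept with probability $1/2$ whatever the value of $X$ (or of $Y$), because $m$ is the median. Since both components have the same marginals, the correlation of a mixture is the weighted average of the component correlations. With a suitable mixture $F = \alpha F_1 + (1-\alpha)F_2$, where $\alpha \approx 0.3629$, we get zero correlation.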
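The construction can be reproduced with a short simulation. This is a rough sketch under the description given here, not the code used for the figure; all function names are hypothetical, and the value $\alpha = 0.3629$ is simply taken from the text.

```python
import math
import random

MEDIAN = -math.log(math.log(2.0))  # median of Gumbel(0, 1), about 0.3665


def sample_gumbel(rng):
    """Draw one Gumbel(0, 1) variate by inverting the CDF."""
    u = rng.random()
    while u == 0.0:  # avoid log(0)
        u = rng.random()
    return -math.log(-math.log(u))


def sample_f1(rng):
    """F1: Y = X, so the pair lies on the diagonal (correlation +1)."""
    x = sample_gumbel(rng)
    return x, x


def sample_f2(rng):
    """F2: independent draws, rejecting pairs with (X - m)(Y - m) > 0."""
    while True:
        x, y = sample_gumbel(rng), sample_gumbel(rng)
        if (x - MEDIAN) * (y - MEDIAN) <= 0.0:
            return x, y


def sample_mixture(rng, alpha):
    """Draw from F = alpha * F1 + (1 - alpha) * F2."""
    return sample_f1(rng) if rng.random() < alpha else sample_f2(rng)


def correlation(pairs):
    """Sample Pearson correlation of a list of (x, y) pairs."""
    n = len(pairs)
    mx = sum(x for x, _ in pairs) / n
    my = sum(y for _, y in pairs) / n
    sxy = sum((x - mx) * (y - my) for x, y in pairs)
    sxx = sum((x - mx) ** 2 for x, _ in pairs)
    syy = sum((y - my) ** 2 for _, y in pairs)
    return sxy / math.sqrt(sxx * syy)


if __name__ == "__main__":
    rng = random.Random(0)
    f2_pairs = [sample_f2(rng) for _ in range(100_000)]
    mix_pairs = [sample_mixture(rng, 0.3629) for _ in range(100_000)]
    print("correlation under F2:     ", correlation(f2_pairs))
    print("correlation under mixture:", correlation(mix_pairs))
```

The weight follows from solving $\alpha \cdot 1 + (1-\alpha)\rho_2 = 0$ for $\alpha$, where $\rho_2$ is the correlation under $F_2$.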

One could also generate other mixtures of $F_1$, $F_2$ and the independent distribution, so $\rho$ can be varied, and different joint distributions can be generated for the same $\rho$. So in the end, to determine the distribution of $Z=X-Y$ one needs more information about the joint distribution than just the correlation.

0
On

This answer adds to the excellent answer by @JukkaKohonen, by focusing on the information required to find the distribution of the difference.

The probability density of the difference of two continuous random variables can be calculated as an integral of their joint density $f_{X,Y}$ (when one exists):

$$f_Z(z) = \int_{-\infty}^{\infty} f_{X,Y}(x,\, x-z)\,dx = \int_{-\infty}^{\infty} f_{X,Y}(z+y,\, y)\,dy $$

So, for a given $z$, the density of $Z$ is obtained by integrating the 2D joint density along the 45-degree line $y = x - z$, where $z$ sets the offset of that line. In theory, one does not need to know the whole 2D joint distribution, only its integrals along these lines. However, this is quite a specific requirement. Maybe in your application this specific information is easier to obtain than the joint distribution, but in general it could be just as hard as finding the 2D distribution and calculating the integral.
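As an illustration, the integral along the diagonal line can be approximated numerically once a joint density is available. The sketch below assumes the independent case, where the joint density is the product of two Gumbel(0,1) densities, and compares the result with the Logistic(0,1) density. The function names, the integration range and the step count are illustrative choices, not part of the question.

```python
import math


def gumbel_pdf(x):
    """Density of Gumbel(0, 1): exp(-(x + exp(-x)))."""
    return math.exp(-(x + math.exp(-x)))


def independent_joint_pdf(x, y):
    """Joint density of independent Gumbel(0, 1) X and Y (an assumption)."""
    return gumbel_pdf(x) * gumbel_pdf(y)


def logistic_pdf(z):
    """Density of Logistic(0, 1)."""
    e = math.exp(-z)
    return e / (1.0 + e) ** 2


def difference_pdf(joint_pdf, z, lo=-20.0, hi=40.0, steps=6000):
    """Trapezoid approximation of the integral of joint_pdf(y + z, y) over y.

    The range [lo, hi] is a truncation chosen for Gumbel(0, 1) tails and
    moderate |z|; other joint densities may need a different range.
    """
    h = (hi - lo) / steps
    total = 0.5 * (joint_pdf(lo + z, lo) + joint_pdf(hi + z, hi))
    for i in range(1, steps):
        y = lo + i * h
        total += joint_pdf(y + z, y)
    return total * h


if __name__ == "__main__":
    # Numerical integral next to the closed-form logistic density.
    for z in (-1.5, 0.0, 2.0):
        print(z, difference_pdf(independent_joint_pdf, z), logistic_pdf(z))
```

Swapping `independent_joint_pdf` for another joint density with Gumbel(0,1) marginals would give the density of $Z$ for that dependence structure, provided the joint density exists.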