What is the distribution of distances between points defined by a multivariate Gaussian?


Assume points x and y in N-dimensional Euclidean space (1 to 7 dimensions in my application), with coordinate values xj, yj, j=1:N.

The locations of x and y are described by multivariate normal distributions, with known mus and sigmas. We can assume the covariances are 0 if that helps, but I would prefer not to.

The question is, what is the distribution of distances between points x and y?

That is, if I were to sample x and y from their respective distributions over and over, what would the distribution of distances be?

I can do this via simulation or integration, but speed of calculation is important, so I'm hoping there is a simpler solution (something like a gamma).

A bit more detail:

I'm looking for a solution when distance is defined in two ways:

dxy = w1 |x1 - y1| + w2 |x2 - y2| + ... + wN |xN - yN|

or

dxy = sqrt(w1 (x1 - y1)^2 + w2 (x2 - y2)^2 + ... + wN (xN - yN)^2)

where 0<=wj<=1 and sum(wj) = 1. Note the addition of the w's to the standard distance equations. These are meant to "stretch" or "shrink" the space along dimensions.
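For concreteness, the two weighted distances above could be written as the following minimal sketch. The function names `weighted_l1` and `weighted_l2` are hypothetical, chosen only for illustration, and the inputs are assumed to be plain sequences of equal length.

```python
import math

def weighted_l1(x, y, w):
    """Weighted city-block distance: sum_j w_j * |x_j - y_j|."""
    return sum(w_j * abs(x_j - y_j) for x_j, y_j, w_j in zip(x, y, w))

def weighted_l2(x, y, w):
    """Weighted Euclidean distance: sqrt(sum_j w_j * (x_j - y_j)^2)."""
    return math.sqrt(sum(w_j * (x_j - y_j) ** 2 for x_j, y_j, w_j in zip(x, y, w)))

# Example with two dimensions and equal weights that sum to 1.
x = [0.0, 0.0]
y = [3.0, 4.0]
w = [0.5, 0.5]
print(weighted_l1(x, y, w), weighted_l2(x, y, w))
```

Note that the weights sit inside the square root in the second definition, so each squared difference is scaled by wj rather than by wj^2.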

The full problem has an additional step. That is, I would like to transform this distance via sxy = exp(-c dxy), where c is a parameter. So, ideally, I'd love to know how this exponential is distributed.

Thank you in advance!


There is 1 best solution below


(Note: I will only refer to the Euclidean distance.) Name your points $\textbf{X}=[X_1, X_2, ..., X_n]$ and $\textbf{Y}=[Y_1, Y_2, ..., Y_n]$. For every $i=1,...,n$, the difference $(X_i-Y_i)$ is a Gaussian random variable, provided $X_i$ and $Y_i$ are independent or jointly Gaussian. Since the weights $(w_i)_{i=1}^{n}$ are deterministic, the random variable $Z_i=\sqrt{w_i}(X_i-Y_i)$ is also Gaussian for every $i=1,...,n$.

If all $(Z_i)_{i=1}^{n}$ are independent and identically distributed with zero mean and common variance $\sigma^2$, then the sum $$\sum_{i=1}^{n}Z_i^2=\sum_{i=1}^{n}w_i(X_i-Y_i)^2$$ follows a Gamma distribution with shape $n/2$ and scale $2\sigma^2$ (it is $\sigma^2$ times a chi-squared variable with $n$ degrees of freedom).

The random variable $D$ (short for distance), which is your interest, is then the square root of a Gamma variable: $$D=\sqrt{\sum_{i=1}^{n}w_i(X_i-Y_i)^2}$$ This follows the Nakagami distribution with parameters $m=n/2$ and $\Omega=n\sigma^2$; see the question "Square root of a Gamma distribution" for more. If the means of $X_i$ and $Y_i$ differ, or the $Z_i$ have unequal variances, this result does not apply directly.
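As a rough sanity check on the Nakagami claim, the sketch below simulates the weighted Euclidean distance and compares two sample moments with the Nakagami values. It assumes the special case where $Z_i=\sqrt{w_i}(X_i-Y_i)$ are independent, zero-mean Gaussians with a common variance $\sigma^2$ (equal weights, equal means for $X_i$ and $Y_i$, no covariances). The function names and the parameter values are hypothetical choices for illustration, and it uses only the Python standard library.

```python
import math
import random

def nakagami_mean(m, omega):
    """Mean of a Nakagami(m, omega) variable: Gamma(m + 1/2) / Gamma(m) * sqrt(omega / m)."""
    return math.gamma(m + 0.5) / math.gamma(m) * math.sqrt(omega / m)

def simulate_distances(mu, sd_x, sd_y, w, n_samples, seed=0):
    """Draw weighted Euclidean distances between X and Y.

    Assumption: X_j ~ N(mu_j, sd_x^2) and Y_j ~ N(mu_j, sd_y^2), all independent,
    so each difference X_j - Y_j has zero mean and variance sd_x^2 + sd_y^2.
    """
    rng = random.Random(seed)
    distances = []
    for _ in range(n_samples):
        total = 0.0
        for mu_j, w_j in zip(mu, w):
            x_j = rng.gauss(mu_j, sd_x)
            y_j = rng.gauss(mu_j, sd_y)
            total += w_j * (x_j - y_j) ** 2
        distances.append(math.sqrt(total))
    return distances

n = 3
w = [1.0 / n] * n                        # equal weights summing to 1
sd_x, sd_y = 1.0, 1.0
sigma2 = w[0] * (sd_x ** 2 + sd_y ** 2)  # common variance of each Z_i
m, omega = n / 2, n * sigma2             # Nakagami shape and spread

d = simulate_distances([0.5, -1.0, 2.0], sd_x, sd_y, w, n_samples=20000)
sample_mean = sum(d) / len(d)
sample_mean_sq = sum(v * v for v in d) / len(d)

print("mean of D:   simulated", sample_mean, "Nakagami", nakagami_mean(m, omega))
print("mean of D^2: simulated", sample_mean_sq, "Nakagami", omega)
```

Matching two moments does not prove the distributions agree; it only flags gross mismatches. With unequal weights or different means the simulated moments should be expected to drift away from these values.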