Measuring the similarity between distance vectors


I am trying to measure the correlation between a probability distribution and a scalar value. For instance, I have the following:

| Vector of values | Corresponding scalar |
|------------------|----------------------|
| Vec 1            | Scalar 1             |
| Vec 2            | Scalar 2             |
| ...              | ...                  |
| Vec n            | Scalar n             |

What I currently do is calculate the Wasserstein distance between each pair of vectors and the difference between each pair of corresponding scalars, so the result is:

| Wasserstein distance (x, y) | Scalar difference (x, y) |
|-----------------------------|--------------------------|
| wass(vec 1, vec 2)          | scalar 1 - scalar 2      |
| wass(vec 1, vec 3)          | scalar 1 - scalar 3      |
| ...                         | ...                      |
| wass(vec n-1, vec n)        | scalar n-1 - scalar n    |
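A minimal sketch of this pairing step, using only the Python standard library. The helper names (`wasserstein_1d`, `pairwise_table`) and the sample data are hypothetical, and the vectors are assumed to be plain 1-D samples with equal weights. The distance is computed as the area between the two empirical CDFs, which is why the vectors may have different lengths; for unweighted 1-D samples it should agree with `scipy.stats.wasserstein_distance`.

```python
from itertools import combinations


def wasserstein_1d(u, v):
    """1-Wasserstein distance between two 1-D empirical samples.

    Computed as the area between the two empirical CDFs, so the
    samples do not need to have the same length.
    """
    u, v = sorted(u), sorted(v)
    points = sorted(set(u) | set(v))
    iu = iv = 0
    area = 0.0
    for left, right in zip(points, points[1:]):
        # Count how many values of each sample are <= left.
        while iu < len(u) and u[iu] <= left:
            iu += 1
        while iv < len(v) and v[iv] <= left:
            iv += 1
        # |F_u - F_v| is constant on [left, right).
        area += abs(iu / len(u) - iv / len(v)) * (right - left)
    return area


def pairwise_table(vectors, scalars, absolute=False):
    """Return one (distance, scalar difference) row per unordered pair."""
    rows = []
    for i, j in combinations(range(len(vectors)), 2):
        diff = scalars[i] - scalars[j]
        if absolute:
            # Assumption: an unsigned difference, to match an unsigned distance.
            diff = abs(diff)
        rows.append((wasserstein_1d(vectors[i], vectors[j]), diff))
    return rows


# Hypothetical data: three vectors of unequal length and their scalars.
vectors = [[0.0, 1.0, 3.0], [5.0, 6.0, 8.0], [0.0, 1.0]]
scalars = [2.0, 7.5, 1.0]
rows = pairwise_table(vectors, scalars)
for distance, diff in rows:
    print(distance, diff)
```

The `absolute` flag is an illustrative design choice, not part of the procedure described above: the Wasserstein distance is symmetric and non-negative, whereas the sign of `scalar i - scalar j` depends on the order in which the pair is taken. With `absolute=False` the rows match the signed differences in the table.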

Then I calculate the correlation between these two series (the Wasserstein distances and the scalar differences). I use five different correlation measures to capture both linear and non-linear relationships:

  1. Pearson's correlation coefficient
  2. Kendall's tau
  3. Spearman's rank correlation
  4. Distance correlation
  5. Maximal information coefficient (MIC)
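To make the first three measures concrete, here is a minimal standard-library sketch. The function names and the two short input lists are hypothetical, and Kendall's tau is computed in its tau-b form, which is an assumption since the variant is not specified above. In practice `scipy.stats.pearsonr`, `scipy.stats.spearmanr`, and `scipy.stats.kendalltau` would normally be used instead; distance correlation and MIC are left out because they generally require a third-party package.

```python
from math import sqrt


def pearson(x, y):
    """Pearson's linear correlation coefficient."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def ranks(values):
    """1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda k: values[k])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            result[order[k]] = (i + j) / 2 + 1
        i = j + 1
    return result


def spearman(x, y):
    """Spearman's rho: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


def kendall_tau_b(x, y):
    """Kendall's tau-b, counting concordant and discordant pairs."""
    concordant = discordant = ties_x = ties_y = 0
    n = len(x)
    for i in range(n):
        for j in range(i + 1, n):
            dx, dy = x[i] - x[j], y[i] - y[j]
            if dx == 0 and dy == 0:
                continue  # tied in both series: counted in neither tie total
            if dx == 0:
                ties_x += 1
            elif dy == 0:
                ties_y += 1
            elif dx * dy > 0:
                concordant += 1
            else:
                discordant += 1
    pairs = concordant + discordant
    return (concordant - discordant) / sqrt((pairs + ties_x) * (pairs + ties_y))


# Hypothetical series standing in for the distances and the differences.
wass_values = [1.0, 2.0, 3.0, 4.0]
diff_values = [1.5, 1.0, 3.5, 5.0]
print("Pearson :", pearson(wass_values, diff_values))
print("Spearman:", spearman(wass_values, diff_values))
print("Kendall :", kendall_tau_b(wass_values, diff_values))
```

Pearson measures linear association only, while Spearman and Kendall depend only on the ordering of the values, so they also pick up monotonic but non-linear relationships.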

The results suggest some correlation between the Wasserstein distances and the corresponding scalar differences.

My question is: Is this correlation procedure legitimate? Are there any references I can look into that use the same concept?

Edit: Regression is not possible in this context because the vectors are not all the same size, so the data is deficient. Additionally, I don't have enough samples (vectors) to train a regression model.

That's why I am trying to avoid regression in this context.