How to compare distances measured in spaces of different dimensionality

1k Views Asked by At

I would like to know if there is any way to compare distances measured in spaces of different dimensionality. Say we have two points in $\mathbb{R}^N$, two other points in $\mathbb{R}^M$ with $M \neq N$.

How can I say if the two points in $\mathbb{R}^N$ are closer than the two points in $\mathbb{R}^M$?

Taking a particular example, using euclidean distance alone seems "unfair" given that points in higher dimension will tend to have greater distances between them. Maybe that's just the way it is, but I wanted to know if one could somehow normalize for the dimensions or something.

Edit1: I add this here, it's in the comments, but to make it more visible.

I have a concrete application: I am working with Convolutional Neural Networks of varying architectures for image classification. At the end of the convolutional layers, there is a Mx1 (or Nx1, or (the corresponding dimension)x1) vector that in some way is an internal representation of the image for the neural net in a high dimensional space before the final classification. I want to feed the same images to different architectures and compare the distance in both representations. For example given an image of a Dog and another of a Cat, what's the distance between "cat" and "dog" in the different spaces generated by the different cnn?
I tried to keep it general in the original question, but that is what I am working on.

Thanks a lot in advance!

3

There are 3 best solutions below

8
On

Suppose $M<N$. Then you can embed $\mathbb{R}^M$ into $\mathbb{R}^N$ and any distance that you measured between points $p,q$ in $\mathbb{R}^M$ will be exactly the same distance you will measure between the the correspondiing embedded points in $\mathbb{R}^N$. More precisely, there is a (natural) isometric embedding $i:\mathbb{R}^M\to\mathbb{R}^N$ such that $d(p,q)=d(i(p),i(q))$ for every $p,q\in \mathbb{R}^M$. (Here $d$ is the Euclidean distance).

Therefore, the points in $\mathbb{R}^M$ are no closer and no farther away that they are in $\mathbb{R}^N$. The statement "points in higher dimension will tend to have greater distances" does not make much sense, because - as you can see, if these are the same points, in the sense that they were embedded into the higher dimensional space from a lower-dimensional one, then their distance is the same, whereas if they are different points, then there is no base of comparison.

2
On

The answer really depends on the use you want to make of this distance comparison.

If your context suggests that you take into account the fact that there are more ways/coordinates of points to differ in high dimensions then the answer from @uniquesolution won't work. You might consider instead letting the distance between two points be the average of the absolute values of the differences of coordinates. That scales the taxicab ($L_1$) metric by the reciprocal of the dimension.

You could scale any metric - perhaps the Euclidean distance (what you call normalizing for the dimension) by the reciprocal of the dimension.

2
On

... points in higher dimensions will tend to have greater distances between them

This is not true, and I think this is where you have a basic misconception.

The Euclidean distance between $(1,1)$ and $(2,2)$ in $\mathbb R^2$ is $\sqrt{1^2+1^2} =\sqrt 2$. And the Euclidean distance between $(1,1,0)$ and $(2,2,0)$, which are two equivalent points embedded in $\mathbb R^3$, is $\sqrt{1^2+1^2+0^2} =\sqrt 2$ - the same distance. And the distance between equivalent points embedded in $\mathbb R^4$, $\mathbb R^5$ etc. is still $\sqrt 2$. The extra terms under the square root make no difference since they are all zero.