In machine learning, we want to train a model. When the dimension of the data is high, we run into a problem (the curse of dimensionality), so we want to reduce the dimension of our data.
We know that $\mathbb{R}^n$ and $\mathbb{R}$ have the same cardinality, so there is always some space-filling curve that maps each point uniquely in both directions; i.e., we can always bijectively map any $n$-dimensional data to one dimension. So what is the problem we have in the first place?
I can think of two problems that may remain:
- With this space-filling curve map, we can't reduce the size of the data; i.e., we have to increase the precision when we write it in one dimension.
- This is where I have doubts. I am thinking that representing data in $\mathbb{R}^n$ gives us more information than representing it in $\mathbb{R}$. There is structure when we write data in $\mathbb{R}^n$, such as which point is near which point, and that is lost under this map (the space-filling curve); i.e., the map does not preserve this structure (it is not a homeomorphism).
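To make the first point concrete, here is a toy sketch of my own (not from the original post) using bit interleaving, a Morton / Z-order code, as a stand-in for the curve: two $k$-bit coordinates map injectively to a single $2k$-bit index, so the one-dimensional representation needs twice the precision, and spatial neighbors can already land far apart.

```python
def interleave(a: int, b: int, bits: int = 16) -> int:
    """Morton (Z-order) code: weave the bits of two coordinates into a
    single integer. Injective, but the output needs 2 * bits bits --
    the precision doubles."""
    d = 0
    for i in range(bits):
        d |= ((a >> i) & 1) << (2 * i)      # bits of a at even positions
        d |= ((b >> i) & 1) << (2 * i + 1)  # bits of b at odd positions
    return d

# Injective on a 16 x 16 grid: all 256 codes are distinct.
codes = {interleave(a, b, bits=4) for a in range(16) for b in range(16)}
assert len(codes) == 256

# Neighbors in 2D can be far apart in 1D: (0, 7) and (0, 8) are
# adjacent cells, but their codes differ by a lot.
print(interleave(0, 7), interleave(0, 8))  # 42 128
```

This is only the first, crude symptom; the neighborhood distortion is exactly the loss of structure I am asking about in the second point.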
My questions:
- Is my thinking correct?
- I don't know exactly what information I am talking about in the second point. Could you help me make it more rigorous?
- Is there any other problem?
Example:
Suppose we have training data $x^i \in \mathbb{R}^n$ with labels $y^i \in \mathbb{R}$, where $i \in \{1,2,\dots,N\}$, and we train a neural network to fit this data. If we change the order of the coordinates, i.e., for
$$x^i = (x_1^i, x_2^i, \dots, x_n^i)$$
we instead take
$$\tilde{x}^i = (x_{\rho(1)}^i, x_{\rho(2)}^i, \dots, x_{\rho(n)}^i)$$
where $\rho$ is some permutation (the same for all $i$), then the training of the neural network is unaffected. So the order of coordinates doesn't matter. Then what if I transform all the data from $\mathbb{R}^n$ to $\mathbb{R}$? That shouldn't matter either. But it does matter; otherwise there would be no such thing as the curse of dimensionality.
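Here is a small sketch of my own (not part of the original question) of why a fixed coordinate permutation is harmless: permuting the input coordinates is equivalent to permuting the columns of the first weight matrix, so the family of functions the network can represent is unchanged.

```python
import numpy as np

rng = np.random.default_rng(0)
n, hidden = 5, 4
x = rng.standard_normal(n)             # one input point
W1 = rng.standard_normal((hidden, n))  # first-layer weights

rho = rng.permutation(n)               # a fixed permutation of coordinates

# Permute the input AND the matching weight columns:
h_original = np.tanh(W1 @ x)
h_permuted = np.tanh(W1[:, rho] @ x[rho])

# The hidden activations are identical, so the rest of the network
# never "sees" the permutation.
assert np.allclose(h_original, h_permuted)
```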
I think that when we transform the data from $\mathbb{R}^n$ to $\mathbb{R}$, we lose some information or do something wrong. What is it?
The fact that $\mathbb{R}$ and $\mathbb{R}^n$ have the same cardinality is seriously overrated and, quite frankly, not really useful information. For example, from a set-theoretic perspective there is no difference between a sphere and a line, but I'm sure you agree that these sets (treated as metric subspaces) are quite different, and we need different machinery (models) to describe them.
So this boils down to the fact that once you look at $\mathbb{R}^n$ as a mere set, instead of as a richer structure (a normed vector space), you lose important information. Indeed, the information we are mostly interested in is metric/topological, and sometimes the algebraic structure.
Note that "good" transformations are possible and are used in practice, e.g. those that preserve the metric (isometries) or at least the topology (homeomorphisms). But $\mathbb{R}^n$ is not homeomorphic to $\mathbb{R}^m$ for $n \neq m$, let alone isometric. One example of such a "good" transformation is the one you've given us: a permutation of coordinates (but only if a fixed permutation is applied to the entire data set). It is an isometry from $\mathbb{R}^n$ to itself.
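A quick numerical check of the isometry claim, on toy data of my own:

```python
import numpy as np

rng = np.random.default_rng(1)
x = rng.standard_normal(6)
y = rng.standard_normal(6)
rho = rng.permutation(6)

# Applying the same coordinate permutation to every point preserves
# Euclidean distances: it is an isometry of R^n.
assert np.isclose(np.linalg.norm(x - y),
                  np.linalg.norm(x[rho] - y[rho]))
```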
Also note that space-filling curves are not useful here either, since they are not injective. So you lose even more information: you now map different points to the same point. Moreover (and arguably more importantly), space-filling curves generally behave wildly: they deform shapes beyond recognition, they are hard to compute, and they are generally unstable to use. Maybe you can do some analysis after such a reduction, but I don't think anyone has tried it in a serious way (I might be wrong, though).
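To see the wild behavior concretely, here is a sketch (my own code, not from the answer) using the standard discrete Hilbert-curve index `xy2d` on an $8\times 8$ grid. On a finite grid the map is actually a bijection, yet cells that are adjacent in 2D can sit far apart along the curve, which is the shape-deforming behavior described above.

```python
def xy2d(n: int, x: int, y: int) -> int:
    """Standard Hilbert-curve index of cell (x, y) on an n x n grid
    (n a power of two)."""
    d = 0
    s = n // 2
    while s > 0:
        rx = 1 if (x & s) > 0 else 0
        ry = 1 if (y & s) > 0 else 0
        d += s * s * ((3 * rx) ^ ry)
        if ry == 0:          # rotate the quadrant
            if rx == 1:
                x = s - 1 - x
                y = s - 1 - y
            x, y = y, x
        s //= 2
    return d

n = 8
# On a finite grid the index is a bijection onto {0, ..., n*n - 1}.
assert {xy2d(n, x, y) for x in range(n) for y in range(n)} == set(range(n * n))

# But locality is distorted: some pairs of cells at 2D distance 1
# have indices that are far apart along the curve.
gaps = [abs(xy2d(n, x, y) - xy2d(n, x + 1, y))
        for x in range(n - 1) for y in range(n)]
print(max(gaps))  # much larger than 1
```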
In general, there is no way to reduce the dimension of an arbitrary data set while preserving the distances between points. Hence the curse. Sometimes, however, we do apply dimensionality reduction and simply accept that we lose some information: we move points around slightly, or perhaps remove some points, so that the data set changes but its general shape stays more or less the same. As long as the loss is not too big, we are fine with it. Note that these techniques are far from trivial.
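One concrete instance of "moving points around slightly" is random projection in the spirit of the Johnson–Lindenstrauss lemma; the sketch below is my own illustration, not part of the answer. Distances are not preserved exactly, but with enough target dimensions they are distorted only a little.

```python
import numpy as np

rng = np.random.default_rng(0)
n_points, d_high, d_low = 50, 1000, 200

X = rng.standard_normal((n_points, d_high))
# Random Gaussian projection, scaled so squared distances are
# preserved in expectation.
P = rng.standard_normal((d_high, d_low)) / np.sqrt(d_low)
Y = X @ P

def pairwise(Z):
    """All pairwise Euclidean distances between the rows of Z."""
    return np.linalg.norm(Z[:, None, :] - Z[None, :, :], axis=-1)

D_high, D_low = pairwise(X), pairwise(Y)
mask = ~np.eye(n_points, dtype=bool)
ratios = D_low[mask] / D_high[mask]

# Every pairwise distance survives the 1000 -> 200 reduction
# with only modest distortion.
print(ratios.min(), ratios.max())  # both close to 1
```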