Imagine two spaces:
- An ‘input’ space with dimension $m$.
- An ‘output’ space with dimension $n$.
- $m \geq n$
In each space a set of points is defined, each representing some characteristic. The characteristic is meaningful and valid in both spaces.
An example* might use RGB (Red, Green, Blue) values for the input and HSV (Hue, Saturation, Value) for the output.
(*the actual solution needs to generalize to accommodate arbitrary dimensions)
The characteristic of ‘greeny-ness’ is defined in the input space as the vector: $[0,255,0]$, and in the output space as: $[120,100,100]$.
‘red’, ‘black’, ‘yellow’ and ‘random colour that looks the same in both spaces’ could be similarly defined.
Imagine now that a limited subset of colors has been defined in this way – i.e. there are $p$ pairs of $m$-dimensional vectors coupled with their corresponding $n$-dimensional vectors ($m=n=3$ in this case).
The problem:
Given an arbitrary input vector, find (interpolate) the corresponding point in the output space that most exemplifies the 'characteristic' of that point (in the input space).
Using the color example, I might have all 8 corners of an RGB color cube defined as points on the input side – and their corresponding HSV values coupled with them as follows:
$$[0,0,0] \longleftrightarrow [0,0,0]$$
$$[255,0,0] \longleftrightarrow [0,100,100]$$
$$[0,255,0] \longleftrightarrow [120,100,100]$$
$$[255,255,0] \longleftrightarrow [60,100,100]$$
$$[0,0,255] \longleftrightarrow [240,100,100]$$
$$[255,0,255] \longleftrightarrow [300,100,100]$$
$$[0,255,255] \longleftrightarrow [180,100,100]$$
$$[255,255,255] \longleftrightarrow [0,0,100]$$
Given $[128,128,128]$ (‘grey’) as the input point in the input space, I’d expect to be able to find $[0,0,50]$ (‘grey’ in HSV) in the output space.
I know that $[128,128,128]$ is right in the middle of the RGB cube, with the Euclidean distances to all 8 corners being approximately $\frac{\sqrt{3}}{2} \times 256$. It's also worth noting that while each RGB value ranges over 8 bits (0–255), the HSV values range over 360°, 100 & 100 respectively...
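As a quick numerical sanity check of that distance claim (a sketch; the exact centre of $[0,255]^3$ is $127.5$, so the corner distances actually split into $127\sqrt{3}$ and $128\sqrt{3}$, both close to $\frac{\sqrt{3}}{2}\times 256$):

```python
# Distances from 'grey' to the 8 corners of the RGB cube.
import itertools
import math

import numpy as np

grey = np.array([128, 128, 128])
corners = np.array(list(itertools.product([0, 255], repeat=3)))
dists = np.linalg.norm(corners - grey, axis=1)

print(sorted(dists))           # all roughly 220-222
print(math.sqrt(3) / 2 * 256)  # = 128*sqrt(3), ~221.7
```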
Yes, there are known RGB$\rightarrow$HSV routines – I just use this example because it is easy to visualize – but in the real application the dimensionality would be more like 70 input parameters ($m=70$) mapping to 20 output parameters ($n=20$), with possibly up to 50 coupled points defined ($p=50$).
So far I’ve tried:
- Using inverted Euclidean (or Manhattan) norms on the input side to inform weighted interpolations on the output.
- Building simultaneous equations from Euclidean norms (‘hyper-spheres'!) and solving them with non-linear least squares (trilateration in higher dimensions, with over-fitting).
- Using PCA dimensionality reduction on the input space to ensure $m=n$.
Each of these has had practical success of sorts (especially when the $p$ coupled pairs are consistent within their spaces – and the more of them, the better).
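For concreteness, here is a minimal sketch of the first approach (inverse-distance / Shepard weighting) for arbitrary $m$, $n$ and $p$ — the names `idw_interpolate`, `power` and `eps` are mine, introduced for illustration:

```python
import numpy as np

def idw_interpolate(x, inputs, outputs, power=1.0, eps=1e-12):
    """Map an m-vector x to an n-vector via inverse-distance weights.

    inputs:  (p, m) array of input-space anchor points
    outputs: (p, n) array of the coupled output-space points
    power:   exponent on the distance; larger values localize the estimate
    """
    x = np.asarray(x, dtype=float)
    d = np.linalg.norm(inputs - x, axis=1)
    if np.any(d < eps):                        # query sits on an anchor point
        return outputs[np.argmin(d)].astype(float)
    w = 1.0 / d**power
    w /= w.sum()
    return w @ outputs

# The RGB corners coupled with their HSV values, as in the example above.
rgb = np.array([[0, 0, 0], [255, 0, 0], [0, 255, 0], [255, 255, 0],
                [0, 0, 255], [255, 0, 255], [0, 255, 255], [255, 255, 255]])
hsv = np.array([[0, 0, 0], [0, 100, 100], [120, 100, 100], [60, 100, 100],
                [240, 100, 100], [300, 100, 100], [180, 100, 100], [0, 0, 100]])

# Note: this lands far from the hoped-for [0, 0, 50] — the near-equal corner
# distances simply average the hues (an angle!), illustrating the failure mode.
print(idw_interpolate([128, 128, 128], rgb, hsv))
```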
But there are always examples where the solution falls apart, e.g. with $m=n=2$, $p=4$ and the coupled vectors:
$$[0,0]_{\mathbf{i}_1} \longleftrightarrow [-10,10]_{\mathbf{o}_1}$$
$$[100,0]_{\mathbf{i}_2} \longleftrightarrow [10,-10]_{\mathbf{o}_2}$$
$$[0,100]_{\mathbf{i}_3} \longleftrightarrow [-10,140]_{\mathbf{o}_3}$$
$$[100,100]_{\mathbf{i}_4} \longleftrightarrow [110,110]_{\mathbf{o}_4}$$
(Note this is not an RGB>HSV example)
With least squares, the solution (the black dot on the right-hand plot) for the input point $[10,5]$ should be closer to the pre-defined point $\mathbf{o}_1$:
(Note how close the input – the black point on the left – is to $\mathbf{i}_1$.)
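For what it's worth, a quick check (assuming plain inverse-square-distance weighting, i.e. the first approach above with power 2) suggests simple IDW does pull the estimate toward $\mathbf{o}_1$ for this input:

```python
import numpy as np

# The four coupled pairs from the example above.
inputs  = np.array([[0, 0], [100, 0], [0, 100], [100, 100]], dtype=float)
outputs = np.array([[-10, 10], [10, -10], [-10, 140], [110, 110]], dtype=float)

x = np.array([10.0, 5.0])
w = 1.0 / np.linalg.norm(inputs - x, axis=1) ** 2   # inverse-square weights
w /= w.sum()
y = w @ outputs

print(y)   # ~ [-8.9, 12.1]; its nearest predefined output is o1 = [-10, 10]
```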
My ad-hoc patches simply lead me to chase my tail - so...
My questions:
While I’m aware that the nature of interpolation excludes precision, I ask:
What approach would get me closest to a solution that generalizes for all dimensions and inputs (both within and outside the convex hull defined by the $p$ coupled points)?
Is there some other function of the input point, with respect to the points pre-defined in the input space, from which I could glean information allowing a more direct solution?
Is a direct and analytical approach even possible, or will I have to rely on measures of success via machine learning methods?


If no one else answers the bounty (?), I'll go with my own lead:
https://en.wikipedia.org/wiki/Radial_basis_function_network
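For anyone following that lead: the interpolation flavour of the RBF idea is available off the shelf (assuming SciPy ≥ 1.7 for `RBFInterpolator`). A minimal sketch on the troublesome $m=n=2$, $p=4$ example above:

```python
import numpy as np
from scipy.interpolate import RBFInterpolator

# The four coupled pairs from the counter-example.
inputs  = np.array([[0, 0], [100, 0], [0, 100], [100, 100]], dtype=float)
outputs = np.array([[-10, 10], [10, -10], [-10, 140], [110, 110]], dtype=float)

# kernel and smoothing are tuning choices; thin-plate splines are a common default.
f = RBFInterpolator(inputs, outputs, kernel='thin_plate_spline', smoothing=0.0)

print(f(inputs))                    # reproduces the four output vectors (to precision)
print(f(np.array([[10.0, 5.0]])))   # estimate for the query point
```

With `smoothing=0` the interpolant reproduces every coupled pair exactly; the kernel and smoothing are the obvious knobs if the pairs are noisy or inconsistent.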