Consider the function $f\colon \mathbb{R}^2\to \mathbb{R}$, $f(x_1,x_2)=x_1^2 +x_2$. Assume that I don't know the form of $f$ and I only have a set of $N$ independent "input-output" data pairs $\{\big((x_1^{(i)},x_2^{(i)}),\ f(x_1^{(i)},x_2^{(i)})\big)\}_{i=1}^N$.
I would like to estimate a "right inverse" of $f$. Specifically, I want to estimate from data a function $g\colon \mathbb{R}\to \mathbb{R}^2$ such that $f(g(x))=x$, for all $x\in\mathbb{R}$.
In order to solve this problem I'm formulating the following nonlinear regression problem $$\tag{1} \label{eq:1} \min_{g\in\mathcal{G}} \|g(Y)-X\|, $$ where $\mathcal{G}$ is the set of continuous functions from $\mathbb{R}$ to $\mathbb{R}^2$, $\|\cdot\|$ is, say, the Frobenius norm, $X:=\begin{bmatrix}x_1^{(1)} & \cdots & x_1^{(N)}\\ x_2^{(1)} & \cdots & x_2^{(N)}\end{bmatrix}$, $Y:=\begin{bmatrix}f(x_1^{(1)},x_2^{(1)}) & \cdots & f(x_1^{(N)},x_2^{(N)})\end{bmatrix}$, and $g(Y)$ denotes $g$ applied entrywise to the vector $Y$, so that $g(Y)$ is a $2\times N$ matrix.
Let $\hat{g}$ be a minimizer of \eqref{eq:1}.
My question: Does $\hat{g}$ converge to a "right inverse" of $f$ as $N\to\infty$? In other words, do we have $\hat{g}(f(x))=x$ for all $x\in\mathbb{R}$ once $N$ is large enough?
If $f$ is a linear function, then \eqref{eq:1} boils down to a linear regression problem and the answer to my question is in the affirmative. However, I do not know whether this also holds in the nonlinear case (as in my simple example above). Numerical simulations suggest that it does not, but I would like to understand why. (Note that, from a numerical viewpoint, I approximate $g$ using a finite set of basis functions.)
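For reference, here is a minimal sketch of the kind of experiment I ran (the polynomial basis, degree, and sampling range are illustrative choices, not essential):

```python
# Minimal sketch of problem (1): fit g: R -> R^2 by least squares on a
# polynomial basis in y, for f(x1, x2) = x1^2 + x2.
import numpy as np

rng = np.random.default_rng(0)

def f(x1, x2):
    return x1**2 + x2

N = 2000
X = rng.uniform(-2.0, 2.0, size=(2, N))  # column i is the input (x1^(i), x2^(i))
Y = f(X[0], X[1])                        # observed outputs, shape (N,)

# Degree-5 polynomial features of y; with the Frobenius norm, (1) splits into
# two independent least-squares problems, one per component of g.
Phi = np.vander(Y, 6)                            # (N, 6) design matrix
W, *_ = np.linalg.lstsq(Phi, X.T, rcond=None)    # (6, 2) coefficients

def g_hat(y):
    return np.vander(np.atleast_1d(y), 6) @ W    # (m, 2)

# Right-inverse check: is f(g_hat(y)) close to y?
y_test = np.linspace(0.0, 3.0, 7)
G = g_hat(y_test)
print(np.c_[y_test, f(G[:, 0], G[:, 1])])
```

The two printed columns disagree noticeably, consistent with what I observe in my simulations.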
I'm sorry if my question is not very rigorous, but I've thought a lot about this problem with no luck. So, I would really appreciate any comment or feedback. Thanks!
Firstly, I don't think this qualifies as an inverse problem; see here for a definition.
Secondly, the problem here is that $f$ is non-injective (it maps $\mathbb{R}^2$ to $\mathbb{R}$, going from dimension 2 down to 1). For example, $f(1, 0) = f(-1, 0) = f(0, 1) = 1$, so it is impossible for any function $g$ to map $1$ back to all three (at least) of these inputs; indeed you would need $$ g(1) = g(f(1, 0)) = g(f(-1, 0)) = g(f(0, 1)). $$ Note that the condition you actually test, $g(f(x))=x$, defines a *left* inverse, and a non-injective $f$ has no left inverse. So I would say you have no guarantee of convergence, since the function you are asking $\hat{g}$ to converge to does not exist.
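A quick numerical illustration of what this does to your regression (a sketch, assuming the squared Euclidean norm in \eqref{eq:1}): if the data set contains several preimages of the same output, the least-squares-optimal value of $g$ at that output is their average, which is generally not a preimage itself.

```python
import numpy as np

# Three preimages of y = 1 under f(x1, x2) = x1^2 + x2.
preimages = np.array([[1.0, 0.0], [-1.0, 0.0], [0.0, 1.0]])

# With squared-error loss, the optimal g(1) is the mean of these targets...
g1 = preimages.mean(axis=0)     # [0, 1/3]

# ...which is not itself a preimage of 1: f(g(1)) = 1/3, not 1.
print(g1, g1[0]**2 + g1[1])
```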
Thirdly, I am surprised that it "worked" for a linear function $f$: a linear map from $\mathbb{R}^2$ to $\mathbb{R}$ is still non-injective (its kernel is at least one-dimensional), so the same problem arises. What exactly did you mean by "worked" in this case?
Finally, maybe you really meant the right inverse you defined at the start, $f(g(x))=x$, rather than the condition $\hat{g}(f(x))=x$ you state later; in that case the problem is a bit different: you would need $f$ to be surjective (which is the case in your example), and a right inverse then does exist. We can talk more if you clarify this point.
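As a side note, for your specific $f$ a right inverse even has a closed form, e.g. $g(y)=(0,y)$, since $f(0,y)=y$; the point is that \eqref{eq:1} is not forced to find it. A one-line check:

```python
def f(x1, x2):
    return x1**2 + x2

def g(y):
    return (0.0, y)   # one valid right inverse: f(0, y) = y

assert all(f(*g(y)) == y for y in [-3.0, 0.0, 1.0, 2.5])
```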