Fit a function f on dataset X such that f(X) fits a histogram


I have a dataset $X=\{\boldsymbol{x_1},\boldsymbol{x_2},\dots,\boldsymbol{x_n}\}$ with targets $Y=\{y_1,y_2,\dots,y_n\}$, and I want to learn a function $f$ such that $f(\boldsymbol{x})$ approximates $y$ as closely as possible (under whatever cost measure). The hard part is that, due to limitations of the data source, each $y_i$ is only available as an interval $(y_i^{(1)}, y_i^{(2)})$ that covers the possible range of the true $y_i$, rather than as an exact point. What's more, distinct intervals don't overlap, and they don't necessarily have equal width. To be more specific, they look like this:

(1001, 2000), (2001, 4000), (1001, 2000), (4001, 6000), ...

Now I wonder whether there is a technique suited to this situation. Can I learn a function $f$ such that $f(X)$ agrees with $Y$ as closely as possible? For example, $P[f(\boldsymbol{x})\in (1001, 2000)]$ should resemble $P[y\in (1001, 2000)]$.

I hope I have made myself clear. If not, please point out what is unclear.


edited:

In other words, the problem is regression (fitting a function) given some uncertainty in the observed data, here the targets $y$.
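To make the setup concrete, here is a minimal sketch of one possible formulation, not a definitive method. It assumes (none of this is given above) that $f$ is linear, $f(\boldsymbol{x}) = \boldsymbol{w}^\top\boldsymbol{x} + b$, and that $y = f(\boldsymbol{x}) + \varepsilon$ with Gaussian noise of unknown scale $\sigma$. Under those assumptions, each observation contributes the probability mass that the model puts on its interval, and the parameters are fit by maximum likelihood. The function names `fit_interval_regression` and `interval_probability` are hypothetical helpers made up for this illustration.

```python
import numpy as np
from scipy.optimize import minimize
from scipy.stats import norm


def neg_log_likelihood(params, X1, lower, upper):
    """Negative log-likelihood when each y_i is only known to lie in (lower_i, upper_i).

    Assumes y_i ~ Normal(X1[i] @ w, sigma^2); params = [w..., log(sigma)].
    """
    w, sigma = params[:-1], np.exp(params[-1])
    mu = X1 @ w
    # Probability mass the model assigns to each observed interval.
    p = norm.cdf((upper - mu) / sigma) - norm.cdf((lower - mu) / sigma)
    # Clip to avoid log(0) when an interval is far from the current prediction.
    return -np.sum(np.log(np.clip(p, 1e-300, None)))


def fit_interval_regression(X, lower, upper):
    """Fit a linear f with an intercept; returns (w, sigma), where w[0] is the intercept."""
    lower = np.asarray(lower, dtype=float)
    upper = np.asarray(upper, dtype=float)
    X1 = np.column_stack([np.ones(len(X)), X])
    # Starting point: ordinary least squares on the interval midpoints.
    mid = 0.5 * (lower + upper)
    w0, *_ = np.linalg.lstsq(X1, mid, rcond=None)
    sigma0 = max(np.std(mid - X1 @ w0), 1.0)
    start = np.append(w0, np.log(sigma0))
    res = minimize(neg_log_likelihood, start, args=(X1, lower, upper),
                   method="L-BFGS-B")
    return res.x[:-1], float(np.exp(res.x[-1]))


def interval_probability(w, sigma, X, lo, hi):
    """Model probability that y falls in (lo, hi) for each row of X."""
    mu = np.column_stack([np.ones(len(X)), X]) @ w
    return norm.cdf((hi - mu) / sigma) - norm.cdf((lo - mu) / sigma)
```

With the intervals above, `lower` would hold the left endpoints (1001, 2001, 1001, 4001, ...) and `upper` the right endpoints (2000, 4000, 2000, 6000, ...). Averaging `interval_probability(w, sigma, X, 1001, 2000)` over the dataset gives the model's counterpart of $P[y\in (1001, 2000)]$, which can then be compared with the observed fraction of samples in that bin. If the linear or Gaussian assumption is not reasonable for the data, the same likelihood idea still applies with a different $f$ or noise model.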