I'm not a statistician, so please forgive me if this is a silly question, but it is a real problem in my research. Suppose we define the risk in a regression problem as $R(f)=\int l(f(x),y) p(x) dx$, where we want to estimate $f$ from the data. The function $l$ is a loss function that measures the discrepancy between the output of $f$ and the provided label $y$ for every $x$.
If we sample i.i.d., we can substitute $p(x)=\frac{1}{n} \sum_{i=1}^{n} \delta(x-x_i)$, where $\delta$ is the Dirac delta function. The result is empirical risk minimization (ERM), i.e. $R(f)=\frac{1}{n} \sum_{i=1}^{n} l(f(x_i),y_i)$. But if our samples are not i.i.d., it is not possible to write $p(x)$ as above, so ERM in this simple form is not possible.
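To make the i.i.d. case concrete, here is a minimal sketch of the empirical risk as a plain average of per-sample losses. The squared loss, the toy data, and the names `empirical_risk` and `squared_loss` are placeholders chosen for illustration only.

```python
import numpy as np

def squared_loss(pred, y):
    # One possible choice of l(f(x), y); any other loss could be used instead.
    return (pred - y) ** 2

def empirical_risk(f, x, y, loss):
    # (1/n) * sum_i l(f(x_i), y_i): every sample gets the same weight 1/n.
    return float(np.mean(loss(f(x), y)))

# Toy data (assumed for illustration): labels follow y = 2x exactly.
x = np.array([0.0, 1.0, 2.0, 3.0])
y = np.array([0.0, 2.0, 4.0, 6.0])

risk_good = empirical_risk(lambda t: 2.0 * t, x, y, squared_loss)
risk_bad = empirical_risk(lambda t: t, x, y, squared_loss)
print(risk_good, risk_bad)
```

The equal weights $1/n$ are exactly where the i.i.d. assumption enters.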
The problem is what we can do in the case of non-i.i.d. samples. More specifically, if our samples are not independent, what can be done? How can we write $p(x)$ in a form similar to the one above that accounts for the effect of non-i.i.d. samples? If I have misunderstood something or posed the question wrongly, please comment. Also, is there any source or material in the literature directly related to this problem?
Generating samples from distributions that don't have a simple form is a common problem in many fields, and so it has been studied a lot. A few techniques you might want to look into are,
It sounds like importance sampling would be the best fit for you, since you have a way to generate samples from an approximation, but that depends on further details of the problem.
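As a rough illustration of the importance sampling idea, the sketch below reweights losses computed on samples from a proposal density $q$ by $p(x)/q(x)$ to estimate the risk under a target density $p$. Everything specific here is an assumption made for the example: both densities are taken to be known Gaussians, the samples are drawn independently from $q$, and `normal_pdf` and `importance_weighted_risk` are hypothetical helper names.

```python
import numpy as np

def normal_pdf(x, mean, std):
    # Density of a univariate Gaussian, written out to keep the example self-contained.
    return np.exp(-0.5 * ((x - mean) / std) ** 2) / (std * np.sqrt(2.0 * np.pi))

def importance_weighted_risk(losses, x, target_pdf, proposal_pdf, self_normalize=True):
    # Weight each sample by w_i = p(x_i) / q(x_i).
    w = target_pdf(x) / proposal_pdf(x)
    if self_normalize:
        # Self-normalized estimator: usable when p is known only up to a constant.
        return float(np.sum(w * losses) / np.sum(w))
    # Plain estimator: unbiased when both densities are properly normalized.
    return float(np.mean(w * losses))

rng = np.random.default_rng(0)

# Assumed setup: target p = N(1, 1), proposal q = N(0, 2^2), which is wider than p.
x = rng.normal(0.0, 2.0, size=200_000)

# Toy loss: squared loss with f(x) = x and label y = 0, so the true risk is
# E_p[x^2] = var + mean^2 = 1 + 1 = 2.
losses = x ** 2

p = lambda t: normal_pdf(t, 1.0, 1.0)
q = lambda t: normal_pdf(t, 0.0, 2.0)

est_plain = importance_weighted_risk(losses, x, p, q, self_normalize=False)
est_self = importance_weighted_risk(losses, x, p, q, self_normalize=True)
print(est_plain, est_self)
```

Both estimates should land close to 2. The proposal is deliberately wider than the target so that the weights stay bounded; a proposal with lighter tails than the target can make the estimate very noisy.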
There are some pretty good explanations of each of these methods on YouTube if you search for them.