When I read journal articles on machine learning, I often come across integrals over distributions.
In an article I am reading now, for example, a risk function associated with a distribution $\phi$ is defined by $$ R_i(\theta) = \int f_L(f_\theta(x),y) \, d\phi(x,y),$$ where
- $\mathcal{X}$ and $\mathcal{Y}$ are a feature space and a label space, respectively,
- $f_\theta:\mathcal{X}\to\mathcal{Y}$ is a given model parameterized by $\theta\in\Theta$,
- $f_L:\mathcal{Y}\times\mathcal{Y}\to\mathbb{R}_{\ge0}$ is a loss function (it compares the prediction $f_\theta(x)\in\mathcal{Y}$ with the label $y\in\mathcal{Y}$), and
- $\phi$ is the data-generating distribution.
Beyond this example (and in other fields as well), I have seen such integration formulas over distributions many times, but whenever I encounter them I cannot grasp what they mean.
Rather, I am familiar with the following form: $$ \int f_L(x) \, p_X(x) \, dx, $$ where $f_L(x)$ is a cost (or reward) incurred by an event $x$, and $p_X(x)$ is the probability density of the event $x$.
Can someone please let me know what the integral over a distribution means?
I think the missing piece is the Riemann–Stieltjes integral.
In probability theory, the expectation under a discrete or a continuous distribution can be written uniformly as a Riemann–Stieltjes integral: $$ E_X[f(X)] = \int f(x)\,dF(x), $$ where $F(x)$ is the cumulative distribution function of the random variable $X$. When $X$ has a density $p$, you can roughly think of $dF(x)$ as $p(x)\,dx$, which recovers the familiar form $\int f(x)\,p(x)\,dx$; when $X$ is discrete, the integral reduces to the sum $\sum_x f(x)\,P(X=x)$.
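As a small numerical sketch (my own illustration, not from any article): the left-endpoint Riemann–Stieltjes sum $\sum_i f(x_i)\,[F(x_{i+1}) - F(x_i)]$ approximates $\int f\,dF$ using only the CDF $F$, never a density, and the same code handles a continuous and a discrete distribution.

```python
import math

def stieltjes_sum(f, F, a, b, n):
    """Left-endpoint Riemann-Stieltjes sum of f dF over [a, b]:
    each subinterval [x0, x1] is weighted by the probability mass
    F(x1) - F(x0) that the distribution assigns to it."""
    total = 0.0
    for i in range(n):
        x0 = a + (b - a) * i / n
        x1 = a + (b - a) * (i + 1) / n
        total += f(x0) * (F(x1) - F(x0))
    return total

# Continuous case: X ~ Exp(1), with CDF F(x) = 1 - exp(-x).
# The exact value of E[X^2] is 2.
F_exp = lambda x: 1.0 - math.exp(-x)
e_x2 = stieltjes_sum(lambda x: x * x, F_exp, 0.0, 40.0, 200_000)

# Discrete case: X ~ Bernoulli(0.3). The CDF is a step function that
# jumps at 0 and 1; the same sum picks up the jumps, and E[X] = 0.3.
def F_bern(x):
    if x < 0.0:
        return 0.0
    if x < 1.0:
        return 0.7
    return 1.0

e_x = stieltjes_sum(lambda x: x, F_bern, -1.0, 2.0, 200_000)

print(e_x2, e_x)  # approximately 2.0 and 0.3
```

The point is that the jumps of the step-function CDF contribute the probability masses $P(X=x)$, while the smooth CDF contributes $p(x)\,dx$, so one formula covers both cases.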
I hope this is useful to you.