Notation for the expectation integral in (empirical) risk minimization


In statistical learning theory, the central goal is risk minimization: we want to minimize $$ R(f)=\mathbb{E}[L(f(x),y)]=\int L(f(x),y)\,dP(x,y), $$ where $P(x,y)$ is described as the "joint probability distribution".
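To make the expectation concrete, here is a minimal sketch of the empirical counterpart of $R(f)$: the average loss over samples drawn from the joint distribution. Everything specific in it is my own assumption for illustration (the toy distribution with $x$ uniform on $[0,1]$ and $y=2x+\text{noise}$, the squared loss, the predictor $f(x)=2x$, and the helper names `sample_pair` and `empirical_risk`).

```python
import random


def squared_loss(prediction, target):
    # L(f(x), y): squared error, chosen only for illustration.
    return (prediction - target) ** 2


def f(x):
    # Hypothetical predictor f(x) = 2x.
    return 2.0 * x


def sample_pair(rng, noise_std=0.1):
    # Draw one (x, y) from an assumed toy joint distribution:
    # x ~ Uniform(0, 1), y = 2x + Gaussian noise.
    x = rng.random()
    y = 2.0 * x + rng.gauss(0.0, noise_std)
    return x, y


def empirical_risk(predictor, pairs):
    # Sample average of the loss, i.e. the empirical version of E[L(f(x), y)].
    return sum(squared_loss(predictor(x), y) for x, y in pairs) / len(pairs)


rng = random.Random(0)
pairs = [sample_pair(rng) for _ in range(100_000)]
risk_estimate = empirical_risk(f, pairs)
print(risk_estimate)  # should be close to noise_std**2 = 0.01
```

For this toy setup the true risk is just the noise variance, $0.1^2 = 0.01$, so the sample average should land near that value.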

Does $P(x,y)$ here denote the joint probability density/mass function (pdf/pmf) or the joint cumulative distribution function (CDF)?

I understand $R(f)=\mathbb{E}[L(f(x),y)]$, but I think the integral should be something like $$ \int_{\mathcal{X}\times\mathcal{Y}}~L(f(x),y)~p(x,y)~d?? $$

where $\mathcal{X}\times\mathcal{Y}$ is the space of pairs $(x,y)$, but I don't know what to write in place of the $d??$.

It seems that the $dP(x,y)$ in $\int L(f(x),y)\,dP(x,y)$ is not the "change of variable" from undergraduate calculus, $$ \mathbb{E}[f(X)]=\int_{-\infty}^\infty f(x)g'(x)\,dx=\int_{-\infty}^\infty f(x)\,dg(x), $$ which is a Riemann–Stieltjes integral, where $g(x)$ is the CDF and $g'(x)$ is the pdf.
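As a sanity check on the one-dimensional identity above, here is a minimal numerical sketch comparing the density form $\int f(x)g'(x)\,dx$ with the Riemann–Stieltjes form $\int f(x)\,dg(x)$. The concrete choices are assumptions for illustration only: $g$ is the standard normal CDF, $f(x)=x^2$ (so both integrals should equal the variance, 1), and the integration range is truncated to $[-8, 8]$.

```python
import math


def f(x):
    # Test function; E[X^2] = 1 for a standard normal X.
    return x * x


def g(x):
    # Standard normal CDF, written via the error function.
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def g_prime(x):
    # Standard normal pdf, the derivative of g.
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)


def density_form(lo=-8.0, hi=8.0, n=16_000):
    # Midpoint Riemann sum for the integral of f(x) * g'(x) dx.
    step = (hi - lo) / n
    total = 0.0
    for i in range(n):
        mid = lo + (i + 0.5) * step
        total += f(mid) * g_prime(mid) * step
    return total


def stieltjes_form(lo=-8.0, hi=8.0, n=16_000):
    # Riemann-Stieltjes sum: f at a tag point times the increment of g.
    step = (hi - lo) / n
    total = 0.0
    for i in range(n):
        left = lo + i * step
        right = left + step
        mid = 0.5 * (left + right)
        total += f(mid) * (g(right) - g(left))
    return total


density_value = density_form()
stieltjes_value = stieltjes_form()
print(density_value, stieltjes_value)  # both should be close to 1
```

Note that the second sum only uses increments of $g$ and never needs the pdf, which is why the $dg(x)$ notation still makes sense when $g$ is not differentiable.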

So I guess it is not a Riemann–Stieltjes integral (not the "change of variable", i.e. integration by substitution) but a Lebesgue integral? However, I have very little knowledge of Lebesgue integration and measure theory.