What is the domain of this sum-of-errors random variable?

42 Views Asked by Bumbble Comm At 26 Mar 2026 - 11:09

From Understanding Machine Learning: Theory and Algorithms:

What does the phrase in the red box below mean in terms of set theory?

I see that it means for every $h \in H$ we have $D(|L_s(h) - L_D(h)| \le \epsilon) \ge 1 - \delta$.

But how is $|L_s(h) - L_D(h)| \le \epsilon$ a random variable?

If it's a random variable then in should be of the form $\{a \in A : X(a) \le \epsilon\}$ where $X = |L_s(h) - L_D(h)|$ and $A$ is the sample space.

But what in this definition is $A$?

I know that $L_D(h) = \Bbb E_{z \text{~}D}[l(h,z)] = \sum_{z \in Z}l(h,z)D(z)$ and $L_S(h) = \frac{1}{m}\sum_{z_i \in S} l(h,z)$ where:

$l(h,z)$ is a loss function, $D$ is the distribution on $Z$, and $S$ is a training set.