How is the union bound applied in this proof?

355 Views Asked by At

From Understanding Machine Learning: From theory to algorithms:

How explicitly is the union bound used in the proof of this theorem to get the result in the red box below?

enter image description here

1

There are 1 best solutions below

4
On

Let your events be $A_n = \{ |L_D(h) - L_S(h)| < \epsilon_n$ holds for all $h \in \mathcal{H}_n \}$.

$P(A_n^c) < \delta_n $ as per the previous paragraph, where I am using $c$ to denote complement. Now, we are looking for the event that all $A_n$'s happen, i.e. for the $P(\bigcap_{i=1}^n A_n) = 1 - P(\bigcup_{i=1}^n A_n^c)$. Now, using union bound on that last term we get $P(\bigcup_{i=1}^n A_n^c) \leq \sum_i P(A_i^c) \leq \sum_i \delta_i$, so $P(\bigcap_{i=1}^n A_n)$ is at least $1 - \sum_i \delta_i$.


Adding the elements of the set as per OPs request: the theorem says that the space we are working in is $\Omega = \mathcal{D}^n$, and we pick some $S$ in it. So, more formally, we have $A_n = \{ S \in \mathcal{D}^n: |L_D(h) - L_S(h)| < \epsilon_n$ holds for all $h \in \mathcal{H}_n \}$.