I have a question about obtaining uniformly good error estimates for a class of hypothesis functions. The following images are taken from the paper "The Complexity of Learning According to Two Models of a Drifting Environment".
Let $Z$ be a discrete random variable and $P$ some probability distribution on $Z$. If we sample $m$ copies of $Z$ i.i.d., the corresponding product measure is $P^m$. Let $\mathscr{G}$ be some class of functions (possibly countably infinite). In Lemma 6 (in the attached image), each $\sigma_i$ is a Rademacher random variable, i.e. it takes the values $\pm 1$ with probability $0.5$ each. (This trick is known as the permutation trick.)
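As a concrete aside, Rademacher signs are easy to simulate; here is a minimal NumPy sketch (the function name `rademacher` is my own, not from the paper):

```python
import numpy as np

rng = np.random.default_rng(0)

def rademacher(m, rng):
    """Sample a vector of m i.i.d. Rademacher signs: +1 or -1, each with probability 1/2."""
    return rng.choice([-1, 1], size=m)

sigma = rademacher(10_000, rng)
print(sorted(np.unique(sigma).tolist()))  # [-1, 1]
print(sigma.mean())                       # empirical mean, close to 0
```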
My question is how to prove the following lemma (Lemma 6 in the attached paper).
Let $\vec z$ be a vector in $\mathbb{R}^m$ with distribution $\mathscr{P}^m$, where $\mathscr{P}$ is the distribution of each component of $\vec z$. Now generate another $m$ elements, sampled from the same distribution independently of $\vec z$ (the so-called ghost sample, used in the symmetrization trick); call them $\vec u$. Finally, let $\mathscr{U}$ denote the uniform probability measure over $\vec \sigma = (\sigma_1, \dots, \sigma_m)$.
$\mathscr{P}^{2m}\bigl(\bigl\{(\vec z, \vec u) : \exists g \in \mathscr{G},\ \bigl|\sum_i (g(z_i) - g(u_i))\bigr| > \eta m \bigr\}\bigr) \le \sup_{(\vec z, \vec u)} \mathscr{U}\bigl(\bigl\{\vec \sigma : \exists g \in \mathscr{G},\ \bigl|\sum_i \sigma_i (g(z_i) - g(u_i))\bigr| > \eta m \bigr\}\bigr)$
Following are the steps I tried:
For any fixed $g \in \mathscr{G}$, $g(z_i)$ and $g(u_i)$ are independent and identically distributed. Therefore the distribution of $g(z_i) - g(u_i)$ is symmetric about $0$ (meaning $f_X(x) = f_X(-x)$, or $F_X(x) + F_X(-x) = 1$). Hence, when a Rademacher random variable $\sigma_i$, independent of $\vec z$ and $\vec u$, is multiplied with $g(z_i) - g(u_i)$, the product $\sigma_i (g(z_i) - g(u_i))$ has the same distribution as $g(z_i) - g(u_i)$.
Since the terms $g(z_i) - g(u_i)$ are independent across $i$, the sum $\sum_i (g(z_i) - g(u_i))$ has the same distribution (and the same support) as $\sum_i \sigma_i (g(z_i) - g(u_i))$. Therefore $\bigl|\sum_i (g(z_i) - g(u_i))\bigr|$ has the same distribution (and the same support) as $\bigl|\sum_i \sigma_i (g(z_i) - g(u_i))\bigr|$.
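This identical-distribution step can be sanity-checked by Monte Carlo, using a toy discrete distribution on $\{0, \dots, 9\}$ and a toy choice $g(z) = z \bmod 3$ (both are illustrative assumptions, not from the paper): the tail probabilities of the plain and sign-flipped sums should agree up to sampling error.

```python
import numpy as np

rng = np.random.default_rng(1)
m, trials = 20, 50_000
g = lambda z: z % 3                     # toy fixed g (an assumption for illustration)

# z, u: i.i.d. draws from a toy discrete distribution on {0, ..., 9}
z = rng.integers(0, 10, size=(trials, m))
u = rng.integers(0, 10, size=(trials, m))
sigma = rng.choice([-1, 1], size=(trials, m))

plain  = np.sum(g(z) - g(u), axis=1)             # sum_i (g(z_i) - g(u_i))
signed = np.sum(sigma * (g(z) - g(u)), axis=1)   # sum_i sigma_i (g(z_i) - g(u_i))

# tail probabilities P(|S| > eta*m) agree up to Monte Carlo error
eta = 0.5
p_plain  = np.mean(np.abs(plain)  > eta * m)
p_signed = np.mean(np.abs(signed) > eta * m)
print(p_plain, p_signed)
```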
Let $\tilde{\mathscr{P}}$ denote the probability measure induced by the random variable $\bigl|\sum_i (g(z_i) - g(u_i))\bigr|$.
Hence $\mathscr{P}^{2m}\bigl(\bigl\{(\vec z, \vec u) : \bigl|\sum_i (g(z_i) - g(u_i))\bigr| > \eta m \bigr\}\bigr) = (\mathscr{P}^{2m} \otimes \mathscr{U})\bigl(\bigl\{(\vec z, \vec u, \vec \sigma) : \bigl|\sum_i \sigma_i (g(z_i) - g(u_i))\bigr| > \eta m \bigr\}\bigr)$. Conditioning the right-hand side on $(\vec z, \vec u)$, it becomes $\mathbb{E}_{\mathscr{P}^{2m}}\bigl[\mathscr{U}\bigl(\bigl\{\vec \sigma : \bigl|\sum_i \sigma_i (g(z_i) - g(u_i))\bigr| > \eta m \bigr\}\bigr)\bigr]$. Since an expectation never exceeds the supremum, we obtain the following inequality:
$\mathscr{P}^{2m}\bigl(\bigl\{(\vec z, \vec u) : \bigl|\sum_i (g(z_i) - g(u_i))\bigr| > \eta m \bigr\}\bigr) \le \sup_{(\vec z, \vec u)} \mathscr{U}\bigl(\bigl\{\vec \sigma : \bigl|\sum_i \sigma_i (g(z_i) - g(u_i))\bigr| > \eta m \bigr\}\bigr)$
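The expectation-versus-supremum step can also be illustrated numerically for a small $m$, where the $\mathscr{U}$-probability can be computed exactly by enumerating all $2^m$ sign vectors (again with a toy $g$ and a toy distribution, which are my assumptions):

```python
import itertools
import numpy as np

rng = np.random.default_rng(2)
m, eta, n_pairs = 8, 0.5, 200
g = lambda z: z % 3                      # toy fixed g (an assumption for illustration)

# all 2^m sign vectors, one per row
signs = np.array(list(itertools.product([-1, 1], repeat=m)))

def sigma_prob(z, u):
    """Exact U-probability that |sum_i sigma_i (g(z_i) - g(u_i))| > eta*m."""
    d = g(z) - g(u)
    return np.mean(np.abs(signs @ d) > eta * m)

# sample (z, u) pairs from a toy discrete distribution on {0, ..., 9}
probs = [sigma_prob(rng.integers(0, 10, m), rng.integers(0, 10, m))
         for _ in range(n_pairs)]

# the expectation over (z, u) is bounded by the supremum over (z, u)
print(np.mean(probs) <= np.max(probs))   # True
```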
However, this is shown for each fixed $g \in \mathscr{G}$, whereas in the main lemma we need to show that
$\mathscr{P}^{2m}\bigl(\bigl\{(\vec z, \vec u) : \exists g \in \mathscr{G},\ \bigl|\sum_i (g(z_i) - g(u_i))\bigr| > \eta m \bigr\}\bigr) \le \sup_{(\vec z, \vec u)} \mathscr{U}\bigl(\bigl\{\vec \sigma : \exists g \in \mathscr{G},\ \bigl|\sum_i \sigma_i (g(z_i) - g(u_i))\bigr| > \eta m \bigr\}\bigr)$
The quantifier $\exists g \in \mathscr{G}$ means taking the union of these events over all of $\mathscr{G}$. How can the required inequality be shown in that case?
Thanks!