Rademacher complexity definition

249 Views Asked by At

I am learning the Rademacher complexity and have a question on the definition of it. Let $\mathcal{H}$ be a hypothesis class and $m$ is the number of iid samples from a distribution $\mathcal{D}$, say, $S=\{x_i\}_{i=1}^m$. Then the Rademacher complexity is defined to be $$ R_s(\mathcal{H}) := \mathbb{E}_{\sigma} \sup_{h \in \mathcal{H}} \frac{1}{m}\left|\sum_{i=1}^m \sigma_i \ell(h,x_i) \right|, $$ where $\sigma_i$'s are iid uniform on $\{-1, 1\}$. It seems that the more appropreate definition should be $$ \tilde{R}_s(\mathcal{H}) := \mathbb{E}_{\sigma} \sup_{h \in \mathcal{H}}\left| \frac{1}{|S_+|}\sum_{i \in S_+} \ell(h,x_i) - \frac{1}{|S_-|}\sum_{i \in S_-} \ell(h,x_i)\right|, $$ where $S_+ = \{i \in [m] | \sigma_i = 1\}$ and $S_+ = \{i \in [m] | \sigma_i = -1\}$. This is because the cross-validation should take the normalization constants into account. But I am not sure why the definition does not reflect the normalization constants.

Any comments/suggestions/answers will be very appreciated.

1

There are 1 best solutions below

3
On

The Rademacher complexity is not directly related to the cross-validation . If you want an intuitive explanation for RC think of it as the degree in which a family of functions can fit purely random noise .