The step of swapping terms in the Rademacher bound


On page 28 here: https://ocw.mit.edu/courses/sloan-school-of-management/15-097-prediction-machine-learning-and-statistics-spring-2012/lecture-notes/MIT15_097S12_lec14.pdf the step marked with "why?" is not entirely clear to me. Can someone please explain?

I see that the effect of introducing the Rademacher variables is to randomly swap the terms of the differences inside the sum, and it doesn't change the distribution of the sum. But it can change the g that achieves the supremum, so why is it that in expectation the supremum remains the same? (Would this trick work with any function in place of 'sup'?)

1 Answer

BEST ANSWER

Correct, the supremum is not important here. It is just a complicated instance of "if $X$ and $Y$ have the same distribution, then $E_X[f(X)]=E_Y[f(Y)]$." In your example, $X=(X_g)_{g \in \mathcal{G}}$ where $X_g = \frac{1}{m} \sum_i (g(Z'_i) - g(Z_i))$, and $Y=(Y_g)_{g \in \mathcal{G}}$ where $Y_g =\frac{1}{m} \sum_i \sigma_i (g(Z'_i) - g(Z_i))$. These two processes have the same joint distribution: each $\sigma_i = -1$ simply swaps $Z_i$ with $Z'_i$, and since the pairs $(Z_i, Z'_i)$ are i.i.d. copies, swapping within any pair leaves the distribution of the whole vector $(X_g)_{g \in \mathcal{G}}$ unchanged. Hence $E[\sup_g X_g] = E[\sup_g Y_g]$. And yes, the same argument works with any measurable function of the process in place of the supremum.
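A quick Monte Carlo check makes this concrete. The sketch below (not from the lecture notes; the finite class of threshold functions $g_t(z) = \mathbf{1}[z \le t]$, the uniform distribution for $Z$, and all parameters are illustrative choices) estimates $E[\sup_g X_g]$ and $E[\sup_g Y_g]$ and shows they agree up to simulation error:

```python
import numpy as np

rng = np.random.default_rng(0)
m, n_trials = 20, 20000
# Illustrative finite class G: threshold functions g_t(z) = 1[z <= t]
thresholds = np.array([0.2, 0.4, 0.6, 0.8])

def sup_over_class(z, z_prime, signs):
    # diffs[t, i] = g_t(Z'_i) - g_t(Z_i) for each threshold t
    diffs = (z_prime[None, :] <= thresholds[:, None]).astype(float) \
          - (z[None, :] <= thresholds[:, None]).astype(float)
    # sup over g of (1/m) * sum_i signs_i * (g(Z'_i) - g(Z_i))
    return np.max((signs * diffs).mean(axis=1))

plain, signed = [], []
for _ in range(n_trials):
    z = rng.uniform(size=m)          # Z_1, ..., Z_m
    z_prime = rng.uniform(size=m)    # independent ghost sample Z'_1, ..., Z'_m
    sigma = rng.choice([-1.0, 1.0], size=m)  # Rademacher variables
    plain.append(sup_over_class(z, z_prime, np.ones(m)))   # sup_g X_g
    signed.append(sup_over_class(z, z_prime, sigma))       # sup_g Y_g

# The two Monte Carlo averages agree up to sampling error
print(np.mean(plain), np.mean(signed))
```

Flipping $\sigma_i$ is the same as relabeling which of $Z_i, Z'_i$ is the "ghost" sample, which is why the averages match even though the maximizing $g$ can differ from draw to draw.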