Why is the Rademacher complexity/distribution named after Hans Rademacher?


In machine learning theory, the Rademacher complexity of a class $\newcommand{\cF}{\mathcal{F}}\newcommand{\E}{\mathbb{E}}\newcommand{\R}{\mathbb{R}} \cF$ of functions $f: X \to \R$ over a particular set of inputs $x_{1:n} \in X^n$ is defined as $$ \operatorname{R}(\cF, x_{1:n}) = \frac{1}{n} \E \left[ \sup_{f \in \cF} \sum_{i=1}^n \sigma_i f(x_i) \right], $$ where $\sigma_1, \dots, \sigma_n$ are independent random variables, each distributed uniformly over $\{-1, +1\}$.
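
For concreteness, this is how I read the definition numerically: a minimal Monte Carlo sketch for a finite function class, evaluated on a fixed sample. The function name `empirical_rademacher`, the matrix layout (one row per function, one column per input), and the number of sign draws are my own illustrative choices and are not taken from [1].

```python
import numpy as np


def empirical_rademacher(values, num_draws=10_000, seed=0):
    """Monte Carlo estimate of (1/n) E[ sup_f sum_i sigma_i f(x_i) ].

    values: array of shape (num_functions, n) with values[j, i] = f_j(x_i).
    Assumes a finite class, so the supremum is a plain maximum.
    """
    values = np.asarray(values, dtype=float)
    n = values.shape[1]
    rng = np.random.default_rng(seed)
    # Each row of sigma is one draw of n independent uniform {-1, +1} signs.
    sigma = rng.choice([-1.0, 1.0], size=(num_draws, n))
    # correlations[d, j] = sum_i sigma[d, i] * f_j(x_i)
    correlations = sigma @ values.T
    # Maximize over the class, average over the sign draws, divide by n.
    return correlations.max(axis=1).mean() / n
```

For a class containing a single function the estimate should be close to zero, since the signs average out. For the class $\{f, -f\}$ with $f \equiv 1$ and $n = 1$, the supremum is always $1$, so the result is exactly $1$.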

The machine learning literature, for example [1], defines the Rademacher complexity without explaining where the name comes from. The variable $\sigma_i$ is called a Rademacher-distributed random variable. However, in a biography of Rademacher [2], the words "distribution" and "random variable" do not occur. It seems that Rademacher was mainly a number theorist.

Why do these objects bear his name? From what I can tell, Rademacher did not introduce the definition $R$ above, and it seems strange to name $R$ after the distribution of $\sigma_i$ when $R$ itself is such a rich construct.


[1] Mehryar Mohri, Afshin Rostamizadeh, and Ameet Talwalkar. Foundations of Machine Learning. 2nd edition, MIT Press, 2018. https://www.dropbox.com/s/7voitv0vt24c88s/10290.pdf?dl=1

[2] Bruce C. Berndt. "Hans Rademacher (1892–1969)." Acta Arithmetica LXI.3, 1992. http://matwbn.icm.edu.pl/ksiazki/aa/aa61/aa6131.pdf