Confusion about the notation of Rademacher Complexity


The online book: https://www.cs.huji.ac.il/~shais/UnderstandingMachineLearning/understanding-machine-learning-theory-algorithms.pdf

In "Understanding Machine Learning: From Theory to Algorithms" by Shai Shalev-Shwartz and Shai Ben-David, Chapter 26 "Rademacher Complexity", Section 26.1, for reasons of notational brevity, the set $\mathcal{F}$ is defined as: $$ \mathcal{F} := l \circ \mathcal{H} := \{z \mapsto l(h,z): h\in \mathcal{H} \} $$

Here $l$ is a loss function, $\mathcal{H}$ is the hypothesis class, and $z$ denotes a labeled example $(x,y)$.

I'm unsure of how to interpret this.

Does it mean "the set of functions that map $\mathcal{Z}$ to $\mathbb{R}$ via the loss function $l$ and some $h\in \mathcal{H}$"? If so, isn't this just the loss function $l$ itself, in which case shouldn't the mapping be from $\mathcal{Z} \times \mathcal{H}$ to $\mathbb{R}$ instead?

Alternatively, could this mean "the set of functions $f_h:\mathcal{Z} \to \mathbb{R}$, one for each $h\in \mathcal{H}$"? But since the dependence on $h$ is not written out in the book's subsequent notation, I have my doubts about this reading.

I would really appreciate it if someone could clarify this.


You can think of $\mathcal{F}$ as the set of hypotheses with the loss function applied to their output: each $f \in \mathcal{F}$ is the map $z \mapsto l(h,z)$ for some fixed $h\in\mathcal{H}$. Hence $\mathcal{F}$ is the set of all possible loss-hypothesis compositions, and your second interpretation is the right one.
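In other words, the definition curries the two-argument loss: fixing $h$ turns $l$ into a one-argument function of $z$ alone, and $\mathcal{F}$ collects one such function per hypothesis:

$$ f_h : \mathcal{Z} \to \mathbb{R}, \qquad f_h(z) := l(h,z), \qquad \mathcal{F} = \{ f_h : h \in \mathcal{H} \}. $$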

You can substitute $f(z)$ in the book by Shai Shalev-Shwartz and Shai Ben-David with $l(h,z)$ and $\sup_{f\in\mathcal{F}}$ with $\sup_{h\in\mathcal{H}}$ to get a form with the hypothesis class explicitly included.
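To make the "one function per hypothesis" reading concrete, here is a minimal sketch (my own toy example, not from the book) with a squared loss and a tiny class of linear predictors $h_w(x) = w x$; `F` is then exactly $l \circ \mathcal{H}$, a list of one-argument functions of $z$:

```python
def squared_loss(h, z):
    """l(h, z) for z = (x, y): the squared error (h(x) - y)**2."""
    x, y = z
    return (h(x) - y) ** 2

# A toy hypothesis class H of linear predictors h_w(x) = w * x.
# (The w=w default argument pins down w for each lambda.)
H = [lambda x, w=w: w * x for w in (0.0, 0.5, 1.0)]

# F = l o H: one function f_h per hypothesis h, each mapping Z -> R.
F = [lambda z, h=h: squared_loss(h, z) for h in H]

z = (2.0, 1.0)                  # a single example z = (x, y)
losses = [f(z) for f in F]      # evaluates l(h, z) for every h in H
print(losses)                   # -> [1.0, 0.0, 1.0]
```

Note that each `f` in `F` takes only `z`; the hypothesis `h` is baked in, which is why the book can suppress the dependence on $h$ and quantify over $f\in\mathcal{F}$ instead.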