Shattering of a set of binary classifiers

89 Views Asked by At

Let $S$ be a set, and let $\mathcal{F}_{S}=\{f:S\to\{-1,+1\}\}$ be a set of different label assignments. Show that $\mathcal{F}_{S}$ shatters at least $|\mathcal{F}_{S}|$ subsets of $S$.

Here is what I attempted. Consider all maximum shattered subsets of $S$, denoted as $\{S_1,S_2,\cdots, S_{n}\}$, these subsets are maximum shattered in a sense that they can be shattered by $\mathcal{F}_{S}$ but any other subset not in $\{S_1,S_2,\cdots, S_{n}\}$ that includes any subset in $\{S_1,S_2,\cdots, S_{n}\}$ cannot be shattered. So the number of subsets of $S$ shattered by $\mathcal{F}_{S}$ is $\sum_{i=1}^{n}2^{|S_{i}|}-(n-1)$. We also know that for any element $x\in S\setminus(\cup_{i=1}^{n}S_{i})$, $f(x)$ has the same value for all $f\in \mathcal{F}_{S}$ otherwise $\{x\}$ should be a new shattered set and $x$ should be in $\cup_{i=1}^{n}S_{i}$. So the next question is to prove $|\mathcal{F}_{S}|\leq \sum_{i=1}^{n}2^{|S_{i}|}-(n-1)$, and this is the part I don't know how to prove. Can anyone give me some ideas? Thank you!

1

There are 1 best solutions below

0
On BEST ANSWER

We prove this by induction on $|S|$. The base case where $|S|=0$ is easy.

Assuming $|S|\ge 1$, let $s_0$ be a particular element of $S$, and let $T=S\setminus\{s_0\}$. Let $\newcommand\F{\mathcal F}\F_T$ denote the family of classifiers obtained when the classifiers in $\mathcal F_S$ are restricted to $T$. It might be the case that $|\F_T|<|\F_S|$. To be precise, let $k$ be the number of pairs of classifiers $\{f,g\}\subseteq \F_S$ for which $f(t)=g(t)$ for all $t\in T$, but $f(s_0)\neq g(s_0)$. Then $|\F_T|=|\F_S|-k$, because each pair has the same restriction on $T$.

By induction, we know that $\mathcal \F_T$ shatters at least $|\F_T|=|\F_S|-k$ subsets of $T$. $\mathcal \F_S$ shatters these same subsets, viewed as subsets of $S$. We will be done if we can show $\F_S$ shatters at least $k$ additional subsets. To see this, let $\F_T'\subseteq \F_T$ denote the $k$ classifiers on $T$ which were mapped to by the $k$ pairs of classifiers of $S$ that agree on $T$. By induction, we also know that $\F_T'$ shatters at least $k$ subsets of $T$. If we call let $A_1,\dots,A_k$ be the subsets of $T$ shattered by $\F_T'$, then it is easy to show that $\F_S$ also shatters the $k$ subsets of the form $$ A_1\cup \{s_0\},\dots,A_k\cup \{s_0\} $$ This bring the total number of shattered subsets to $|\F_S|$, completing the proof.