Does anyone have know or have a link to a proof of why the VC-Dimension of Linear Classifiers in $\mathbb{R}^n$ is $n+1$? That is the set of $h_a : \mathbb{R}^n \rightarrow \{-1,1\}, h_a(b) = sgn(a \cdot b + k)$ where $a,b \in \mathbb{R}^n, k \in \mathbb{R}$. I tried some Google Fu and I've only seen people state it but not bother to prove it. The proof of $d \ge n+1$ is pretty straightforward but I can't figure out the $d < n+2$ part.
VC-Dimension of Real Linear Classifier Proof
5.7k Views Asked by Math is Hard https://math.techqa.club/user/math-is-hard/detail AtThere are 4 best solutions below
You can make use of the Radon's lemma which states that for every $A\subseteq R^n$ if $|A|\geq n+2$ there exists $B\subseteq A$ such that $convex\;hull(B) \cap convex\;hull(A\setminus B)\not = \emptyset$. Now pick any set in your problem $A\subseteq R^n$ of size $\geq n+2$. Let $B\subseteq A$. Assume by contradiction that $h$ is a linear half space that covers $B$ but no point in $A\setminus B$. Now clearly $convex \; hull(B)\subseteq h$ ( The intuition is since $h$ covers $B$ it will also cover the line connecting any two points in $B$). So we have $convex\; hull (A\setminus B)\subseteq \bar h$. Clearly we have that $h\cap \bar h = \emptyset$ but by Radon's lemma $convex\;hull(B) \cap convex\;hull(A\setminus B) \subseteq h\cap \bar h \not = \emptyset \rightarrow$ a contradiction.
You can take a look at the slides of Lecture 7 of Prof. Yaser Abu-Mostafa's Learning from Data course located at here. A very detailed proof and explanation are given there.
For the $ d < n + 2 $ part I know one proof which relays on the Steele Dudley Theorem.
Unfortunately, the only relevant online reference for this theorem that I managed to find is in Hebrew. But here it is just in case.
The Theorem States:
If $H$ is an hypothesis class where $Dim(H)=n$, then for every series of $n+1$ inputs $\{x_i\}_1^{n+1} $ , we can find a series of $n+1$ binary classifications ($\pm1$) $\{y_i\}_1^{n+1} $ s.t. the series $\{x_i\}_1^{n+1} $ cannot be classified perfectly by any classifier of the form: $sign(h(x))$.
In your case $H$ is all linear classifiers which are determined by $n$ weights and 1 offset. so $Dim(H)=n+1$.
additional reference