Two sets can be separated by a hyperplane iff their convex hulls do not intersect

4.2k Views Asked by At

So I am supposed to prove that two sets of points in $\mathbb{R}^n$ are linearly separable in an n-dimensional space if and only if their convex hulls do not intersect.

I understand the concept visually in 2 and 3 space, it's just that I'm confusing myself in circles on the proof. I'm trying to prove by contradiction. Here is what I have:

Suppose there are two sets of data points in $\mathbb{R}^n$ of size $N$, between finite points $a_j<b_j$ and $c_j<d_j$: $$ \mathcal{S}_1: \{ p_i \big|\forall j,i,k: a_j < p_i < b_k\}, \mathcal{S}_2: \{ h_i \big|\forall j,i,k: c_j < h_i < d_k \} $$

With convex hulls $\mathcal{C}_1$,$\mathcal{C}_2$ such that: $$ \mathcal{C}_1: \bigg\{\sum_{i=1}^N \omega_i p_i \bigg| \forall i: \omega_i > 0 \wedge \sum_{i=1}^N \omega_i =1\bigg\}, \mathcal{C}_2: \bigg\{\sum_{i=1}^N \sigma_i h_i \bigg| \forall i: \sigma_i > 0 \wedge \sum_{i=1}^N \sigma_i =1\bigg\} $$

Which simplifies to: $$ \mathcal{C}_1: \{ \hat p \big|\forall i,\hat p,k: a_i < \hat p < b_k\}, \mathcal{C}_2: \{ \hat h \big|\forall i,\hat h,k: c_i < \hat h < d_k \} $$

Where: $$ \mathcal{C}_1 \cap \mathcal{C}_2 \neq \emptyset \wedge \bigg(\exists f(x) \big| f(x) = \sum_{n=1}^N \lambda_n x_n \wedge \forall \hat p,x,\hat h: \hat p < f(x) < \hat h \bigg) $$

It follows from $\mathcal{C}_1 \cap \mathcal{C}_2 \neq \emptyset $ that: $$ \exists \mathcal{D} \big| \mathcal{D}\subset\mathcal{C}_1,\mathcal{C}_2 \equiv \exists \hat p, \hat h \big| \hat p = \hat h $$

After this point I'm not sure what to do, I've tried converting it to vector math, but all I get is $p$ = $h$, and I don't know how to translate it to a contradiction. It's been a while since I've done any proof work, and I'm quite rusty. Is this heading in the right direction or should I trash it? Are my base assumptions correct? I would prefer direction, rather then answers because I want to pass this class.

update

I think i figured out my problem. I was defining the boundary with: $$ \forall \hat p,x,\hat h: \hat p \leq f(x) \leq \hat h $$

But I realize that if the hulls have $x\in \mathcal{C}_1, \mathcal C_2$ they cannot have a boundary that separates a point from itself. So the boundary equation must be: $$ \forall \hat p,x,\hat h: \hat p < f(x) <\hat h $$

1

There are 1 best solutions below

3
On

I don't know how to fix your proof, but I think you are overformalizing things (all these sums, etc.). Below there's an alternate take. (Note: I assume that $C_1$ and $C_2$ are convex hulls of finite sets of points.)

If two convex hulls are lineary separable, then it means that there is a half-space $H$ such that $C_1 \subset H$ and $C_2 \subset H^c$. Obviously $C_1 \cap C_2 = \varnothing$ because $H \cap H^c = \varnothing$.

The other direction is more tricky. Let $d$ be the distance between $C_1$ and $C_2$, i.e. $$d = \inf_{p_1 \in C_1,\ p_2 \in C_2}|p_1 - p_2|.$$ We are in $\mathbb{R}^n$ so $C_1$ and $C_2$ are compact, and thus bounded and closed. Therefore there exists $p_1$ and $p_2$ such that $|p_1 - p_2| = d$. The convex hulls do not intersect, so $d > 0$. Let $\vec{v} = p_1 - p_2$. We will show that the unique hyperplane $P \perp \vec{v}$ such that $p_0 = p_2 + \frac{\vec{v}}{2} \in P$ separates $C_1$ and $C_2$. In order to do so, we will show that $\mathrm{dist}(C_i,P) = \frac{d}{2}$.

Proof by contradiction. Fix any $i \in \{1,2\}$. Because of existence of $p_i$ we know that $\mathrm{dist}(C_i,P) \leq \frac{d}{2}$. Let's assume that there exists $q_i$ such that $q_i \in C_i$ and $\mathrm{dist}(q_i,P) < \frac{d}{2}$. Convexity of $C_i$ implies that whole line $L_i = p_iq_i \subset C_i$, however,

  • $L_i$ is not parallel to $P$,
  • $p_i \in L_i$.

Therefore, $L$ is not tangent to ball $B = B(p_0, \frac{d}{2}) \ni p_i$ with center at $p_0$ and radius $\frac{d}{2}$. Hence, there exists $r_i \in L_i \cap B$ such that $\mathrm{dist}(r_i,p_0) < \frac{d}{2}$, so by triangle inequality we have $\mathrm{dist}(r_i,p_{3-i}) \leq \mathrm{dist}(r_i,p_0) + \mathrm{dist}(p_0,p_{3-i}) < d$, contradiction.

I think you should be able to use those ideas in your proof, let me know how it goes ;-)