Using Farkas' Lemma in proving that convex hulls of two data sets intersect if they are not linearly separable

506 Views Asked by At

In wikipedia, Farkas' lemma is as follows:

Let $A \in \mathbb{R}^{m\times n}$ and $b \in \mathbb{R}^n$. Then exactly one of the following statement is true:

  1. There exists $Y \in \mathbb{R}^n$ such that $AY = b$ and $Y\ge 0$.
  2. There exists $X \in \mathbb{R}^m$ such that $A^TX \ge 0$ and $b^TX \le 0$.

I am reading about linear separation here. In the last paragraph on page 2, it said that

Then, by Farkas’ lemma, if (3) is not feasible, i.e. if there does not exist $X$ such that $AX\ge U$, then there exists $Y$ such that $Y^TA = 0$, $Y^TU > 0$ and $Y\ge0$.

Below is equation (3): \begin{align*} \begin{cases} \langle a, x_i \rangle + b \ge 1&\text{if }y_i = 1\\ \langle a, x_i \rangle + b \le -1&\text{if }y_i = -1 \end{cases} \end{align*} Let $I = \{i| y_i = 1\}$ and $J = \{j| y_j = - 1\}$. In the matrix form, linear inequalities of (3) is equivalent to $AX \ge U$, where $A$ is $p \times (n+1)$ matrix whose $k-$th row is $[x_k^T\ 1]$ if $k \in I$ and $[-x_k^T\ -1]$ if $k \in J$, X is the concatenation of $a \in \mathbb{R}^n$ and $b\in \mathbb{R}$, and $U \in \mathbb{R}^p$, whose entries are all ones.

I do not get how "there does not exist $X$ such that $AX\ge U$" implies "there exists $Y$ such that $Y^TA = 0$, $Y^TU > 0$ and $Y\ge0$". Particularly, I do not understand how they get $Y^TA = 0$.

Note: I switched A and X in the second quoted paragraph above so that it corresponds with Farkas' Lemma.

1

There are 1 best solutions below

2
On BEST ANSWER

Take the generalized Farkas' lemma with a cone $C$ and its dual cone $C^*$:

  1. There exists $x \in \mathbb{R}^n$ such that $Ax = b$ and $x \in C$.
  2. There exists $y \in \mathbb{R}^m$ such that $A^T y \in C^*$ and $b^Ty < 0$.

Your $Ax \geq b$ falls in the first case by adding a slack variable: $Ax - s = b$, with $(x,s) \in C$ and $C = \mathbb{R}^{n+1} \times \mathbb{R}^k_+$. Since $C^*=\{0\}^{n+1} \times \mathbb{R}^k_+$, the second statement is: there exists $y \in \mathbb{R}^k$ such that $A^Ty=0$, $-y\geq 0$, $b^Ty<0$.