Minimum distance from a point to the hyperplane (using Kuhn-Tucker conditions)


Find the minimum Euclidean distance between a point $x_0 \in \mathbb{R}^{n}$ and the hyperplane $$\left\{x \in \mathbb{R}^{n} : w^{T}x=b\right\},$$ where $w \in \mathbb{R}^{n}$ and $b \in \mathbb R$.

I must formulate the problem as an optimization problem with constraints and deduce the Kuhn-Tucker conditions for this formulation.


According to your notation, you can tackle the problem trying to minimize the objective function $f(x) :=(x-x_0)^T (x-x_0)$, i.e., the square of the Euclidean distance between $x$ and $x_0$, subject to the equality constraint $g(x) := w^Tx - b = 0$, i.e., $x$ belongs to the hyperplane.

The Kuhn-Tucker formulation of this problem is specified by the following equations: $$(i)\;\; \nabla f(x) = \lambda\nabla g(x);$$ $$(ii)\;\; g(x) = 0;$$ where $\nabla h(x) = (\partial h(x)/\partial x_1,\ldots,\partial h(x)/\partial x_n)\in\mathbb{R}^n$ denotes the gradient of a function $h:\mathbb{R}^n\to\mathbb{R}$ and $\lambda\in\mathbb{R}$ is a scalar (the Lagrange multiplier).

Whenever $x^*$ is a local optimum, it must fulfill these equations. While equation $(ii)$ is just the equality constraint, the rationale behind the equation $(i)$ can be understood as follows:

If $x^*$ fulfills equation $(i)$, then $\nabla f(x^*)$ and $\nabla g(x^*)$ are collinear. Since $\nabla g(x^*)$ is normal to the hypersurface $G$ of points satisfying the constraint $g(x)=0$, this means that the direction in which $f$ changes fastest at $x^*$ (i.e., its gradient at $x^*$) is perpendicular to $G$ at $x^*$, so $f$ has no first-order variation along $G$ there. At a local optimum $x^*$ this must hold, given sufficient regularity conditions, e.g., $f$ and $g$ continuously differentiable at $x^*$.

It can be readily seen that $\nabla f(x) = 2(x-x_0)$ and $\nabla g(x) = w$. Consequently, the Kuhn-Tucker equations of the problem can be written as: $$(i)\;\; 2(x-x_0) = \lambda w;$$ $$(ii)\;\; w^T x = b.$$

Solving for $x$ in equation $(i)$ yields $x=\frac{\lambda}{2} w + x_0$. Plugging this expression for $x$ into equation $(ii)$ leads to $$\frac{\lambda}{2}w^Tw + w^Tx_0 = b,$$ which is then solved for $\lambda$, $$\lambda = \frac{2}{w^Tw}(b - w^Tx_0).$$ Finally, plugging this value of $\lambda$ back into the expression for $x$ yields the unique local optimum $x^*$, $$x^*= x_0 + \frac{b - w^Tx_0}{w^Tw}\,w.$$
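As a quick numerical sanity check, here is a minimal sketch in plain Python (the values of $w$, $b$, and $x_0$ below are arbitrary, chosen only for illustration) that computes $\lambda$ and $x^*$ from the formulas above and verifies that $x^*$ lies on the hyperplane:

```python
# Closed-form projection of x0 onto the hyperplane {x : w^T x = b},
# via lambda = 2 (b - w^T x0) / (w^T w) and x* = x0 + (lambda / 2) w.
# Example values are arbitrary, for illustration only.

def dot(u, v):
    return sum(ui * vi for ui, vi in zip(u, v))

w = [3.0, -1.0, 2.0]
b = 5.0
x0 = [1.0, 2.0, -1.0]

lam = 2.0 * (b - dot(w, x0)) / dot(w, w)          # multiplier from (i) and (ii)
x_star = [x0i + (lam / 2.0) * wi for x0i, wi in zip(x0, w)]

print(abs(dot(w, x_star) - b) < 1e-12)  # x* satisfies the constraint: True
```

Note that $x^* - x_0 = \frac{\lambda}{2} w$ is a multiple of $w$, so the displacement from $x_0$ to $x^*$ is indeed perpendicular to the hyperplane, as the geometric argument above requires.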

Finally, the minimum distance is given by the distance between the local optimum $x^*$ and $x_0$, i.e., $$d(x^*,x_0) = \sqrt{(x^*-x_0)^T (x^*-x_0)} = \left|\frac{b - w^Tx_0}{w^Tw}\right|\sqrt{w^Tw} = \frac{\left|b - w^Tx_0\right|}{\sqrt{w^Tw}}.$$
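The distance formula can also be checked empirically. The sketch below (plain Python, same illustrative values of $w$, $b$, and $x_0$ as before) samples random points on the hyperplane and confirms that none of them is closer to $x_0$ than the closed-form minimum:

```python
import math
import random

def dot(u, v):
    return sum(ui * vi for ui, vi in zip(u, v))

w = [3.0, -1.0, 2.0]
b = 5.0
x0 = [1.0, 2.0, -1.0]

# Closed-form minimum distance |b - w^T x0| / sqrt(w^T w) from the derivation.
d_min = abs(b - dot(w, x0)) / math.sqrt(dot(w, w))

# Sample points on the hyperplane by fixing the first two coordinates and
# solving for the third (valid here since w[2] != 0); each must be at least
# d_min away from x0.
random.seed(0)
ok = True
for _ in range(1000):
    x1, x2 = random.uniform(-10, 10), random.uniform(-10, 10)
    x3 = (b - w[0] * x1 - w[1] * x2) / w[2]
    dist = math.sqrt((x1 - x0[0]) ** 2 + (x2 - x0[1]) ** 2 + (x3 - x0[2]) ** 2)
    ok = ok and dist >= d_min - 1e-9

print(ok)  # no sampled point on the hyperplane beats the closed-form minimum: True
```

Of course, the sampling only probes finitely many points; the Kuhn-Tucker argument above is what establishes that $x^*$ is the global minimizer (the objective is strictly convex and the constraint is affine, so the unique stationary point is the global minimum).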