How to show equivalence between two programs?

77 Views Asked by At

Let

$$A := \left\{ (x_1,x_2,x_3) \in \mathbb{R}^3 \mid x_1+x_2+x_3 = 1 \right\}$$

and suppose that we want to minimize a function $J : \mathbb{R}^{3} \to \mathbb{R}$ subject to the constraint $y \in A.$

A version of gradient descent is to start with a guess $y_0 \in \mathbb{R}^3$ and then iterate as follows

$$y_{i+1} = \Pi_{A} \left( y_{i}-\lambda_i \nabla J(y_i) \right)$$

where $\Pi_{A}$ is the usual projection onto plane $A$. However, note that if we start with a point $y_0 \in A$, then the point

$$y_1 = y_{0}-\Pi_{B}[\lambda_0 \nabla J(y_0)] \in A$$

where

$$B := \left\{ (x_1,x_2,x_3) \in \mathbb{R}^3 \mid x_1+x_2+x_3 = 0 \right\}$$

This is because the sum of components of the vector $y_0$ is $1$ while the sum of the components from the other vector (which is a projection of the gradient) is $0$ and so resultant sum of components is still $1$. So $y_1 \in A$.

We can thus show by induction that $y_i \in A$ for all $i\geq 0$. I now want to argue that this update rule

$$y_{i+1} = y_{i}-\Pi_{B}[\lambda_i\nabla J(y_i)]$$

leads to the same answer as the one mentioned earlier. From our observation, we can say that

$$y_{i+1} = y_{i}-\Pi_{B}[\lambda_i\nabla J(y_i)] = \Pi_{A}( y_{i}-\Pi_{B}[\lambda_i\nabla J(y_i)])$$

So I guess that maybe we want to show that

$$\Pi_A(\Pi_B[\nabla J(y_i)])= \Pi_A(\nabla J(y_i))$$

but I am not sure if this is the right thing to do. Any suggestions would be much appreciated.

2

There are 2 best solutions below

1
On

In this case, it's best to use Lagrange Multipliers. For the function $f(x) = x_1 + x_2 + x_3 - 1$, we have $\nabla f = (1,1,1)$. A local maximum or minimum of $J(y)$ occurs at $\nabla J(y) = \lambda \nabla f(y) = (\lambda,\lambda,\lambda)$. So in order to find the minimal value of $J$, we simply solve for all $y$ such that all components of $\nabla J$ at $y$ are the same, and then compute each of these values to find which one is minimal.

0
On

Consider $y_0 \in A$, let $g_0 = \lambda_0 \nabla J (y_0)$, let $e$ be the all one vector.

We know the formula for projection onto $A$ is:

$$\Pi_A(z) = z- \left( \frac{z^Te-1}{3}\right)e$$

Also, the formula for projection onto $B$ is :

$$\Pi _B (z) = z- \left( \frac{z^Te}{3}\right)e$$

Note that we have

\begin{align}\Pi_A(y_0-g) &= (y_0-g_0)- \left( \frac{(y_0-g_0)^Te-1}{3}\right)e \\ &=(y_0-g_0)-\left( \frac{1-g_0^Te-1}{3}\right)e\\ &= y_0-g_0+\left( \frac{g_0^Te}{3}\right)e\\ &= y_0-\left( g_0-\left( \frac{g_0^Te}{3}\right)e\right)\\ &=y_0 - \Pi_B(g_0)\end{align}

Hence starting from the same $y_0\in A$, the two schemes will end up with the same $y_1$.

Hence, by induction, the two sequence would be identical.