Find $\ell_2$ shortest vector in simplex (related to simplex projection)

180 Views Asked by At

Suppose that I have an arbitrary $n$-dimensional simplex in a real vector space, and I want to find the $\ell_2$-shortest vector in the simplex. Is there some simple way to do so?

There are two equivalent ways to characterize this problem: as a minimization of an arbitrary positive-definite quadratic on the "canonical simplex" $C_n$ that is the convex hull of $(1,0,0,...,0), (0,1,0,...,0), ..., (0,0,0,...,1)$, or as a minimization of the standard $\ell_2$ norm on an arbitrary simplex which is a simple linear transformation of the above.

I believe this is also equivalent to computing an oblique projection onto the canonical simplex. This paper is often cited for orthogonal projections onto the simplex, but I'm not certain if you can use the technique for oblique projections as well.

These are the two versions of this:

1. Finding the shortest $\ell_2$ norm on an arbitrary simplex

In this version, we look at an arbitrary simplex $\Delta_n$ in $\Bbb R^n$, which is some linear transformation of $C_n$. We want to find the $\ell_2$ shortest vector in the simplex.

As a first pass, we can look at the affine subspace $A_n$ that is the affine span of $\Delta_n$. Then, we can look for the shortest vector in that subspace. This is a fairly standard use of the Moore-Penrose pseudoinverse, and the answer is given by

$\text{argmin } ||v||: v \in A_n = (I_n - M^\dagger M)\cdot a$

where $M$ is any matrix whose columns span the unique subspace that is parallel to $A_n$ and of same dimension, $M\dagger$ is the pseudoinverse of $M$, and $a$ is any vector in $A_n$. This is basically a orthogonal projection matrix that maps every vector in $A_n$ to the same point, which is the unique shortest vector in $A_n$.

The issue is, this shortest vector need not also be in the simplex $S_n$! So we would need some way to then do a second projection onto the nearest face/edge/etc of the simplex. We could do this as a projection onto the affine subspace spanned by that face, and then repeat until we are in the simplex. But how do we identify the nearest face?

This naturally leads to the second version of the problem:

2. Minimization of positive-definite quadratic form on canonical simplex

Basically, we can change the coordinate system of the last version of the problem so that the simplex is always the canonical simplex $C_n$, with vertices given by $(1,0,0,...0), (0,1,0,...)$, and so on. If we do this, then the inner product is no longer the standard Euclidean inner product, but will instead have some transformed unit ball. This can be thought of as a positive-definite quadratic form that we want to minimize on the canonical simplex.

After changing the coordinate system, our new inner product will be determined entirely by some symmetric matrix $S$:

$Q(v) = v^T S v = \frac{1}{2} v^T H v$

where $H$ is the Hessian for the quadratic form, which is given in closed form simply as $H = 2S$.

We then want to get the following minimum:

$\text{argmin } v^T S v: v \in C_n$

We can easily use the technique from #1 to get the related vector

$\text{argmin } v^T S v: v \in A_n$

where $A_n$ is the affine span of $C_n$.

However, as in #1, this vector need not be on the canonical simplex, just the affine span of it. To get the shortest vector, we need to find the shortest path back to the simplex.

My thought

Since this is a simple, positive-definite quadratic form, we have literally the entire Hessian (which is the same everywhere), and can get the global minimum on $A_n$ easily using the pseudoinverse. Then, the only issue is that we need to find the shortest path back to the simplex. It seems as though we need to look at the path of "least steep ascent" or something, but also make sure it is in a direction that is guaranteed to intersect with the simplex.

So we have a simple closed form for the Hessian, all the gradients, and the simplex. There are many convex minimization routines that use the Hessian, and this is literally the simplest Hessian there is - a positive-definite quadratic. How can we use this information to look for the path of least steep ascent in the direction of the simplex?