Choquet's and Birkhoff's theorem for min-max discrete optimization problems

293 Views Asked by At

Some context first:

A very standard linear problem (LP) is to find $$\inf_P \left( \sum_{1 \leq i,j \leq n} P_{ij} C_{ij}^p \right)^\frac{1}{p}$$ under the constraint that $P$ is doubly stochastic (which forms a convex subset of $n \times n$ matrices).

This for example models a notion of optimal transport between two point clouds with $n$ points, where $C_{ij}$ represent a distance between points $i$ and $j$ and we optimize in the $p$-norm and $P_{ij}$ is the fraction of mass taken from point $i$ and transported to point $j$.

As this is a LP, Choquet's theorem states that some extremal points of the convex set of contraints set are optimal, and Birkhoff's theorem states that these are permutation matrices. It's nice, it means that our optimal "strategies" are one-to-one matchings (more precisely, there are one-to-one matchings that are optimal).

My first question: Consider $p \rightarrow +\infty$ ? The problem becomes $$\inf_P \left\{ \max_{i,j | P_{ij} > 0} C_{ij} \right\}$$ under the same constraint that $P$ is still doubly stochastic.

Basically, my question is: does the Choquet's theorem still hold? Then using Birkhoff theorem I can conclude that there are one-to-one matchings that are optimal.

It seems the result is true (at least the conclusion) by looking at remark 3.6 in http://users.jyu.fi/~peanju/preprints/maxtransport.pdf but the reference [17] given there is a 1926 paper and I fail to see the obvious link with our original problem.

My second question: what if we have countably infinitely many points in the $p = +\infty$ case? It seems the Birkhoff's theorem remains true for "infinite" doubly stochastic matrices https://arxiv.org/pdf/math/0512451.pdf , so once again it's all about the Choquet's theorem:

  • For finite $p$, is it true that in "infinite discrete" LP , some extremal points of the constraint polytope (here permutations) belong to the set of optimal solutions? (that might be well known I guess, just need a ref).
  • Does it remain true in the $p = +\infty$ case?

I am not expert in optimization/linear programming, so I may miss some terminology (I am pretty sure that problems of the type $\inf \max (..)$ do have a name...). Do not hesitate to give me some bibliography about these.

2

There are 2 best solutions below

1
On BEST ANSWER

Here is a result that holds for certain $C$ matrices.


Let $A$ be the set of all doubly stochastic matrices $(P_{ij})$ with countably infinite size. Let $B\subseteq A$ be the set of all countably infinite permutation matrices. Fix $C=(C_{ij})$ with $C_{ij}\geq 0$ for all $i,j \in \{1, 2, 3, ...\}$. We want to know if the following holds: $$ \inf_{P \in A} \sum_{ij} P_{ij}C_{ij} = \inf_{P \in B} \sum_{ij} P_{ij}C_{ij} \quad \mbox{(Eq. 1)} $$ Since $B \subseteq A$ we clearly always have: $$ \inf_{P \in A} \sum_{ij} P_{ij}C_{ij} \leq \inf_{P \in B} \sum_{ij} P_{ij}C_{ij} \quad \mbox{(Eq. 2)} $$

Claim:

The desired equality (1) holds whenever $(C_{ij})$ satisfies \begin{align} &\sum_{i=1}^{\infty} C_{ii} < \infty\\ &\lim_{a\rightarrow\infty} \sum_{i=1}^{a} C_{a+i,i} = 0\\ &\lim_{a\rightarrow\infty} \sum_{i=1}^{a} C_{i, a+i} = 0 \\ &\lim_{a\rightarrow\infty} \sum_{i=a+1}^{2a} \sum_{j=a+1}^{2a} C_{ij}=0 \end{align}

Proof:

Case 1: Suppose $\inf_{P \in A} \sum_{ij}P_{ij}C_{ij}=\infty$. Then (2) implies $\inf_{P \in B} \sum_{ij}P_{ij}C_{ij}=\infty$ and the result holds.

Case 2: Suppose $\inf_{P \in A} \sum_{ij}P_{ij}C_{ij}<\infty$. Define $M=\inf_{P \in A} \sum_{ij}P_{ij}C_{ij}$. By (2) it suffices to show that $$ \inf_{P \in B} \sum_{ij} P_{ij}C_{ij} \leq M$$ To this end, fix $\epsilon>0$. Then there exists a doubly stochastic matrix $P^*=(P_{ij}^*)$ that satisfies $$ \sum_{ij} P_{ij}^*C_{ij} \leq M + \epsilon \quad \mbox{Eq. (3)}$$ By the assumptions on matrix $C$, there is a positive integer $a$ such that \begin{align} &\sum_{i=a+1}^{\infty} C_{ii} \leq \epsilon \quad \mbox{Eq. (4)}\\ &\sum_{i=1}^{a} C_{a+i,i} \leq \epsilon \quad \mbox{Eq. (5)}\\ &\sum_{i=1}^{a} C_{i,a+i} \leq \epsilon \quad \mbox{Eq. (6)} \\ &\sum_{i=a+1}^{2a}\sum_{j=a+1}^{2a} C_{ij} \leq \epsilon \quad \mbox{Eq. (7)} \end{align} Define the $2a \times 2a$ matrix $R$ by \begin{matrix} P_{11}^* & P_{12}^* & P_{13}^* & \cdots & P_{1a}^* & * & 0 & 0& \cdots & 0\\ P_{21}^* & P_{22}^* & P_{23}^*&\cdots & P_{2a}^* & 0 & * & 0& \cdots & 0\\ P_{31}^* & P_{32}^* & P_{33}^* & \cdots & P_{3a}^* & 0 & 0 & * & \cdots & 0 \\ \cdots \\ P_{a1}^* & P_{a2}^* & P_{a3}^* & \cdots & P_{aa}^* & 0 & 0 & 0 & \cdots & *\\ * & 0 & 0 & \cdots & 0 & * & * & * & \cdots & * \\ 0 & * & 0 & \cdots & 0 & * & * & * & \cdots & * \\ 0 & 0& * & \cdots & 0 & * & * & * & \cdots & * \\ \cdots \\ 0 & 0& 0 & \cdots & * & * & * & *& \cdots & * \\ \end{matrix} We can fill in the * entries greedily to ensure that $R=(R_{ij})$ is a doubly stochastic matrix of size $2a \times 2a$. It follows that $R$ is a convex combination of $2a \times 2a$ permutation matrices. In particular, there must be a permutation matrix $G$ of size $2a \times 2a$ such that $$ \sum_{i=1}^{2a}\sum_{j=1}^{2a} G_{ij} C_{ij} \leq \sum_{i=1}^{2a}\sum_{j=1}^{2a} R_{ij}C_{ij} \quad \mbox{(Eq. 8)} $$ Now define $Z$ as the countably infinite permutation matrix that contains $G$ in the upper left block, and the identity map in the lower right block: $$ Z = \left[\begin{matrix} G & 0 \\ 0 & I \end{matrix} \right]$$ Since $Z \in B$ we have \begin{align} \inf_{P \in B} \sum_{ij} P_{ij} C_{ij} &\leq \sum_{ij} Z_{ij}C_{ij} \\ &\overset{(a)}{=} \sum_{i=1}^{2a} \sum_{j=1}^{2a} G_{ij}C_{ij} + \sum_{i=2a+1}^{\infty} C_{ii} \\ &\overset{(b)}{\leq} \sum_{i=1}^{2a} \sum_{j=1}^{2a} R_{ij}C_{ij} + \sum_{i=2a+1}^{\infty} C_{ii} \\ &= \sum_{i=1}^a\sum_{j=1}^a P_{ij}^* C_{ij} + \sum_{i=1}^{a} R_{i,a+i}C_{i,a+i} + \sum_{i=1}^{a} R_{a+i, i} C_{a+i,i} + \sum_{i=a+1}^{2a}\sum_{j=a+1}^{2a} R_{ij}C_{ij} + \sum_{i=2a+1}^{\infty} C_{ii} \\ &\overset{(c)}{\leq} \sum_{i=1}^{\infty}\sum_{j=1}^{\infty} P_{ij}^*C_{ij} + \sum_{i=1}^{a}C_{i,a+i} + \sum_{i=1}^a C_{a+i,i} + \sum_{i=a+1}^{2a}\sum_{j=a+1}^{2a} C_{ij}+ \sum_{i=2a+1}^{\infty} C_{ii} \\ &\overset{(d)}{\leq} M + 5\epsilon \end{align} where (a) holds by definition of $Z$; (b) holds by (8); (c) holds because $0\leq R_{ij}\leq 1$, $P_{ij}^*\geq 0$, $C_{ij}\geq 0$ for all $i, j$; (d) holds by (3)-(7). Hence, for all $\epsilon>0$ we have $$ \inf_{P \in B} \sum_{ij} P_{ij}C_{ij} \leq M + 5\epsilon $$ It follows that $$ \inf_{P \in B} \sum_{ij} P_{ij}C_{ij} \leq M $$ $\Box$

10
On

This response gives some related results that can be used to answer the first question (finite-sized matrices) and gives (limited) insight on the second (infinite-sized matrices).


Fix a matrix size for $(C_{ij})$ (possibly countably infinite). Assume $C_{ij} \geq 0$ for all $(i,j)$.

Let $A$ be the set of doubly-stochastic matrices $(P_{ij})$ with the same size as $(C_{ij})$. Let $\tilde{A}\subseteq A$ be a set of doubly stochastic matrices such that any matrix $P \in A$ can be formed as a finite or countably infinite probabilistic mixture of matrices in $\tilde{A}$. That is, every $P \in A$ can be written $$ P = \sum_{k=1}^{\infty} \theta^{(k)}P^{(k)}$$ where $\theta_1, \theta_2, …$ are some nonnegative values that sum to 1, and $P^{(k)} \in \tilde{A}$ for all $k \in \{1, 2, 3, …\}$.

For example, if $A$ is the set of all $n \times n$ doubly stochastic matrices (finite-sized matrices), then we can take $\tilde{A}$ as the set of all $n \times n$ permutation matrices.

$p<\infty$ problem:

We can always change the order of a finite or countably infinite sum of nonnegative numbers. So for any $P \in A$ we have $$ P = \sum_{k=1}^{\infty} \theta^{(k)}P^{(k)} $$ so \begin{align*} \sum_{ij} P_{ij}C_{ij}^p &= \sum_{ij}(\sum_{k=1}^{\infty} \theta^{(k)}P_{ij}^{(k)})C_{ij}^p \\ &= \sum_{k=1}^{\infty} \theta^{(k)}\left[ \sum_{ij} P_{ij}^{(k)}C_{ij}^p \right] \\ &\geq \inf_{k \in \{1, 2, 3, …\}} [\sum_{ij}P_{ij}^{(k)}C_{ij}^p] \\ &\geq \inf_{P \in \tilde{A}} [\sum_{ij}P_{ij}C_{ij}^p] \end{align*} Taking the $1/p$ power and then the infimum of both sides gives: $$ \inf_{P \in A} [\sum_{ij} P_{ij}C_{ij}^p ]^{1/p}\geq \inf_{P \in \tilde{A}} [\sum_{ij}P_{ij}C_{ij}^p]^{1/p} $$ On the other hand, since $A\supseteq \tilde{A}$ the reverse inequality holds and so $$ \inf_{P \in A} [\sum_{ij} P_{ij}C_{ij}^p ]^{1/p}= \inf_{P \in \tilde{A}} [\sum_{ij}P_{ij}C_{ij}^p]^{1/p} $$ So we can restrict attention to doubly stochastic matrices in the set $\tilde{A}$. Further, if the infimum is achievable by a matrix $P^* \in A$, then a similar argument implies that it is also achievable by a matrix $\tilde{P}^* \in \tilde{A}$.

Case of $p=\infty$:

If $P \in A$ and $P = \sum_{k=1}^{\infty} \theta^{(k)} P^{(k)}$ for some matrices $P^{(k)} \in \tilde{A}$ and some nonnegative weights that sum to $1$ then for each $k \in \{1, 2, 3, ...\}$ such that $\theta^{(k)}>0$ we have: $$ \{(i,j) : P_{ij}>0\} \supseteq \{(i,j) : P_{ij}^{(k)}>0\} \implies \sup_{(i,j) : P_{ij}>0} C_{ij} \geq \sup_{(i,j) : P_{ij}^{(k)}>0} C_{ij} $$ so we can use this to again conclude $$ \inf_{P \in A} [ \sup_{(i,j):P_{ij}>0} C_{ij}] = \inf_{P \in \tilde{A}} [ \sup_{(i,j):P_{ij}>0} C_{ij}] $$

Can we let $\tilde{A}$ be the set of all permutation matrices?

For finite-sized matrices, the answer is "yes" since the convex hull of the set of all $n\times n$ permutation matrices is the set of all $n \times n$ doubly stochastic matrices.

It turns out that, for countably-infinite sized matrices the answer is "no": There are doubly stochastic matrices $P=(P_{ij})$ that cannot be written as $P=\sum_{k=1}^{\infty} \theta^{(k)}P^{(k)}$ for some nonnegative values $\theta^{(k)}$ that sum to 1 and some permutation matrices $P^{(k)}$. To see why, suppose we have a matrix $(R_{ij})$ of the form $(R_{ij})=\sum_{k=1}^{\infty} \theta^{(k)}P^{(k)}$. Fix an index $m$ such that $\theta^{(m)}>0$. Then $(R_{ij}) \geq \theta^{(m)}P^{(m)}$, with inequality taken entrywise. Since $P^{(m)}$ is a permutation matrix, each row of $P^{(m)}$ contains exactly one $1$ entry. Thus, each row of $(R_{ij})$ has an entry with value at least $\theta^{(m)}$. So $\inf_{i} [\sup_j R_{ij}] \geq \theta^{(m)}>0$. But we can find doubly-stochastic matrices such that $\inf_{i} [\sup_j R_{ij}] =0$ (and so they cannot be written in the desired form). Indeed consider: \begin{matrix} \frac{1}{2} & \frac{1}{2} & 0 & 0 & 0 & 0 & 0 & 0 & \cdots \\ 0 & 0 & \frac{1}{2} & \frac{1}{2} & 0 & 0 & 0 & 0& \cdots\\ \frac{1}{4} & \frac{1}{4} & \frac{1}{4} & \frac{1}{4} & 0 & 0 & 0 & 0& \cdots\\ 0 & 0 & 0 & 0 & \frac{1}{2} & \frac{1}{2} & 0 & 0 & \cdots\\ 0 & 0 & 0 & 0 & 0 & 0 & \frac{1}{2} & \frac{1}{2} & \cdots\\ 0 & 0 & 0 & 0 & \frac{1}{4} & \frac{1}{4} & \frac{1}{4} &\frac{1}{4}& \cdots\\ \frac{1}{8} & \frac{1}{8} & \frac{1}{8} & \frac{1}{8} & \frac{1}{8} & \frac{1}{8} & \frac{1}{8} & \frac{1}{8}& \cdots \end{matrix}

So for each integer $n>0$, there exists a row that has $1/2^{n}$ on the first $2^n$ entries, and 0s after:
$$ [\underbrace{\frac{1}{2^n} \: \frac{1}{2^n} \: \cdots \: \frac{1}{2^n}}_{2^n \mbox{ times}} \: 0 \: 0 \: 0 ...]$$ So we can find rows where the max entry in that row is arbitrarily small.

Case of maximizing over $P \in A$ (not minimizing)

For the case when $(C_{ij})$ has infinite size, infinite permutations are not necessarily optimal for maximizing. Let $Q=(Q_{ij})$ be any doubly stochastic matrix with an infinite number of nonzeros on the first column, such as the matrix drawn in the previous section above.

Consider $p=1$ and define $(C_{ij})$ for $i,j \in \{1, 2, 3, ...\}$ by $$ C_{ij} = \left\{ \begin{array}{ll} 1/Q_{i1} &\mbox{ if $j=1$ and $Q_{i1}>0$} \\ 0 & \mbox{ else} \end{array} \right.$$ So the only nonzeros of $(C_{ij})$ are on the first column. Then any "infinite permutation" matrix $P$ must choose $P_{i1}=1$ for exactly one $i \in \{1, 2, 3, ...\}$ and hence results in a finite value for $\sum_{i,j} P_{ij} C_{ij}$. On the other hand, for the doubly stochastic matrix $Q$ we have $\sum_{i,j} Q_{ij} C_{ij} = \sum_{i : Q_{i1}>0} 1 = \infty$. So the maximum sum is $\infty$, which cannot be achieved by any permutation matrix.