How do you prove that $\{ Ax \mid x \geq 0 \}$ is closed?


Let $A$ be a real $m \times n$ matrix.

How do you prove that $\{ Ax \mid x \geq 0, x \in \mathbb R^n \}$ is closed (as in, contains all its limit points)?

The inequality $x \geq 0$ is interpreted component-wise.

This fact is used in some proofs of Farkas's lemma. It seems like it should be easy, but the proof I've seen seems to be unexpectedly complicated. Is there a very clear / easy / obvious proof of this fact?

(Note that linear transformations do not always map closed sets to closed sets, as discussed in this question. For example, let $S = \{ (x,y) \in \mathbb R^2 \mid y \geq e^x \}$ and let $T:\mathbb R^2 \to \mathbb R^2$ such that $T(x,y) = (0,y)$. Then $S$ is closed, but $T(S)$ is not closed.)
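The non-closedness in this example can be made concrete with a small numerical sketch (using NumPy): the points $(x, e^x)$ lie in $S$, but their images under $T$ approach $(0,0)$, which is not in $T(S)$.

```python
import numpy as np

# S = {(x, y) : y >= e^x} is closed and T(x, y) = (0, y) is linear,
# but T(S) = {0} x (0, infinity) is not closed: the images below
# approach (0, 0), which is not in T(S) since y = 0 < e^x for every x.
xs = np.linspace(-1.0, -30.0, 30)
images = np.array([(0.0, np.exp(x)) for x in xs])  # T applied to points (x, e^x) in S
assert (images[:, 1] > 0).all()   # every image has positive second coordinate
assert images[-1, 1] < 1e-12      # ... yet it shrinks toward 0
```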

Edit: Here is a simple proof in the case where $A$ has full column rank. (A very similar proof is given in Nocedal and Wright, in the Notes and References at the end of chapter 12.) Let $y^*$ be a limit point of $\Omega = \{ Ax \mid x \geq 0, x \in \mathbb R^n \}$. There exists a sequence $(x_i)_{i=1}^\infty$ of points in $\mathbb R^n$ such that $x_i \geq 0$ for all $i$ and $A x_i \to y^*$ as $i \to \infty$. Let $B$ be a left inverse for $A$. Then $B A x_i \to B y^*$ as $i \to \infty$. In other words, $x_i \to x^*$ as $i \to \infty$, where we have defined $x^* = B y^*$. Clearly, $x^* \geq 0$ and $A x^* = y^*$. This shows that $y^* \in \Omega$.

(Alternatively, you could just note that if $A$ has full column rank then the mapping $x \mapsto Ax$ is a homeomorphism between $\mathbb R^n$ and $R(A)$, so it maps closed sets to closed sets. This shows that $\Omega$ is a closed subset of $R(A)$, and it follows that $\Omega$ is a closed subset of $\mathbb R^m$.)
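The left-inverse argument above can be sanity-checked numerically. A minimal sketch (assuming NumPy; a random Gaussian matrix has full column rank with probability 1, and the Moore-Penrose pseudoinverse serves as the left inverse $B$):

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((5, 3))       # full column rank with probability 1
B = np.linalg.pinv(A)                 # Moore-Penrose pseudoinverse is a left inverse here
assert np.allclose(B @ A, np.eye(3))  # B A = I

# If A x_i -> y*, then x_i = B (A x_i) -> B y* =: x*, and x* >= 0 as a
# limit of nonnegative vectors, so y* = A x* lies in the cone.
x_star = np.array([1.0, 0.0, 2.0])
x_seq = [x_star + 1.0 / (i + 1) for i in range(100)]  # x_i >= 0, x_i -> x*
y_seq = [A @ x for x in x_seq]
recovered = B @ y_seq[-1]             # applying B recovers the preimage
assert np.allclose(recovered, x_seq[-1])
```

When $A$ is rank-deficient no left inverse exists, which is exactly why the general case needs a different argument.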


Best answer:

We denote by $a_i \in \mathbb R^m$, $i = 1, \ldots, n$ the columns of $A$. By a conic variant of Carathéodory's theorem, each conic combination of $\{a_i\}$ can be written as a conic combination of a linearly independent subset of $\{a_i\}$. Since there are only finitely many linearly independent subsets of $\{a_i\}$, it is sufficient to prove the claim for matrices $A$ which have full column rank (i.e., all columns are linearly independent). But in this case, the claim is easy to establish.

Answer:

Let $e_1,\ldots,e_n$ be a basis of $\mathbb R^n$; you can suppose that $A(e_1),\ldots,A(e_l)$ generate the image of $A$ and that $(e_{l+1},\ldots,e_n)$ is a basis of $\ker(A)$.

If $y$ is an element of the closure of $C=\{A(x) \mid x\geq 0\}$, then $y$ lies in the subspace generated by $A(e_1),\ldots,A(e_l)$, so you can write $y=y_1A(e_1)+\cdots+y_lA(e_l)$. There exists a sequence $A(x_1^me_1+\cdots+x_n^me_n)=x_1^mA(e_1)+\cdots+x_l^mA(e_l)$ which converges to $y$. This implies that $x_i^m$ converges to $y_i$ for $i=1,\ldots,l$ (use $\|\cdot\|_{\infty}$ with respect to this basis of the image for that), and this in turn implies that $A(y_1e_1+\cdots+y_le_l)=y$.

Answer:

OK, after struggling with elementary tools for a while and in vain, I had to invoke the "Closed Map Lemma": a proper map (i.e. one for which preimages of compact sets are compact) between locally compact Hausdorff spaces (i.e. spaces in which every point has a compact neighborhood) is closed. For example, see Theorem 2.6 of this paper.

In your case, assuming WLOG that $A$ has full column rank (see @gerw's remark above), the linear operator $A: x \mapsto Ax$ is a proper mapping between the (trivially) locally compact Hausdorff spaces $\mathbb R^n$ and $\mathbb R^m$, and the nonnegative orthant $\mathbb R_+^n = [0,\infty)^n$ is a closed subset of the former space. This proves that $\{Ax \mid x \ge 0\} = A \mathbb R_+^n$ is closed in $\mathbb R^m$, and we're done.

Answer:

The cone $C=\{y\colon Ax=y,\ x\ge 0\}$ is finitely generated (by finitely many columns of $A$) and convex. By the Minkowski-Weyl theorem (an easy proof via Fourier-Motzkin elimination can be found here, Theorem 1) it is a polyhedral cone, that is, $C=\{y\colon By\le 0\}$ for some matrix $B$. From the last representation it is clear that $C$ is closed, as an intersection of closed half-spaces.
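As a toy illustration of the two representations (a hand-picked example, not part of the argument: the generators and the matrix $B$ below are my own choices), the cone generated by $(1,0)$ and $(1,1)$ in $\mathbb R^2$ equals $\{y \colon By \le 0\}$ for a suitable $B$:

```python
import numpy as np

rng = np.random.default_rng(1)
G = np.array([[1.0, 1.0],
              [0.0, 1.0]])    # generators (1,0) and (1,1) as columns
B = np.array([[0.0, -1.0],
              [-1.0, 1.0]])   # half-space description: y2 >= 0 and y1 >= y2

# Every conic combination of the generators satisfies B y <= 0 ...
for _ in range(100):
    x = rng.random(2)                      # x >= 0
    y = G @ x
    assert (B @ y <= 1e-12).all()

# ... and every y with B y <= 0 is a conic combination: y = G @ (y1 - y2, y2).
for _ in range(100):
    y = rng.standard_normal(2)
    if (B @ y <= 0).all():
        x = np.array([y[0] - y[1], y[1]])
        assert (x >= -1e-12).all() and np.allclose(G @ x, y)
```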

Answer:

Here is a proof that I learned from the "Notes and References" section at the end of chapter 12 of Numerical Optimization by Nocedal and Wright. The book attributes the proof to R. Byrd. It is quite similar to the proof outlined by gerw, but not as short. This proof does not explicitly invoke Caratheodory's theorem, but it uses the same trick that is used to prove Caratheodory's theorem.


First I'll give a simple proof for the special case where $A$ has full column rank. Let $y^*$ be a limit point of $\Omega = \{ Ax \mid x \geq 0, x \in \mathbb R^n \}$ and let $(y_i)_{i=1}^\infty$ be a sequence of points in $\Omega$ converging to $y^*$. For each $i$, there exists $x_i \geq 0$ such that $$ y_i = A x_i. $$ Let $L$ be a left inverse for $A$, and notice that $x_i = L y_i \to L y^* = x^*$ as $i \to \infty$, where we have defined $x^* = L y^*$. Clearly $x^* \geq 0$ and $A x^* = y^*$. This shows that $y^* \in \Omega$. So $\Omega$ is closed.


The proof for this special case gives us a clue about how to handle the general case, which we now attempt. In this section we no longer assume that $A$ has full column rank.

Again suppose that $y^*$ is a limit point of $\Omega$ and let $(y_i)_{i=1}^\infty$ be a sequence of points in $\Omega$ converging to $y^*$. For each $i$, there exists a vector $x_i \geq 0$ such that $y_i = A x_i$. Fact: We can write $y_i = \tilde A_i \tilde x_i$, where $\tilde A_i$ is a matrix with full column rank that is obtained by deleting some columns of $A$, and $\tilde x_i \geq 0$. We'll prove this fact later. For now, we proceed by noticing that there are only finitely many possibilities for the matrix $\tilde A_i$, so at least one of these possibilities must occur infinitely often. Thus, the sequence $(y_i)_{i=1}^\infty$ has a subsequence $(y_{i_k})_{k=1}^\infty$ such that the matrices $\tilde A_{i_1},\tilde A_{i_2},\ldots$ are all identical. Let $\tilde A = \tilde A_{i_1}$, so that $$ y_{i_k} = \tilde A \tilde x_{i_k} $$ for all $k$.

We are almost done. Let $L$ be a left inverse for $\tilde A$, and notice that $$ \tilde x_{i_k} = L y_{i_k} \to L y^* = \tilde x^* \text{ as } k \to \infty, $$ where we have defined $\tilde x^* = L y^*$. Clearly, $\tilde x^* \geq 0$ and $y^* = \tilde A \tilde x^*$. Insert zeros into $\tilde x^*$ as needed to obtain a vector $x^* \geq 0$ satisfying $$ y^* = A x^*. $$ This shows that $y^* \in \Omega$.

We are now done with the proof, except that we have not yet proved the fact mentioned above. We do this next.

Delete as many columns of $A$ as possible subject to the restriction that $y_i$ must be a conical combination of the columns of the resulting matrix, which we shall call $\tilde A_i$. There exists a vector $\tilde x_i \geq 0$ such that $y_i = \tilde A_i \tilde x_i$. Clearly each component of $\tilde x_i$ is positive; otherwise, we did not delete as many columns of $A$ as possible. Suppose (for a contradiction) that $\tilde A_i$ has a nonzero null vector $w$; replacing $w$ with $-w$ if necessary, we may assume $w$ has at least one negative component. Starting at $t = 0$, increase $t$ until at least one of the components of $\tilde x_i + t w$ equals $0$, then stop. Then $\tilde A_i (\tilde x_i + t w) = y_i$ and $\tilde x_i + t w \geq 0$. It would be possible to delete a column of $\tilde A_i$ (corresponding to a zero component of $\tilde x_i + t w$) and still have $y_i$ be a conical combination of the columns of the reduced matrix. That is a contradiction. This shows that $\tilde A_i$ has full column rank.
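The column-deletion procedure can be sketched in code. This is only an illustration, not part of the proof: the helper `conic_caratheodory` and its tolerance handling are my own, and floating-point rank decisions are inherently approximate.

```python
import numpy as np

def conic_caratheodory(A, x, tol=1e-10):
    """Given y = A x with x >= 0, return (cols, xt) such that A[:, cols] has
    linearly independent columns, xt >= 0, and A[:, cols] @ xt == A @ x.
    (Numerical sketch of the column-deletion argument above.)"""
    cols = [j for j in range(A.shape[1]) if x[j] > tol]
    xt = x[cols].astype(float)
    while cols:
        sub = A[:, cols]
        if np.linalg.matrix_rank(sub, tol) == len(cols):
            break                            # columns already independent
        # A nonzero null vector of the submatrix (last right singular vector).
        _, _, vt = np.linalg.svd(sub)
        w = vt[-1]
        if w.min() > -tol:                   # ensure w has a negative entry
            w = -w
        # Increase t until some component of xt + t w hits zero.
        neg = w < -tol
        t = np.min(xt[neg] / -w[neg])
        xt = xt + t * w
        keep = xt > tol                      # drop the columns that hit zero
        cols = [c for c, k in zip(cols, keep) if k]
        xt = xt[keep]
    return cols, xt

A = np.array([[1.0, 0.0, 1.0],
              [0.0, 1.0, 1.0]])              # third column = first + second
x = np.array([1.0, 2.0, 3.0])
cols, xt = conic_caratheodory(A, x)
assert len(cols) <= 2                        # at most rank-many columns survive
assert np.allclose(A[:, cols] @ xt, A @ x)   # same point of the cone
assert (xt >= 0).all()                       # still a conical combination
```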


Comments:

  • This theorem would be trivial if it were true that linear transformations map closed sets to closed sets. Unfortunately, that is false. But in the case where $A$ has linearly independent columns, you can show (very easily) that the function $x \mapsto Ax$ is a homeomorphism between $\mathbb R^n$ and $R(A)$, which means that $A$ really does map closed sets to closed sets. So, that case is easy.

Answer:

This is my attempt to fill in some details in the answer provided by @gerw. (I'm writing an answer to my own question.)

Let $S$ be the set of all matrices $B$ with linearly independent columns such that each column of $B$ is a column of $A$.

The "conic version" of Caratheodory's theorem (see below) implies that if $y = Ax$ for some vector $x \geq 0$, then $y$ can be expressed as $y = B \tilde x$ for some matrix $B \in S$ and some vector $\tilde x \geq 0$.

Thus, $$ \{ Ax \mid x \geq 0, x \in \mathbb R^n \} = \cup_{B \in S} \{ B \tilde x \mid \tilde x \geq 0 \}. $$

Trivially, each set $\{B\tilde x \mid \tilde x \geq 0 \}$ is closed. (Explanation: you can easily check that if $B \in \mathbb R^{m \times k}$ has linearly independent columns then $B$ provides a homeomorphism between $\mathbb R^k$ and $R(B)$, so $B$ maps closed sets to closed sets.)

We see that $\{ Ax \mid x \geq 0 \}$ is a finite union of closed sets. So, $\{ Ax \mid x \geq 0 \}$ is closed. The proof is now complete.


Comments:

  • If you do not wish to invoke Caratheodory's theorem, you can prove the relevant assertion directly using the following simple strategy: if $y = Ax$ with $x \geq 0$ and $A$ has a nontrivial null vector $v$, you can replace $x$ with $\hat x = x + \lambda v$ where $\lambda$ is chosen so that $\hat x \geq 0$ and $\hat x$ has a component that is $0$. The corresponding column of $A$ can be deleted. Now repeat this process, deleting columns one by one until we arrive at a matrix with linearly independent columns. But at this point, we have essentially proved Caratheodory's theorem.
  • The "conic version" of Caratheodory's theorem is not an obscure variant of Caratheodory's theorem, but rather is so important and fundamental that Bertsekas includes it in the statement of Caratheodory's theorem. Here is Caratheodory's theorem as it appears in Proposition 1.2.1 (p. 20) of Convex Optimization Theory by Bertsekas.

Proposition 1.2.1: (Caratheodory's Theorem) Let $X$ be a non-empty subset of $\mathbb R^m$. Then:

  1. Every nonzero vector from $\text{cone}(X)$ can be represented as a positive combination of linearly independent vectors from $X$.
  2. Every vector from $\text{conv}(X)$ can be represented as a convex combination of no more than $m + 1$ vectors from $X$.

Part 1 is the "conic version" of Caratheodory's theorem that I referred to above. You can see how to prove it easily using the strategy from the first bullet point.