Subset of matrix rows with half of column sums

400 Views Asked by At

Consider the following problem. We are given a matrix $A = (a_{ij})_{i,j = 1}^{m,n}$ with $m$ rows and $n$ columns, all $a_{ij}$ are nonnegative. Prove that there exists a subset $S$ of rows, $|S| \leq m/2 + n/2$, such that, for every column $j$, sum of all elements from $S$, that were in the column $j$, is at least a half of the sum of all elements in the whole column $j$. That is, if $S = \{i_1,\ldots,\ i_s\}$, $$\forall j = 1\ldots n: \qquad\sum_{k = 1}^sa_{i_kj} \geq \frac{1}{2}\sum_{i = 1}^ma_{ij}.$$

It is said that this problem is somehow connected to linear programming. I tried to consider the corresponding integral minimization problem and its LP relaxation, but still have no insight of the solution. Any advice appreciated.

1

There are 1 best solutions below

4
On BEST ANSWER

Ok, I think the following works so I will sketch out the idea. Maybe you can check the details?

We consider the following system: Let $C_1, \cdots, C_n$ be the $n$ column sums. Let $x_1, \cdots, x_m$ be $m$ variables, one for each row an consider the following constriants: \begin{align*} x_1 + \cdots + x_m &\le \frac{m}2 + \frac{n}2 \ \text{(Type 1)}\\ a_{1i}x_1 + a_{2i}x_2 + \cdots + a_{mi}x_m &\ge \frac{C_i}2, \ 1 \le i \le n \ \text{(Type 2)} \\ x_j &\ge 0, \ 1 \le j \le m \ \text{(Type 3)} \\ x_j &\le 1, \ 1 \le j \le m \ \text{(Type 4)} . \end{align*} Type $1$ is the relaxed version of saying that we cannot pick more than $\frac{m+n}2$ rows, and the type $2$ constraints are saying that for any column, the sum of the values of the entries in that column (times the corresponding row weight) have to be at least half the column sum. Ideally we would want $x_j \in \{0, 1\}$ but we relax this.

Now we note that the problem is easy if $n \ge m.$ This is because in this case, we can just take all of the rows! Thus, we can assume that $m > n$. If we put all of the constraints in a big matrix $B$, it will be $(1 + 2m + n )\times m$. Now I claim that the rank of $B$ is $m$. Indeed, the type $3$ constraints just give us a copy of the $m \times m$ identity matrix. Furthermore, the total rank of the type $2$ constraints is at most $n$.

Now let $x^*$ be any feasible solution to the system (for example, letting all of the $x_j = \frac{1}2$). From the above discussion, we have a $m-n \ge 1$ dimensional subspace of $\mathbb{R}^m$ that is orthogonal to all of the type $2$ constraints. Let $\delta$ be a vector from that subspace. We now want to consider $x' = x^* + c \delta$ for a suitable chosen scalar $c$. We do not know the dot product of $\delta$ with the all ones vector $\mathbf1 \in \mathbb{R}^m$ but we know that $\mathbf 1 \cdot \delta$ is the sum of $e_k \cdot \delta$ for the basis vectors $e_k$. (We are considering $\mathbf 1$ and $e_k$ since they are type $1$ and type $3$ constraints respectively!) Thus, we can pick $c$ such that $x'$ is also a solution to our system above and $x_{k} = 0$ or $1$ for some $k$ (essentially $x_{k}$ will be the first one to be tight as we change the value of $c$ due to our assumption. We just have to pick the sign of $c$ so that the type $1$ constraint isn't violated.)

Now we are almost done. There are two cases. If we have $x_k = 1$ then we can just remove the $k$th row and adjust the column sums accordingly. Furthermore, this only helps us with the type $1$ constraint since the LHS of the type $1$ constraint decreases by $1$ while the RHS only decreases by $\frac{1}2.$ This will result in a smaller instance of the problem which we can solve by induction.

In the other case, we continue in this manner by making more and more variables equal to $0$. However, everytime we make a variable equal to $0$, we must pick a new $\delta$ that is orthogonal to all the type $2$ constraints and all the type $3$ constraints that we just made tight. Thus, we can make some $m-n$ of the $x_j's$ equal to $0$ this way. Note that we cannot pick which of the $x_j's$ are $0$ since we cannot control the dot product $\mathbf 1 \cdot \delta$. In summary, we have a feasible solution $x'$ of the above system where $m-n$ of the variables are equal to $0$. Now just make the rest of the $n$ variables equal to $1$. This is possible since it only helps all of the type $2$ constraints since all of the coefficients of $A$ are non-negative. Furthermore, the type $1$ constraint is also satisfied since $$x_1 + \cdots + x_m = n < \frac{m}2 + \frac{n}2 $$ where the inequality follows from our assumption that $m > n$ and we are done!