A large independent set in a bipartite graph that is more "balanced"

46 Views Asked by At

Let $G$ be a bipartite graph with two parts $U$ and $V$, and the maximum degree of $G$ is $d$.

I am wondering is there a way to find a large independent set $I$ of G that is more "balanced" than just taking the larger one out of $U$ and $V$? "Balance" means $I$ has some vertices from $U$ and some from $V$. "Large" means $\max(|U|, |V|)-|I|=O(d)$.

Edited: To be more specific of "balanced", here we may assume $|U|=|V|$ and is much larger than $d$. $I$ is balanced means $|I\cap U|/|I\cap V|=\Theta(1)$, i.e., $c\preceq|I\cap U|/|I\cap V|\preceq C$ for some constants $c,C>0$. And $I$ should be large, which means $|U|-|I|= O(d)$.

2

There are 2 best solutions below

0
On

You can solve the problem via integer linear programming as follows. Let binary decision variables $x_u$ and $y_v$ indicate whether node $u\in U$ or node $v\in V$ is selected, respectively. The problem is to maximize $\sum_u x_u+\sum_v y_v$ subject to linear constraints \begin{align} x_u+y_v &\le 1 &&\text{for $(u,v)\in E$} \\ c \sum_v y_v \le \sum_u x_u &\le C \sum_v y_v \end{align}

0
On

First of all: most of the time, we should not expect such a set to exist. (We can quantify this by saying that in a random bipartite graph of this type, as the number of vertices tends to infinity, the probability that $I$ exists tends to $0$.) This would take some work to prove, but intuitively, there are $2^{O(n)}$ possibilities for $I$, and each one has a probability of $2^{-O(n^2)}$ of being independent.

The good news is that finding the largest independent set in a bipartite graph is a tractable problem. The complement of $I$ is a vertex cover; finding the minimum vertex cover can be done by solving a maximum-flow problem. (The technical details: add a source vertex $s$ with a capacity-$1$ arc to each vertex in $U$; turn all edges of $G$ into unbounded capacity arcs from $U$ to $V$; add a sink vertex $t$ with a capacity-$1$ arc from each vertex in $V$. If $(S,T)$ is a minimum cut corresponding to the maximum flow in this network, then there can be no edges from $S \cap U$ to $T \cap V$, which means we can take $I = (S \cap U) \cup (T \cap V)$ to be our independent set.)

This works great if it turns out that $I$ is actually larger than $U$ and $V$, but if it's smaller, then this procedure will not find it.


Here is a heuristic suggestion for how to find a balanced $I$ of size $|U|-O(d)$, if it exists.

Modify the maximum flow problem above to give the arcs from $s$ to $U$ capacities of $1+\epsilon, 1+2\epsilon, \dots, 1 + |U|\epsilon$, in random order, for some $\epsilon>0$ we'll worry about later. Do the same thing for arcs from $V$ to $t$. Again, find the maximum flow and the corresponding independent set.

There will always be two cuts with value $|U| + \binom{|U|+1}{2} \epsilon$, which correspond to the independent sets $U$ and $V$; the random ordering of the capacities does not affect this value. There will also be some cuts that are slight modifications of these, whose value will be slightly random, with expected value above $|U| + \binom{|U|+1}{2} \epsilon$. Depending on $\epsilon$, the best of these might be slightly better than the trivial cut.

Assuming that there is a large and balanced independent set $I$, the corresponding cut $(S,T)$ will have a slightly higher expected value: a factor of $(1 + O(\frac{d}{|U|}))$ greater. However, because $I$ is balanced, the cut $(S,T)$ should have a high variance: about $O(\epsilon^2 |U|^3)$.

If we pick $\epsilon = \Omega(d/|U|^{3/2})$, then we estimate that a constant fraction of the time, the weighted vertex cover that gives $I$ will win. So we can try repeating this process multiple times with different random modifications each time, and see if we obtain any interesting independent set.