Finding large solution-free sets

66 Views Asked by At

(posted on Stack Overflow, but was suggested to post here instead)

Is there any nice way to check if a given set has any (nontrivial) solutions to a fixed linear equation?

My goal is to find the size of the largest solution-free subset of $\{1, 2, ..., n\}$ for the linear equation $x + 3y = 2z + 2w$. I'd like to find for some $n$ a solution-free set of size greater than $\sqrt{8n}$. It's worth noting that a greedy approach isn't going to work and that the only way to greedily generate such a solution-free set is if you already have such a set.

As an example of what I'm talking about, the set $A = \{1, 5, 6, 10\}$ is solution-free because the only solutions are the trivial ones (i.e., where $x=y=z=w$). On the other hand, the set $A = \{1, 2, 3\}$ is not solution-free, because $3 + 3\cdot1 = 2\cdot2 + 2\cdot1$ (which amounts to setting $x=3$, $y=w=1$, and $z=2$).

Approach 1

I've tried a recursive (is that the word?) approach. In particular, any subset of a solution-free set is solution-free as well. So I have the set $T_{n-1}$ of all solution-free subsets of $\{1, 2, ..., n-1\}$, then for each $T\in T_{n-1}$, I consider $T + \{n\}$. Then $T_n$ is the union of $T_{n-1}$ and the set of all solution-free $T + \{n\}$.

Unfortunately, the $T_n$'s get really big really fast, so I was barely able to check to 50. If I omit some of the smaller sets of $T_n$ (so I only save the ones with, say, more than 7 elements—which means that as I continue, I've lost a lot of data, so I won't be getting optimal sets anymore), I can reach about 150, but after that, my computer kind of just breaks down.

Approach 2

The other way I have tried is to consider the hypergraph whose edges are the (minimal) solutions to this equation. So, for example, the set $\{1, 3, 5\}$ would be an edge because $1 + 3\cdot5 = 2\cdot3 + 2\cdot5$, but $\{1, 3, 5, 6\}$ wouldn't be, even though $5 + 3\cdot3 = 2\cdot1 + 2\cdot6$ is a solution. The set $\{1, 2, 5, 6\}$ would be an edge as well, since no subset of it forms a solution, and it is itself a solution to the equation (namely, $1 + 3\cdot5 = 2\cdot2 + 2\cdot6$). Then it suffices to find the independence number of this hypergraph. Unfortunately, it's not linear/simple (i.e., there exist edges who share more than two vertices) and it's not uniform (there are edges of size 3 and 4). This means that most of the literature about hypergraph independence numbers doesn't apply.

The single thing I could find that would apply is Theorem 3 of this article. But this is for general hypergraphs, and I don't think it gives a good enough bound for this hypergraph. Maybe it does, but my code was messed up anyway, so I'm just not sure. If anyone has a good way to implement this, I'd be interested in hearing!

Otherwise, I'm just left trying to find a good way to use the structure of the hypergraph to find the size of the maximal independent set of it. I'm not super familiar with ways to do this, does anyone have any ideas?

Approach 3

My final method was to find all solutions in $\{1, 2, ..., n\}$ to $x + 3y = d$ and $2z + 2w = d$ for each even $d\in[4,4n]$, inclusive. Let $A_d$ be the set of solutions to $x + 3y = d$ and let $B_d$ be the set of solutions to $2z + 2w = d$. Then what we're looking for is just a large set that avoids every non-singleton of the Cartesian product $A_d \times B_d$, for every possible $d$. I don't know good ways, however, to find large sets that avoid a certain family of sets. Any ideas?

Sorry—I know this is long!

1

There are 1 best solutions below

0
On

You can solve the problem via integer linear programming as follows. For $i\in\{1,\dots,n\}$, let binary decision variable $s_i$ indicate whether $i$ is selected. For each nontrivial solution $(x,y,z,w)$, impose a "no-good" constraint that excludes that solution: $$s_x+s_y+s_z+s_w \le 3 \tag1$$ The problem is to maximize $\sum_{i=1}^n s_i$ subject to $(1)$.

As in your Approach 2, you can restrict the constraints to minimal solutions because the non-minimal constraints are redundant.

For $n\le 50$, the maximum exceeds $\sqrt{8n}/2$ only once (maximum is 4 for $n=7$). What is your motivation to exceed $\sqrt{8n}$?

enter image description here