Why is it so difficult to prove $\mathbf{NP} ≠ \mathbf{P}$?


I'm just curious, because I saw on Wikipedia that a single polynomial-time solution to any NP-hard problem would imply there are polynomial-time solutions for every problem in NP.

Also, I assume there are a lot of NP problems. So does that mean no one has ever been able to prove, for any NP problem, that no polynomial-time algorithm exists? With how many NP problems there are, it seems that would provide a lot of opportunities to try, right? And a single such proof for any one of them would show $\mathbf{NP} ≠ \mathbf{P}$, right?


BEST ANSWER

It is actually worse than that. If you can prove that any problem in NP lacks a polynomial-time solution, then that by itself implies that $P \neq NP$, because the defining property of P is that all problems in P can be solved in polynomial time, and you would have a counterexample: a problem in NP that is not in P.

The real problem here is that lower bounds on complexity are generally hard to prove. The "easy" way to prove a lower bound is to establish that an algorithm must necessarily consider some set of data points or values, or else it cannot get the right answer. To use a simpler example, we can prove that any (correct) matrix multiplication algorithm, given two square $n \times n$ matrices, must take $\Omega(n^2)$ time to run, because it has to examine every entry in both matrices at least once.

At the same time, we don't actually know of any $\Theta(n^2)$ algorithm to multiply matrices, and it is suspected that no such algorithm exists. For a long time it was believed that the true lower bound was $\Omega(n^3)$, which is the running time of the "textbook" algorithm that you would normally learn in introductory linear algebra, but then it was repeatedly demonstrated, starting with Strassen's algorithm, that faster algorithms exist. Currently, we don't know how fast you can multiply two matrices. NP problems are like that, but even harder to reason about.
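To make the "textbook" algorithm concrete, here is a minimal sketch of it in Python. Every entry of both inputs is read at least once, which is the intuition behind the $\Omega(n^2)$ lower bound, while the three nested loops make the algorithm itself run in $\Theta(n^3)$ time:

```python
def matmul(A, B):
    """Textbook O(n^3) multiplication of two square n x n matrices
    (given as lists of lists)."""
    n = len(A)
    C = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            # Each output entry is a dot product of a row of A
            # with a column of B, so every input entry gets read.
            for k in range(n):
                C[i][j] += A[i][k] * B[k][j]
    return C

A = [[1, 2], [3, 4]]
B = [[5, 6], [7, 8]]
print(matmul(A, B))  # [[19, 22], [43, 50]]
```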

In a typical NP-complete problem (such as the traveling salesman problem), you have a set of objects (here, cities or graph vertices, and roads or edges), and you can "verify" a given solution by some polynomial-time process (here, by checking that the edges given as part of the solution form a valid Hamiltonian cycle, and then adding up their weights). In order to be correct, a TSP algorithm would at least need to examine all of the objects which are considered by that polynomial-time verification algorithm (all of the vertices and edges). But that just establishes that the solving algorithm can't be faster than the verification algorithm, which is not a particularly surprising or useful insight, because the verification algorithm is already polynomial-time.
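As an illustration, here is a sketch of such a polynomial-time verifier for the decision version of TSP, assuming a hypothetical representation where the input is a weight matrix, a proposed tour (an ordering of the cities), and a budget:

```python
def verify_tour(weights, tour, budget):
    """Polynomial-time TSP verifier: accept iff the proposed tour
    visits every city exactly once and its total weight is <= budget."""
    n = len(weights)
    # The tour must be a permutation of all n cities.
    if sorted(tour) != list(range(n)):
        return False
    # Sum consecutive edge weights, wrapping around to close the cycle.
    total = sum(weights[tour[i]][tour[(i + 1) % n]] for i in range(n))
    return total <= budget

# 4 cities with symmetric distances (made-up example data).
w = [[0, 1, 9, 1],
     [1, 0, 1, 9],
     [9, 1, 0, 1],
     [1, 9, 1, 0]]
print(verify_tour(w, [0, 1, 2, 3], 4))  # True: tour costs 1+1+1+1 = 4
print(verify_tour(w, [0, 2, 1, 3], 4))  # False: tour costs 9+1+9+1 = 20
```

The verifier touches every city in the tour and one weight per edge, so it runs in polynomial time; the hard part, finding a cheap tour rather than checking one, is where the exponential blow-up lives.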

NP-complete problems are, by definition, the hardest problems in NP, which is why we tend to focus on them. If you cannot prove that an NP-complete problem is not in P, then it's (probably) going to be even harder to prove that some other, arbitrary problem in NP is not in P.


Yes, finding a single polynomial-time algorithm for any NP-complete problem would prove P = NP, and proving there’s no polynomial-time algorithm for a specific NP-complete problem would prove P ≠ NP.

Here are some of the obstacles we've encountered that partially explain why this question seems so hard to resolve:

  1. Many proof techniques that separate other complexity classes don't work here. For example, relativizing proofs, such as the diagonalization argument showing the halting problem is undecidable, which rely purely on the existence of a universal Turing machine and on self-reference, can be shown to be incapable of resolving whether P = NP. (Look up the Baker-Gill-Solovay theorem for more on this.)

  2. Some other proof approaches that seem promising, such as "natural proofs," can be shown to be too weak to resolve P versus NP under plausible cryptographic assumptions. (Search for the Razborov-Rudich "natural proofs" barrier for more on this.)

  3. P versus NP is a special case of a variety of much broader problems. For example, if P = NP, then the polynomial hierarchy collapses down to P. This hierarchy contains problems that, intuitively speaking, seem much harder than the ones in NP. And yet we seem to be at a roadblock in separating any two levels of this hierarchy.

None of these are definitive reasons why the problem has eluded a solution for so long, but they hopefully give a sense of why this is so tricky.


Adding on to @templatetypedef's answer, there is a third barrier: Algebrization (https://www.scottaaronson.com/papers/alg.pdf). One technique in Circuit Complexity and Interactive Proofs is to convert Boolean formulas to polynomials, and then reason about the probabilistic properties of those polynomials with respect to the functions they approximate. Algebrization effectively says that this arithmetization approach, on its own or even in tandem with diagonalization, is not powerful enough to resolve $\textsf{P}$ vs. $\textsf{NP}$.

The Geometric Complexity Theory (GCT) program is considered a possible viable way to attack complexity-theoretic lower bounds, in the sense that it avoids the existing barriers (Relativization, Natural Proofs, Algebrization), and (to the best of my knowledge) no new barriers are known. The idea is to convert problems in Circuit Complexity to problems in Representation Theory and Algebraic Geometry, and then use the algebraic results to reason about classical circuit complexity. We are (again, to the best of my knowledge) not even close to being able to use GCT to get significant circuit lower bounds. Joshua A. Grochow's dissertation is a good starting point for an overview of GCT (https://home.cs.colorado.edu/~jgrochow/grochow-thesis.pdf), as are these notes by Blaser and Ikenmeyer (https://pcwww.liv.ac.uk/~iken/teaching_sb/summer17/introtogct/gct.pdf).