If P = NP, then the following problem would be solvable in polynomial time?

825 Views Asked by At

The problem is the following (searching a database with no structure):

There exists a function $f$, that maps a bitstring $z$ to a binary value. There is only one marked element $x$, such that $f(x) = 1$. For all other inputs $y \neq x$ it holds that $f(y) = 0$. The goal is to find the input $x$.

The function $f$ is a blackbox / oracle: its implementation in terms of logic gates, for example, is not known and each evaluation of $f$ (call to the oracle) takes a constant amount of time.

This problem can then be casted into a decision variant. Given a solution $x$ to the problem, it can be checked in constant amount of time, simply by evaluation of $f(x)$.

Guessing the inputs will take exponential time. So this "algorithm" is in NP.

Would P = NP imply that there is a classical algorithm on a Turing machine, that solves this problem in polynomial time? Can't it be quite easily proved that there exists no polynomial algorithm here? (Time complexity being polynomial in the number of input bits)

2

There are 2 best solutions below

5
On

If $A$ is any algorithm making at most $q$ queries to the blackbox $f$, $A$ is going to observe $q$ items of $f$. Since there are in total $2^n$ items, $q$ should be at least $2^n-1$, at the worst case. (After observing $2^n-1$ items, $A$ can definitely find $x$ without error. On the other hand, if $q<2^n-1$, then there is definite possibility of error.)

This argument holds whether P=NP or not.

More concretely, assume that $A$ is a deterministic algorithm making $q<2^n-1$ queries. Let $x_1, x_2, \dots$ be the sequence of queries $A$ makes, and let $y_1, y_2, \dots$ be the sequence of corresponding responses that $A$ gets. Then, $x_1$ doesn't depend on anything (except on the 'code' of $A$ itself), and $x_2$ depends on $y_1$, and $x_3$ depends on $y_2$, and in fact also on $y_1$ as well. In that way, eventually $x_i$ depends on $y_1, \dots, y_{i-1}$.

Now, fix any such algorithm $A$, and fix the response sequence $y_1, y_2, \dots, y_{2^n-2}$ to be all zero: $y_1=y_2=\dots=y_{2^n-2}=0$. This will completely determine the query sequence $x_1, \dots, x_{q}$. Now, there are at least two points which $A$ hasn't queried so far, and let us denote them as $x'_0$ and $x'_1$. We define two functions $f_0$ and $f_1$ as follows: both $f_0$ and $f_1$ respect the query sequence: $f_b (x_i)=0$ for all $b=0, 1$, and $i=1, \dots, q$. And then, set $f_0(x_0')=1, f_0(x_1')=0$, $f_1(x'_0)=0, f_1(x'_1)=1$. We can see that, since both $f_0$ and $f_1$ respect the query sequence, when $A$ interacts with either $f_0$ or $f_1$, it is guaranteed that $A$ will precisely ask queries $x_1, \dots, x_q$, and completely miss $x'_0$ and $x'_1$. Now, after the queries, $A$ will output its answer. Since $A$ is deterministic, its answer also depends on the response sequence $y_1, y_2, \dots, y_q$ (as well as the code of $A$). In any case, it is deterministic, so its answer will be something fixed, say $z$. If $z\neq x'_0$, then this means that $A$ errs when interacting with $f_0$. On the other hand, if $z=x'_0$, then this means that $A$ errs when interacting with $f_1$.

This adversary argument essentially forces $A$ to have worst cases each time and eventually to lose: the responses $y_i$ are adversarially chosen to deny $A$ any useful information, as much as possible.

3
On

First of all, remember that an individual instance of a problem isn't in $P$ or $NP$; rather, we talk about the complexity of a (parametrized) family of problems. For instance, $9\times 9$ Sudoku is in $P$; in fact, it's solvable in constant time! The constant may be large, but that's irrelevant. Instead, we have to talk about the complexity of Sudoku with respect to the board size. So you can't just have a 'single' bit string; rather, you need to have (at least one) bit string for every input length.

Putting that aside, even if you do assume that there's one bit string of each length that returns true, your problem still isn't in $NP$; this time, the issue goes the other direction. Remember that $NP$ means verifiable in polynomial time by a Turing Machine. You cannot simply treat $f$ as a black box; it must be represented in some fashion. If you want to try and obscure it, then you have to treat it as an oracle, and the problem you've constructed is in $NP^f$ — nondeterministic polynomial time with an oracle for $f$. Similarly, a deterministic polynomial-time algorithm would be in $P^f$. But there are already well known to be oracles $f$ for which the question 'goes either way'; there are oracles for which provably $P^f=NP^f$ (for instance, if $f$ is 'sufficiently strong' that its query power overwhelms $P$ and $NP$; I believe taking $f$ to be a Turing-complete set such as the set of all strings on which a particular Universal Turing Machine halts works here) and those for which provably $P^f\neq NP^f$ (in fact, if we choose a 'random' oracle $f$, then $P^f\neq NP^f$ with probability $1$).