Mathematical problems that are too computationally complex, so we use probability distributions instead?


Some mathematical truth statements/questions cannot be answered with certainty. This could be because

  1. The computation required to answer them is finite, but would require a universe-sized computer working for longer than the age of the universe.
  2. The question has an answer, but the computation required to reach it is infinite, so it can never be completed. (I'd like to know an example of this.)
  3. Gödel's incompleteness theorem??? (Despite reading a lot on this, I still don't fully understand what it means.)

But my question is: Are there examples of problems that are not solvable in an exact way, but where we nevertheless have some exact algorithm that gives us a meaningful probability distribution over answers to the problem?

For example, we could have a function $f:X \to \{\text{yes,no} \}$ and a function $g:X\to [0,1]$. We know that $f$ is fully defined for every $x\in X$, but we cannot compute it (or it takes so long to compute that we could only evaluate it for, say, 100 values in a year on a supercomputer). Meanwhile, calculating $g(x)$ for various $x$ would, on average, make meaningful probability statements about the value of $f(x)$. It could be that while $f$ is computationally intractable, $g$ is very efficient.

Is there an example vaguely like this? (It doesn't have to fit my exact formulation.)

To be clear, I'm not asking for "approximations" to $f$, but specifically for probability distributions over the answers of $f$. Nor am I asking for a probability distribution that approximates another probability distribution. I'm asking for a probability distribution over mathematical truth statements that, while they may not be computable, have an actual true answer.


Edit: The Prime Number Theorem is a very good example. I am also wondering if there are any good examples of unproven theorems for which some sort of probability estimate of their truth is nevertheless possible. (I've read something regarding the Goldbach conjecture here, but the argument seems obviously wrong.)
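To make the Prime Number Theorem example concrete, here is a small sketch (the window and tolerance below are arbitrary choices of mine, not from the thread): take $g(x) = 1/\ln x$ as the "probability" that $x$ is prime, and compare it with the true prime density in a window, computed by a simple sieve.

```python
import math

def prime_probability(x: int) -> float:
    """PNT heuristic: a 'random' integer near x is prime with prob ~ 1/ln x."""
    return 1.0 / math.log(x)

def sieve(limit: int) -> list:
    """Boolean primality flags for 0..limit via the sieve of Eratosthenes."""
    is_p = [True] * (limit + 1)
    is_p[0] = is_p[1] = False
    for p in range(2, int(limit ** 0.5) + 1):
        if is_p[p]:
            for m in range(p * p, limit + 1, p):
                is_p[m] = False
    return is_p

# Compare the heuristic density with the actual prime density in a window.
lo, hi = 100_000, 110_000
flags = sieve(hi)
actual = sum(flags[lo:hi]) / (hi - lo)
predicted = prime_probability((lo + hi) // 2)
```

Here $g$ is vastly cheaper than testing each number, yet its average predictions over a window are quite accurate, which is exactly the $f$-versus-$g$ asymmetry described above.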


There are 3 best solutions below

BEST ANSWER

One very general answer to this is the "logical induction" algorithm described in this paper. Also, the introduction to this paper has some discussion of what a "meaningful" function $g$ might look like.

I should say up front that, like many results in the vein of "this function is computable", the algorithm in question is not remotely efficient: it will take a very long time (an impractical length of time) to produce nontrivial probability estimates.

In any case, the idea is that the logical inductor will produce consistent probability estimates for all mathematical statements you give it; the more time you give the logical inductor to work, the more reasonable the probability estimates. (By contrast, there is no similar algorithm that produces truth values for all mathematical statements.) Essentially, the longer the logical inductor runs, the more facts it learns, and it uses those facts to guess at truth values it hasn't learned.

(It is easy to come up with an algorithmic prover that, inefficiently, finds more and more proofs of mathematical statements over time. Just iterate over all finite strings, and for each string, check if it's a proof of something. Such a prover feeds into the logical inductor, and gives it a basis for making its guesses. Of course, there are unprovable statements that the prover will never say anything about, and we still want the logical inductor to assign them probabilities.)

You should imagine the algorithm working as follows. Suppose that the statement we're interested in is "The $3^{\text{rd}}$ digit of $\pi$ after the decimal is $1$" encoded as a statement in Peano arithmetic or something. Over time, the logical inductor's beliefs might change as follows:

  1. Initially, it will assign the statement a value around $0.5$, on the grounds that all statements are either true or false.
  2. It takes a relatively short time (compared to the next stages) to realize that decimal digits are values in $\{0,1,\dots,9\}$, and this convinces the logical inductor to drop the estimate down to $0.1$.
  3. After proving some estimates of $\pi$, such as $\pi \approx \frac{22}{7} = 3.1428\ldots$, the logical inductor would have more refined estimates of the digits. For instance, maybe it would think that the digit is $1$ with probability $\frac13$ and $2$ with probability $\frac13$, since $3.141$ and $3.142$ are both close to $\frac{22}{7}$, with the remaining probability mass assigned to other digits.
  4. Eventually, a proof is found that $3.141 \le \pi < 3.142$, and the statement has probability $1$.

(These are only intuitions, since of course I don't have a practical implementation to test this on; it might turn out that there are ways to bypass some of these stages.)

The same would happen (over a different scale of time) to something like the $1000000000^{\text{th}}$ digit of $\pi$, or to other statements we don't know the truth value of.
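As a toy illustration of the staged behavior above (my own sketch, not the algorithm from the paper), one can rigorously bracket $\pi$ between rational bounds using partial sums of Machin's formula, and spread probability uniformly over the decimal digits still compatible with the current bracket. More computation tightens the bracket and collapses the distribution:

```python
from collections import Counter
from fractions import Fraction

def atan_bounds(x: Fraction, terms: int):
    """Bracket arctan(x) between two consecutive partial sums of its
    alternating series (valid for 0 < x < 1; requires terms >= 2)."""
    partials, s = [], Fraction(0)
    for k in range(terms):
        s += (-1) ** k * x ** (2 * k + 1) / (2 * k + 1)
        partials.append(s)
    return min(partials[-2:]), max(partials[-2:])

def pi_bounds(terms: int):
    """Rational bounds on pi via Machin: pi = 16 atan(1/5) - 4 atan(1/239)."""
    lo5, hi5 = atan_bounds(Fraction(1, 5), terms)
    lo9, hi9 = atan_bounds(Fraction(1, 239), terms)
    return 16 * lo5 - 4 * hi9, 16 * hi5 - 4 * lo9

def digit_distribution(position: int, terms: int):
    """Uniform probability over the decimal digits (at `position` places
    after the point) still compatible with the current bounds on pi."""
    lo, hi = pi_bounds(terms)
    scale = 10 ** position
    candidates = [n % 10 for n in range(int(lo * scale), int(hi * scale) + 1)]
    counts = Counter(candidates)
    return {d: c / len(candidates) for d, c in counts.items()}
```

With few terms the distribution for the third decimal digit is spread over several candidates; a few more terms collapse it onto the true digit $1$, mirroring stages 3 and 4 above.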

ANSWER

I think an example is factoring, rephrased as a decision problem ("Does $n$ have a prime divisor between $a$ and $b$?"). By Rubinstein's theorem, this is decidable in worst-case $n^{\frac{1}{3}+o(1)}$ time. On the other hand, if we use a deterministic version of the GNFS and restrict the running time to be subexponential, we cannot prove the answer is correct, but if the algorithm runs out of time we can still output an answer together with some small probability of error.

A similar example is provided by the AKS primality test versus Miller–Rabin; however, these are both polynomial-time, so we can only quantify the difference with fine-grained time complexity (in a specific model of computation). In the former example, by contrast, the advantage is invariant under polynomial time-translations of the model.
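For concreteness, here is a textbook Miller–Rabin implementation (a standard version, not tied to any source in this answer). It has exactly the shape the question asks for: a "composite" verdict is certain, while a "probably prime" verdict is wrong with probability at most $4^{-\text{rounds}}$:

```python
import random

def miller_rabin(n: int, rounds: int = 20) -> bool:
    """Probabilistic primality test. A False return means n is certainly
    composite; a True return is wrong with probability <= 4**-rounds."""
    if n < 2:
        return False
    for p in (2, 3, 5, 7, 11, 13):  # quick trial division
        if n % p == 0:
            return n == p
    # Write n - 1 = d * 2**s with d odd.
    d, s = n - 1, 0
    while d % 2 == 0:
        d //= 2
        s += 1
    for _ in range(rounds):
        a = random.randrange(2, n - 1)
        x = pow(a, d, n)
        if x in (1, n - 1):
            continue
        for _ in range(s - 1):
            x = pow(x, 2, n)
            if x == n - 1:
                break
        else:
            return False  # a is a witness: n is definitely composite
    return True  # probably prime
```

Each round is a fast modular exponentiation, so twenty rounds give error probability below $10^{-12}$ at a tiny fraction of the cost of a deterministic proof of primality.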

And trivially, for a finite search problem like the one above, there is a time-probability trade-off. Randomize the search space (or pseudo-randomize, if that's good enough). If we then check $X\%$ of the space without finding anything, a uniformly placed witness would have been found with probability $X\%$, so the longer the search comes up empty, the more confident we become that there is nothing to be found.
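This trade-off is easy to check empirically; the following throwaway simulation (my own, with arbitrary parameters) plants a single witness uniformly at random, probes a random $X\%$ sample of the space, and confirms the hit rate matches $X\%$:

```python
import random

def hit_rate(space_size: int = 1000, fraction: float = 0.3,
             trials: int = 2000, seed: int = 0) -> float:
    """Fraction of trials in which a uniformly placed witness is found
    after probing `fraction` of the search space at random."""
    rng = random.Random(seed)
    probes_per_trial = int(space_size * fraction)
    hits = 0
    for _ in range(trials):
        witness = rng.randrange(space_size)          # plant the witness
        probed = rng.sample(range(space_size), probes_per_trial)
        hits += witness in probed
    return hits / trials
```

Checking 30% of the space finds the witness about 30% of the time, so an empty 30% search is correspondingly strong evidence of absence.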

As for taking an arbitrary statement and putting a probability on it without actually proving it, I'm not aware of any standard way of doing this; however, I've independently explored a few approaches myself. One is to extend arithmetic with a kind of quantifier interpreted as "it is probably true that if we pick a number $n$ then the following statement will hold of $n$". We can then prove such statements for certain uncomputable numbers derived from Chaitin's constant. Here is a brief explanation of this theory (my question seeking more information is still unanswered). Another idea I've been interested in more recently is assigning a probability measure to a set of consistent extensions of a theory, then understanding the "probability" of a statement as the measure of the set of such extensions that can't refute it. Misha Lavrov's excellent answer represents another general approach to the problem.

ANSWER

Let us start with a quote from Feynman:

> For my money Fermat's Theorem is true.

We should note that this comment was made in the context of Feynman's check of the validity of FLT: he calculated that the probability of FLT being false was "certainly less than $10^{-200}$". Silvan Schweber found, among Feynman's papers, a sketch of this unpublished probabilistic analysis. That is, Feynman didn't try to formally prove the conjecture, but rather to show that it is statistically improbable that it is false.

Now, what did Feynman do with this idea of using probability to study Fermat’s Last Theorem?


He first asks: "What is the probability that $N$ is a perfect $n^{\text{th}}$ power?" Feynman's approach was to map the unit interval between two large consecutive integers, $[N, N+1]$, onto the shorter interval between their $n^{\text{th}}$ roots; note that the mapping is monotonic.

To determine the length of the shorter interval, Feynman wrote it as: $$l = (N+1)^{\frac1{n}}-N^{\frac1{n}} \approx \frac{N^{\frac1{n}}}{nN}\, \text{ (N large) } \, \tag 1$$

Feynman then said that the required probability is given by $(1)$, and therefore the probability that $x^n + y^n$ is a perfect $n^{\text{th}}$ power is: $$P = \frac{(x^n + y^n)^{\frac1{n}}}{n(x^n + y^n)}\tag 2$$ and thus the total probability that $x^n + y^n$ is a perfect $n^{\text{th}}$ power [that is, $z^n$] for all $x, y > x_0$ is equal to: $$\int_{x_0}^{\infty}\int_{x_0}^{\infty} \frac{1}{n} (x^n + y^n)^{-1+\frac1{n}} \, dx \, dy = \frac{2}{nx_0^{n-3}}c_n$$ $$\text{ where } c_n = \frac12 \int_{0}^{\infty}\int_{0}^{\infty} (u^n + v^n)^{-1+\frac1{n}}\, du\, dv$$
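Equations $(1)$ and $(2)$ are easy to sanity-check numerically; this little sketch (the function names are my own) compares the exact interval length with Feynman's approximation and evaluates the heuristic probability:

```python
import math

def exact_gap(N: int, n: int) -> float:
    """Exact interval length (N+1)^(1/n) - N^(1/n)."""
    return (N + 1) ** (1 / n) - N ** (1 / n)

def approx_gap(N: int, n: int) -> float:
    """Feynman's large-N approximation, equation (1): N^(1/n) / (n N)."""
    return N ** (1 / n) / (n * N)

def perfect_power_prob(x: int, y: int, n: int) -> float:
    """Equation (2): heuristic probability that x^n + y^n is an n-th power."""
    N = x ** n + y ** n
    return N ** (1 / n) / (n * N)
```

For $N = 10^6$ and $n = 3$ the two gap formulas agree to better than one part in a million, and the heuristic probability for $x = y = 100$, $n = 3$ is already only about $2\times 10^{-5}$, which is the engine behind Feynman's tiny bound.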


I can write more, but I have given the gist of Feynman's brilliance. For further reference:

$[1]$ Paul J. Nahin, *Number-Crunching: Taming Unruly Computational Problems from Mathematical Physics to Science Fiction*, pp. 1–13. A full account is given there, out of which I have given an excerpt.

$[2]$ As shared by the OP in response to my comment, the same is given here as well.