Understanding why the Halting problem can't be solved

4.9k Views Asked by At

I understand that this thread may be similar to one of my old threads but it's not the same since now I will providing my insight about what I understand.

I understand what the halting problem says, but I can't understand why it can't be solved. My professor used a diagonalization argument that I am about to explain.

The cardinality of the set of turing machines is countable, so any turing machine can be represented as a string. He laid out on the board a graph with two axes. One for the turing machines and one for their inputs which are strings that describe a turing machine and their according input and then he started to fill out the grid with Accept or reject or loop for ever. He then drew out a diagonal along that grid and created a new turing machine with the values in the diagonal. I understand that we are trying to prove that the given language is undecidable. Why is this turing machine that is not in the initial graph important? and how does this lead to the conclusion that the halting problem can't be solved?. I understand why the real numbers are uncountable, so there is no need for explaining that.

I need to understand the proof of why the halting problem can't be solved with the diagonalization argument.

4

There are 4 best solutions below

3
On BEST ANSWER

Reese did a very good job of explaining how a proof of the undecidability of the halting problem may be written in a way that explicitly makes use of diagonalization. I'll offer an informal gloss.

Diagonalization is employed to reach a contradiction that forces us to abandon an assumption. Here the assumption is the existence of a TM (let's call it $H$) that decides the halting problem.

Diagonalization is applied by building a table $A$ such that that $A_{i,j}$ says whether the $i$-th TM halts on input $j$. We then say, "Let's build a TM that on input $i$ does the opposite of TM $i$."

Now the question is, "Does such a Turing machine exist?" It's not hard to see that given $H$, one can build the desired TM; but then this TM is not in the table, because it was built by diagonalization and yet the table lists all TMs (TMs can be enumerated) ... Contradiction!

The only assumption we can give up is the existence of the TM $H$ that decides the halting problem, which is what we do, so that we are not forced to admit that there is a TM that disagrees with TM $i$ on input $i$ for all $i$.

How would we build the "diagonal TM" given $H$? Just as explained in the "other" proof: duplicate the input, let $H$ decide whether the input TM halts when it reads itself, and then do the opposite of what $H$ said. This last step is diagonalization.

4
On

It doesn't have to do with the uncountability of the real numbers, though the argument is similar.

Based on your most recent edit in response to the comments, it sounds like you don't want the traditional proof, you want the proof your professor was providing. What your professor was doing is in no way standard, so we can only guess; but here's what I think the argument was.

The Turing machines aren't just countable, they can be listed by a Turing machine. So we have an algorithm that lets us list the Turing machines, in order. Suppose we had an algorithm, $H$, that would tell us whether a given Turing machine would halt on a given input. We now construct a new algorithm, $T$, as follows: to determine $T(n)$, we look at the $n$th Turing machine, and ask $H$ whether it halts on input $n$. If it doesn't, we say $T(n) = 0$. If it does, we run that $n$th Turing machine on input $n$, and eventually get a result; let $T(n)$ be that result plus one.

Now, $T$ is an algorithm, so it's represented by a Turing machine. But all the Turing machines were on our list, so $T$ is on our list. Say it's the $N$th Turing machine on our list. What's $T(N)$? If $T$ doesn't halt on input $N$, then at the $N$th step of our construction we decided that $T(N) = 0$, so $T$ does halt on input $N$, a contradiction. If $T(N) = k$ for some $k$, then we decided that $T(N) = k + 1$, another contradiction. But those were the only options - either $T(N)$ doesn't halt, or it eventually outputs some $k$. So this contradicts the supposition that $T$ is actually a Turing machine, which means that the key element - the halting algorithm $H$ - must not exist.

2
On

Personally, when I learned the halting problem, I like to think of the machine leading to contradiction as the following concrete C program:

#include <stdio.h>
#include <stdbool.h>

/* If we assume the existence of a halting machine 'does_it_halt'.  */
bool
does_it_halt(void *turing_machine, void *input);

/* Then we can build the following machine 'absurd'.  */
void
absurd(void *input)
{
  if (does_it_halt(input, input))
    {
      while (true) {/* infinite loop  */}
    }
  else
    {
      return;
    }
}

/* Now if we run the machine 'absurd' with itself as input, does it halt?  */
int
main(void)
{
  absurd(absurd);
  return 0;
}
2
On

How do diagonalization proofs work?

Why is this turing machine, that is not in the initial graph, important?

  1. Propose a statement like "there exists a TM that solves the halting problem" or "the reals between 0 and 1 are countable", which we are going to disprove.
  2. Construct a way to enumerate (count off) all the items that are part of the thesis (here: Turing Machines or reals). The items (TMs, reals, ...) must themselves consist of sub-items (e.g., symbols for TMs, digits for reals) so we can make a proper table. It matters not that the table has infinitely many columns and rows. Make this construction as easy/trivial as possible, for bonus points. Easy enough for TMs and reals.
  3. Find a path through your enumeration that hits every single cell. This is where the diagonalization comes in; it makes sure that, unlike following one of the axes to infinity, every cell is reached in a finite amount of steps.
  4. Construct a new item from this walkthrough that is guaranteed not to be a part of the original table. For reals, just add 1 to each digit you encounter (wrapping over). For TMs, do whatever your prof did. As we constructed the table so that it contained all items, we have a contradiction, as this new item is obviously not contained in the table (it is different from every item we encountered / it is new).

Step 4 is the answer to your question:

All our effort was done solely to construct a Turing Machine that is not a member of the set of all Turing Machines. This gives us a contradiction. We carefully constructed our little algorithm so that every step is dead simple and obviously (or, if you so wish, provably) correct, so the only part of our whole diagonalization argument that is negotiable is the conjecture. Hence the conjecture must be false.

Addendum, how to apply for Turing Machines

To apply the diagonalization method for Turing Machines and the halting problem:

  • As hinted to above, we suppose that there is a turing machine H(h,i) that takes two parameters (another TM and some arbitrary input) and decides whether that other TM will halt for said input, or not. This is the definition of the the halting problem.
  • Note that the set of possible inputs (the domain of i) is, again, the set of all turing machines (codified in some textual representation)! This is small enough. This is a similar approach as to say that for reals, it is enough to look at the intervall [0;1] instead of all numbers, for making the proof simpler. We show that the halting problem can not even be solved for this subset of possible machines; i.e., machines that work on the "source code" of machines.
  • The mentioned table lists the machines on one axis, and inputs on the other. Each cell (think of Excel) is the result of running our supposed machine on that specific input. Steps 2 and 3 are thus mechanically performed and need no knowledge whatsoever about Turing Machines.
  • In step 4, we construct a new turing machine H' by going diagonally through the table and making that new machine H' so that it halts if the cell contains FALSE, and does not halt if the cell contains TRUE. This is very easy to do since H' can contain H as a subroutine and ask that one, then either exit or enter an infinite loop (which is easy to program).
  • As with the reals, we have now created a new entry for our table which by mere definition differs from every other entry. q.e.d.