I apologize if this question is more suited for CS Stack Exchange, but since my question is primarily about rigor rather than the Halting Problem in particular, I felt like this is a place to post it.
My reference for the first proof are from these excellent slides (p. 32).
Firstly, two definitions. For any program P, code(P) is P in string format. For any program P, H(code(P), a) returns true if and only if the instance P(a) halts. Now, consider the following program in C syntax:
void D(string x) {
if (H(x, x) == true) {
while (true);
}
else {
return;
}
}
Now the proof of the impossibility of H existing follows immediately. D(code(D)) either halts or it does not.
- It halts, which means that
H(code(D), code(D))returns true. But by the definition ofD, this means thatD(code(D))does not halt (we fall into the first branch which leads into an infinite loop). - It does not halt, which means that
H(code(D), code(D))returns false. But by the definition ofD, this means thatD(code(D))halts (we fall into the second branch which leads immediately to areturnstatement).
This seems pretty formal and satisfactory to me, but I have doubts since the more mathematical exposition of the Halting Problem which I read was much more complicated and incomprehensible to me. Am I missing something?
If we consider programs with $2$ or more parameters, the argument is almost exactly the same. More interesting is the case of a halting function which decides whether a program with zero parameters halts.
We can consider this program:
void D() {
if (H(x) == true) {
while (true);
}
else {
return;
}
}
and try substituting code(D) for the free variable x. But here the problem arises. If we do that, we have changed D itself, which means that it isn't code(D) we are substituting into D in the first place! Let me know if the following fix succeeds:
We consider two programs:
void A() {
if (H(code(B)) == true) {
while (true);
}
else {
return;
}
}
void B() {
A();
}
A() either halts or it does not.
- It halts, which means that
B()halts, which means thatH(code(B))returns true. But by the definition ofA, this means thatA()does not halt (as before, we fall into the first branch). - It does not halt, which means that
B()does not halt, which means thatH(code(B))returns false. But by the definition ofA, this means thatA()halts (as before, we fall into the second branch).
I do not really doubt the correctness of these results, but I am concerned about their possible lack of rigor. I would appreciate feedback about their correctness and formality, and also an explanation why the mathematical version of the Halting Problem is much more complicated. Thank you. :)
P.S. This is a response to spaceisdarkgreen's excellent answer.
I tried making his suggestions concrete. So, consider the following function:
void DHasCodeD() {
if (H(code(D), code(D)) == true) {
while (true);
}
else {
return;
}
}
The halting problem now transfers over to the program D(x). This answers my initial question: supposing that H(x, y) exists, we can also derive a contradiction from a zero-parameter function.
However, what's bugging me here is that we are using a halting oracle with two parameters. We have not shown that H(x) does not exist. After all, what if H(x) is weaker than H(x, y)?
We try the following:
DZero() {
if (H(code(DHasCodeD)) == true) {
while (true);
}
else {
return;
}
}
The proof by contradiction transfers again. But in my mind, this proof is problematic because it presupposes the existence of a halting oracle with two parameters. To be convinced, I would have to see a proof by contradiction involving only H(x).
First of all we let $halt_1:S_1 \times \Sigma^* \rightarrow \{0,1\}$ denote the halting function for programs with one input. Here $S_1$ is a suitable subset of $\Sigma^*$ denoting all syntactically correct programs (with one input). So given any two strings $s \in S_1$ and $t \in \Sigma^*$ we say that $halt_1(s,t)=1$ iff the program with code $s$ halts when given the input $t$. We say that $halt_1(s,t)=0$ iff the program with code $s$ does not halt when given the input $t$.
The first point is that there is no program that takes on two input strings and computes the function $Halt_1:\Sigma^* \times \Sigma^* \rightarrow \{0,1,2\}$. In other words:
Here we have $Halt_1(i,j)=halt_1(i,j)$ for all $i \in S_1$ and $j \in \Sigma^*$. However, when $i \notin S_1$ and $j \in \Sigma^*$ we have $Halt_1(i,j)=2$.
Now using the result in previous paragraph we can show several more results. First of all consider the function $halt_0:S_0 \rightarrow \{0,1\}$. Here $S_0$ is a suitable subset of $\Sigma^*$ denoting all syntactically correct programs with no input. So given a string $s \in S_0$ we say that $halt_0(s)=1$ iff the program with code $s$ halts. We say that $halt_0(s)=0$ iff the program with code $s$ does not halt.
Now consider a function $Halt_0:\Sigma^* \rightarrow \{0,1,2\}$. Here we have $Halt_0(i)=halt_0(i)$ for all $i \in S_0$. However, when $i \notin S_0$ we have $Halt_1(i)=2$.
Now using the fact that $Halt_1:\Sigma^* \times \Sigma^* \rightarrow \{0,1,2\}$ is not a (total) computable function we can show that $Halt_0:\Sigma^* \rightarrow \{0,1,2\}$ is not a (total) computable function. This is shown by the usual "reduction". We first suppose that $Halt_0:\Sigma^* \rightarrow \{0,1,2\}$ was a (total) computable function. Then there exists a program that computes $Halt_0:\Sigma^* \rightarrow \{0,1,2\}$. Using the existence of program in previous sentence, we can show that there exists a program that computes the function $Halt_1:\Sigma^* \times \Sigma^* \rightarrow \{0,1,2\}$. However, that would mean that $Halt_1:\Sigma^* \times \Sigma^* \rightarrow \{0,1,2\}$ is computable (which was mentioned above to be false). Therefore our assumption of $Halt_0:\Sigma^* \rightarrow \{0,1,2\}$ being a computable function was incorrect.
At the risk of confusing things a bit further, I think there is one more thing to mention. It would happen often that one would not see a distinction between the sets $S_0$ and $S_1$. Instead a single set $S$ will be used. In the latter case, a single program can be thought of as calculating 0-ary,1-ary (2-ary, n-ary etc.) partial computable functions.
Whether such a distinction between $S_0$ and $S_1$ is necessary or not depends on the specific definition of computational model (more specifically, conventions regarding input variables).