Below is an explicit construction of a fixed point the existence of which is guaranteed by Kleene's fixed point theorem. I was wondering if there's any intuitive explanation of why the fixed point should be what it is (namely, $[s](r)$)? This construction looks like a magic to me (I don't think I would ever be able to come up with this construction on my own), so I wanted to see what intuition lies behind this construction, if any. If there's no intuition behind the initial stages of the proof, then at least suppose I've arrived at $(2)$. Is there any reason why I should think that exactly $[s](r)$ should be the fixed point? Or can this knowledge only be obtained by looking at all the formulas that I have for a long time and trying different options?
Kleene's fixed point theorem: If $H:N\to N$ is a total computable function and $U$ is a Godel universal computable function (defined here), then there is $n$ such that $U(n,x)=U(H(n),x)$.
Here's an explicit construction of a fixed point.
Convention: capital letters denote computable functions, and the corresponding lowercase letters denote the numbers of programs that compute those functions. Also, $[p](x)$ (alternatively, $U(p,x)$) denotes the result of application of program number $p$ on input $x$.
Consider $F(x)=U(x,x)$. This is a computable partial function. Define $V(n,x)=U(F(n),x)$, which is also a partial computable function. We can write $$V(n,x)=[[n](n)](x)$$
Now consider the total computable function $S:N\to N$ such that $U((S(n),x)=V(n,x)$ (which exists because $U$ is Godel universal). We can write this equality as $$V(n,x)=[[s](n)](x)$$
Thus $$\tag{1} [[s](n)](x)=[[n](n)](x)$$
Now consider the total computable $H\circ S$, say the composition is computable by program number $r$. Then $$\tag{2}[r](x)=[h]([s](x))$$
Now $[s](r)$ should be the fixed point. To prove this, we need to prove that $$U([s](r),x)=U([h]([s](r)),x)$$ or equivalently $$ \tag{3}[[s](r)](x)=[[h]([s](r))](x)$$
First, we have $[[h]([s](r))](x)=[[r](r)](x)$ from $(2)$, and then $[[r](r)](x)=[[s](r)](x)$ from $(1)$. This proves $(3)$.
One intuition that would be helpful for coming up with this sort of thing yourself is that these fixed point constructions are related to diagonalization. A very general case of this is Lawvere's fixed point theorem, but that is even essentially just using a very general setting to interpret the sort of fixed point function one would define in (untyped) lambda calculus:
$$λf. (λx. f(x\ x))(λx. f(x\ x))$$
Applied to a given $f$, this satisfies the equivalence:
$$(λx. f(x\ x)) (λx. f(x\ x)) \\ = \\ f ((λx. f(x\ x))(λx. f(x\ x))) $$
So is a fixed point of $f$. This involves two cases of self-application: $x$ is applied to itself and the subterm $λx. f(x\ x)$ is applied to itself. This double use of self-application is the essence of diagonal arguments.
The fixed point theorem you cite is doing something very similar, but it needs to talk about mediating between functions and their encoding as natural numbers. If we ignore some of that, we get:
$$ f\ x = x\ x \\ v\ n\ x = f\ n\ x = n\ n\ x \\ s\ n\ x = v\ n\ x = n\ n\ x \\ r\ n = h\ (s\ n) = h (λx. n\ n\ x) $$
So, at this point, $r$ should look very similar to $λn. h(n\ n)$, which is one of the self-applications above. To get the fixed point the usual way, $r$ needs to be applied to itself, which is the reason you should suspect $[s](r)$, as $[s]$ is the self-application function. It is essentially:
$$ s\ r = r\ r = (λn. h(λy. n\ n\ y))\ (λn. h(λy. n\ n\ y)) $$
which is almost the term I wrote above, modulo $f = λx. f\ x$ (the difference in this case has to do with the encoding, I think).
Often these kind of diagonal arguments are used to deduce contradictions. For instance, Russell's paradox and the proof of undecidability of the halting problem carry out this sort of diagonalization to produce a fixed point of logical negation. Since logical negation doesn't have a fixed point, you deduce that the assumptions used to construct the fixed point are false. However, in the case of total computable functions, the premises for carrying out the diagonalization are true, and enable you to construct fixed points that do exist.