In Kleene, Introduction to Metamathematics, $\S42$ (end), it is shown that if formal number theory is simply consistent, then its consistency cannot be proven formally within the system. In other words, any consistency proof of the system must inherently have a step or steps that are non-elementary and not formalizable.
Later, in $\S79$, we are given a consistency proof of Robinson's system--formal number theory without induction and with a couple equality axioms added. This is done in a Gentzen type system through his "Hauptsatz" (Normal Form) theorem.
Is this method through Gentzen type systems and the "Hauptsatz" theorem non-elementary?
It should be, right? Later in the book, it marks when a theorem is non-elementary but it does not do so with this theorem.
In the same section, a brief heuristic account for Gentzen's proof of the consistency of classical number theory with transfinite induction to $\epsilon_0$ is given. There is a quote saying
"...the reasonings used in Gentzen's consistency proof other than this transfinite induction are of sorts that are formalizable in the systems." (479)
Is that implying that "Hauptsatz" and Gentzen type systems are formalizable? Or does that mean that Gödel's second theorem does not apply to Robinson's system?
I have no reason to think that the latter is the case.
Everything here is formalizable (although formalizing "induction along $\epsilon_0$" in the language of PA actually takes a bit of thought); the issue is what formal system is being used. For example, the proof of the consistency of Robinson's Q is not a proof in Q but rather a proof in a stronger system.
The proof of Godel's incompleteness theorems is also formalizable, and in fact goes through in PA (or indeed much less). Specifically:
In particular, PA proves "If PA is consistent, then PA is incomplete." This does not contradict the consistency of PA, since that hypothesis is part of the statement being proved.
So there isn't any unformalizability going on here. Rather, what we have is a class of situations where we need to distinguish between provability in different systems: there are many different systems we're interested in, and various facts about some are proved in others.