In his axiomization of the positive fragment of Classical Logic, Hilbert included the axiom
$(B \rightarrow A) \rightarrow [(C \rightarrow A) \rightarrow (((B \rightarrow C) \rightarrow A) \rightarrow A)]$
along with the usual conjunction and disjunction axioms, Weakening and Self Distribution. All of those are good. But this monster is about the most opaque axiom I have seen. It has tantalizing hints of Pierce's Law $((A \rightarrow B) \rightarrow A) \rightarrow A$ in it, and clearly a variant of it can be obtained using Pierce's Law and Weakening. It is also used in his axiomization of the implication fragment of Classical Logic. If I could extract Pierce's Law, I could go to town. I can get a strong version of Chain, the Extended Law of Excluded Middle, and all of the distribution laws from Pierce's Law.
But this thing, I haven't gotten anything out of it. The best that I have gotten is $((A \rightarrow A) \rightarrow A) \rightarrow A)$ which doesn't seem to go anywhere. Nor, given everything else, have I been able to prove this axiom. I don't even have a name for it. The equivalence of $A \rightarrow B$ with ${\neg}A \lor B$ is not available in the positive fragment of Classical Logic since negation is not available. Additionally, since it also in the implication fragment, it must be intended to obtain some set of implication theses, neither disjunction, nor conjunction are available there. So, Hilbert must have intended the other implication axioms to be used which are just Weakening and Self Distribution. There is no clue as to what he expected to get out of it (unlike Pierce's Law where Pierce explicitly said it was to get the Law of Excluded Middle -- which, in restricted systems, it does not do, but does do so much else).
I have tried using Self Distribution, Importation, Permutation, Assertion, all sorts of substitutions, etc., but haven't a bit of luck in either proving it or using it. I have verified that it satisfies a truth table model to make sure that it has been transcribed correctly. Nor have I found anything on the web. I found this is in the book "Propositional Logics" 3rd ed. by Richard L. Epstein, pages 83 and 85, questions pages 85-86. He only refers to it in a single question, which is whether or not it can be replaced by Pierce's law for the implication fragment, but then he also says to prove that adding Pierce's Law is necessary for that axiomization is strongly complete. Those questions together suggest that it cannot be replaced by Pierce's Law.
Does anyone have a clue as to how to either prove this monster, or how to use it in a Hilbert / Frege type of system (which, of course, what he would have been using)? What in the world is this axiom intended to do? I do not believe that Hilbert put in there just to confound everyone in the future.
Thanks.
Edit: With a bit of fiddling, I see a possible path forward. What follows is not a formal proof of anything, but shows a possible direction with a bit of hand waving. Please overlook any mismatched parentheses here or there. Those are just typos, I tried to avoid them, but pasting things back and forth may have caused a slip.
So, start with the original equation.
$(B \rightarrow A) \rightarrow [(C \rightarrow A) \rightarrow (((B \rightarrow C) \rightarrow A) \rightarrow A)]$
Now, as a starting point, make the substitution $B \leftarrow A$
$(A \rightarrow A) \rightarrow [(C \rightarrow A) \rightarrow (((A \rightarrow C) \rightarrow A) \rightarrow A)]$
We know that we have $A \rightarrow A$ by using Ax 1 and Ax 2, so use detachment (Rl 1).
$(C \rightarrow A) \rightarrow (((A \rightarrow C) \rightarrow A) \rightarrow A)$
Now use Self Distribution (Ax 2).
$((C \rightarrow A) \rightarrow ((A \rightarrow C) \rightarrow A)) \rightarrow ((C \rightarrow A) \rightarrow A)$
Just go mad with Self Distribution (the nested occurrences will make for a very painful proof, but should work).
$((((C \rightarrow A) \rightarrow A) \rightarrow ((C \rightarrow A) \rightarrow C)) \rightarrow ((C \rightarrow A) \rightarrow A)) \rightarrow ((C \rightarrow A) \rightarrow A)$
This is a mess, so lets define a local constant $R := (C \rightarrow A)$
$[((R \rightarrow A) \rightarrow (R \rightarrow C)) \rightarrow (R \rightarrow A)] \rightarrow (R \rightarrow A)$
This is starting to look familiar, so lets define a couple of additional constants $X := (R \rightarrow A)$ and $Y := (R \rightarrow C)$
$[(X \rightarrow Y) \rightarrow X] \rightarrow X$
This is clearly Pierce's Law. The problem is that $X$ and $Y$ are expressions. But, using a model checking tool, I checked that
$X = ((C \rightarrow A) \rightarrow A) \leftrightarrow A$
$Y = ((C \rightarrow A) \rightarrow C) \leftrightarrow C$
| (C -> A) | -> | A |
+------------------+----+-----+
| F T F | F | F |
| | | |
| T F F | T | F |
| | | |
| F T T | T | T |
| | | |
| T T T | T | T |
+------------------+----+-----+
| (C -> A) | -> | C |
+------------------+----+-----+
| F T F | F | F |
| | | |
| T F F | T | T |
| | | |
| F T T | F | F |
| | | |
| T T T | T | T |
+------------------+----+-----+
It is almost true for $A$, but is true for $C$. We can easily show one direction.
$A \rightarrow ((C \rightarrow A) \rightarrow A)$
$C \rightarrow ((C \rightarrow A) \rightarrow C)$
If we can get a proof in the other direction for at least $C$, then we have equivalence, and can substitute $Y \leftarrow C$ giving
$[(X \rightarrow C) \rightarrow X] \rightarrow X$
which is just Pierce's Law with arbitrary $C$, but flawed for $A$. I don't have an immediate solution for that, but at least it is a direction to examine.
This is close. It is possible that the difference between $A$ doesn't actually make a difference or that something else can be done. Time to go to bed and let my subconscious work on it.
Edit 2: By request, here are images of the four pages in Epstein's book which are relevant.




I am really quite busy right now, but I do happen to know the solution, so this is going to be a "drive-by" proof. Apologies if something won't make sense, or if I make a serious typo, but I probably won't have time to correct/clarify it until next week.
First, observe that $((a \rightarrow b) \rightarrow a) \rightarrow (a \rightarrow a \rightarrow b) \rightarrow a$ is an intuitionistic tautology, so you can somehow derive it from the first two axioms. I would just translate a proof from another system.
Now, we have:
This gives you Peirce's law, so you can go to town!