Consider some Hilbert-type formal system of predicate calculus. I will use the one from Kleene's "Introduction to Metamathematics" (1971 edition). When developing predicate calculus in this style, we list a set of axioms (or axiom schemata, depending on your choice) and a set of inference rules. Then the concepts of proof and theorem are defined.
But after thinking about it and being honest with myself, I have to admit that I have no idea why these specific axioms and inference rules (or any equivalent ones) are chosen. I would like to hear your ideas about it, and I will give you all my hypotheses for explanations and why I am unsatisfied with them.
For example, consider the following axiom schema, where $A,B,C$ are metamathematical variables standing for formulas.
$$ 1. (A \Rightarrow B) \Rightarrow ((A \Rightarrow (B \Rightarrow C)) \Rightarrow (A \Rightarrow C)) $$
Also consider the following inference rules, where $C$ denotes a formula not containing the variable $x$ free and $A$ denotes any formula:
$$2. \textrm{If } C \Rightarrow A(x) \textrm{ then } C \Rightarrow \forall x(A(x)) $$
$$3. \textrm{If } A(x) \Rightarrow C \textrm{ then } \exists xA(x) \Rightarrow C $$
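As a sanity check (my own brute-force sketch, not from Kleene), both rules can at least be verified to be sound over a small finite domain: I let $C$ be a closed formula (just a truth value here) and let $A$ range over all unary predicates on a two-element domain.

```python
from itertools import product

# Brute-force soundness check of rules 2 and 3 over the domain {0, 1}.
# C is closed (a bare truth value); A ranges over all unary predicates.
domain = [0, 1]
ok = True
for C in (False, True):
    for table in product((False, True), repeat=len(domain)):
        A = dict(zip(domain, table))
        # Rule 2: from "C => A(x) valid for every x" infer "C => forall x A(x)".
        prem2 = all((not C) or A[x] for x in domain)
        conc2 = (not C) or all(A[x] for x in domain)
        # Rule 3: from "A(x) => C valid for every x" infer "exists x A(x) => C".
        prem3 = all((not A[x]) or C for x in domain)
        conc3 = (not any(A[x] for x in domain)) or C
        if (prem2 and not conc2) or (prem3 and not conc3):
            ok = False
print(ok)  # True: no countermodel in this (tiny) search space
```

Of course this only covers one finite domain, which is exactly the limitation I worry about in Hypothesis 3 below.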
Hypothesis 1: We take these and other axioms / axiom schemata / inference rules just because they work. For example, consistency can be shown, and most of the "informal" theorems in mathematics can be reconstructed formally in this kind of predicate calculus. But then how would one come up with them? It seems practically impossible to arrive at one of the equivalent sets of axioms / inference rules just by wanting them to be consistent and able to formalize most of "informal" mathematics. I think there should be some kind of intuition or interpretation behind them.
Hypothesis 2: Our choice is also based on the fact that the axioms make sense under the intended interpretation. But I have a hard time seeing that. For example, if one considers axiom schema 1, its reading would be "if (if $A$ then $B$), then (if (if $A$ then (if $B$ then $C$)), then (if $A$ then $C$))", which for me is extremely confusing and therefore not readily accessible to intuition. For inference rules 2 and 3 I have no convincing interpretation at all, and I would really appreciate it if you could give an interpretation of them that you yourself find convincing.
Hypothesis 3: One could probably argue on model-theoretic grounds for which formulas we should choose as axiom schemata, but I am not convinced here either, because by finitary means the model theory of predicate calculus can only analyze the cases where the domain of the variables is finite, whereas in practice we usually use theories with an infinite domain. Maybe one can argue that if something holds for finite domains then it is reasonable to assume it holds for infinite domains, but I am not sure. One could then also ask why we choose the truth tables the way we do. Moreover, considering that some logics do not have truth tables but still have axioms / axiom schemata / inference rules, I feel this intuition might not be general enough.
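To make the truth-table point concrete: at least for the propositional part, checking a candidate axiom schema really is a finitary computation. Here is a small sketch (my own) verifying that schema 1 above is a classical tautology by exhausting all $2^3$ truth assignments:

```python
from itertools import product

def implies(a, b):
    # Classical truth table for material implication.
    return (not a) or b

def is_tautology(f, arity):
    # Exhaust all truth assignments -- finitely many, so a finitary check.
    return all(f(*v) for v in product((False, True), repeat=arity))

# Axiom schema 1: (A => B) => ((A => (B => C)) => (A => C)).
schema1 = lambda a, b, c: implies(
    implies(a, b),
    implies(implies(a, implies(b, c)), implies(a, c)))

print(is_tautology(schema1, 3))  # True
```

But, as I said, this kind of check does not extend in any obvious finitary way to the quantifier axioms over infinite domains.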
Hypothesis 4: One could probably also argue that we want certain deductive principles to hold, like the Deduction Theorem. But I have always felt that philosophically these are not necessary, just useful tools; in principle, everything could be developed without them. So one cannot argue from them, because they are not fundamental to the theory in the first place.
Hypothesis 5: It might be that there is an equivalent predicate calculus with axioms and inference rules that are simpler and more intuitive. If that is the case, can someone give some references for what it is and why is it simpler and more intuitive?
I would appreciate your help and advice, and your own philosophical / intuitive thoughts about this. I am especially interested in how you think about these things and how you convince yourself that what you are doing is reasonable.
First of all, we have to consider that modern mathematical logic has many proof systems: Hilbert-style, Natural Deduction, Sequent Calculus, Tableaux, Resolution, Equational.
Hilbert-style was the first one to be "codified", by G. Frege in 1879; it was further developed in Russell and Whitehead's Principia Mathematica (1910-13) and culminated in the first modern mathematical logic textbook: Hilbert and Ackermann's Grundzüge der theoretischen Logik (1928).
In 1935 G. Gentzen developed Natural Deduction and Sequent Calculus, which are rules-only proof systems.
Going back to the Hilbert-style proof system, based on axioms and rules, we have to start from propositional calculus, where there is a large variety of different systems.
The tradition starting from Frege adopted a few axioms and only one rule: Modus Ponens.
The first "variation" comes with the choice of the basic propositional connectives. Frege adopted $\to$ and $\lnot$ while Russell & Whitehead used $\lor$ and $\lnot$.
The approach to the axiomatization of propositional calculus was reminiscent of Euclid's Elements, trying to introduce "self-evident" axioms.
You can read Russell & Whitehead's approach in this sense: avoiding the "tricky" $\to$ (material implication) and adopting the "easier" $\lor$ (disjunction).
This approach was further streamlined by Peter Andrews, with the MP rule and a small number of axiom schemata.
Quite similar is Joseph Shoenfield's system, with only one axiom (schema), $\lnot A \lor A$, the remaining principles recast as rules, plus the Cut Rule in place of MP.
But Frege's original approach was also further refined; see e.g. A. Church's system.
This evolution was driven by different goals:
self-evidence of the axioms chosen;
reducing the number of axioms (and connectives);
"pragmatic" needs: adopting the axioms necessary to prove the Deduction Theorem and the Soundness and Completeness theorems as early as possible.
The second driver produced the single-axiom, single-connective versions, at the expense of self-evidence (and sometimes intelligibility).
But the reductionist approach is useful for proving meta-theorems: these usually require proofs by induction, and a reduced number of connectives and axioms means fewer cases to survey.
The third driver is essential: we want proof systems that are sound and complete, i.e. able to derive exactly the valid formulas (according to the suitable semantics).
From this point of view, the most successful propositional axiom systems (Church's, Mendelson's) usually consist of three axioms and one rule; with the axioms $A \to (B \to A)$ and $(A \to (B \to C)) \to ((A \to B) \to (A \to C))$, which are sufficient for the proof of the DT, they are very practical, compared also to Natural Deduction with its many rules.
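To see the practicality concretely, here is a small sketch (my own, not from Church or Mendelson) of the classic five-line Hilbert-style derivation of $A \to A$ from instances of the two schemata just quoted (often called K and S), with MP as the only rule. The checker takes the axiom instances as given and verifies only the MP steps:

```python
# Formulas: atoms are strings; ('->', p, q) is the implication p -> q.
def imp(p, q):
    return ('->', p, q)

A = 'A'
AA = imp(A, A)

# The classic five-line derivation of A -> A.  Lines labelled 'K' or 'S'
# are axiom instances taken as given; 'MP' lines cite two earlier lines.
proof = [
    ('S',  imp(imp(A, imp(AA, A)), imp(imp(A, AA), AA))),  # S with B := A->A, C := A
    ('K',  imp(A, imp(AA, A))),                            # K with B := A->A
    ('MP', 0, 1),
    ('K',  imp(A, AA)),                                    # K with B := A
    ('MP', 2, 3),
]

def check(proof):
    lines = []
    for step in proof:
        if step[0] == 'MP':
            major, minor = lines[step[1]], lines[step[2]]
            # MP: from p -> q and p, conclude q.
            assert major[0] == '->' and major[1] == minor
            lines.append(major[2])
        else:
            lines.append(step[1])
    return lines[-1]

print(check(proof) == imp(A, A))  # True
```

Five lines just to get $A \to A$ also shows the price paid for the small axiom basis: derivations are short to check but unnatural to find.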
The Natural Deduction approach was transferred to Hilbert-style systems by Bernays and Kleene, trying to merge the benefits of the two: metalogical "practicality" with the more intuitive approach to the connectives, via pairs of introduction and elimination rules (and the corresponding axioms).
In addition, this approach is suitable for Intuitionistic Logic, where the connectives are not inter-definable.
Conclusion (so far): there is a trade-off (and a tension) between pragmatic reasons (your Hypotheses 1 and 4) and "self-evidence".
Very useful: John Corcoran, Three Logical Theories (1969).
The situation for predicate calculus is similar.
First of all, the choice of axioms is based on the previous choice about the propositional part.
Some proof systems choose to stay with only one rule of inference, Modus Ponens (see Enderton), and thus have to add suitable axioms for the quantifiers; and again, in classical logic we have the choice of introducing both quantifiers as basic instead of defining one in terms of the other.
A different choice is to also introduce inference rules for the quantifiers, following Frege's and R&W's systems, which used the Generalization rule.
Again, we have different solutions to the goals discussed above: Kleene's system is again a "Hilbert-style version" of the Natural Deduction approach.
Again, the "minimalist" approach has many possibilities. Shoenfield uses the quantifier axiom: $A[x/a] \to \exists x A$ and the $\exists$-Introduction Rule : "If $x$ is not free in $B$, infer $\exists x A \to B$ from $A \to B$ (see also Tourlakis).
Here, IMO, a new "tension" is in place. The axiom is quite natural (as are the Universal Instantiation axiom and rule), but the introduction rule is not.
The Generalization rule is more intuitive, but it is tricky, and we have problems in its interaction with the Deduction Theorem.
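A standard illustration of that trickiness (my example, not from a particular textbook): with unrestricted Generalization, from the premise $x = 0$ we can infer $\forall x\,(x = 0)$; an unrestricted Deduction Theorem would then yield

$$ \vdash x = 0 \to \forall x\,(x = 0), $$

which is false in every model with at least two elements. This is why Hilbert-style systems with Generalization must state the Deduction Theorem with a side condition on the free variables generalized upon.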
Conclusion (provisional): also for the quantifiers, the Natural Deduction approach, with a pair of introduction and elimination rules for each of them, is quite... "natural".
If we adopt a Hilbert-style approach, the basic trade-off is between "naturalness" (translating ND into axioms + rules) and paucity.
Obviously, while always ensuring that the system has the fundamental properties of Soundness and Completeness.