The definition of conditional probability states that $$P(A \mid B)=\frac{P(A \cap B)}{P(B)}$$
Bayes' theorem, on the other hand, tells us that the probability that a prior condition $B$ holds, given that an event $A$ has taken place, is
$$P(B \mid A)= \frac{P(B \cap A)}{P(A\mid B)P(B)+P(A\mid B^C)P(B^C)}.$$
However, what I'm unable to understand is why it's not simply $$\frac{P(B \cap A)}{P(A)}$$
Everything you write is equivalent. Bayes' theorem can be proved as follows: $$ P(B\mid A)=\frac{P(A\cap B)}{P(A)}=\frac{P(A\mid B)P(B)}{P(A)}, $$ where in the second equality we used that $$ P(A\mid B)=\frac{P(A\cap B)}{P(B)}\implies P(A\cap B)=P(A\mid B)P(B). $$ The expression you give for Bayes' theorem is just a rewriting of the denominator with the law of total probability: $$ P(A)=P(A\mid B)P(B)+P(A\mid B^C)P(B^C). $$
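If it helps to see the equivalence with numbers, here is a minimal Python sketch. The three input probabilities are made up purely for illustration (they are not from the question); the point is only that dividing by $P(A)$ directly and dividing by the expanded denominator give the same value.

```python
from fractions import Fraction

# Hypothetical inputs, chosen only for illustration.
p_b = Fraction(1, 100)             # P(B)
p_a_given_b = Fraction(9, 10)      # P(A | B)
p_a_given_not_b = Fraction(1, 20)  # P(A | B^C)

p_not_b = 1 - p_b                  # P(B^C)
p_a_and_b = p_a_given_b * p_b      # P(A ∩ B) = P(A | B) P(B)

# Law of total probability: P(A) = P(A|B)P(B) + P(A|B^C)P(B^C)
p_a = p_a_given_b * p_b + p_a_given_not_b * p_not_b

# "Simple" form: P(B | A) = P(A ∩ B) / P(A)
direct = p_a_and_b / p_a

# "Expanded" form: the same numerator over the expanded denominator
expanded = p_a_and_b / (p_a_given_b * p_b + p_a_given_not_b * p_not_b)

print(direct, expanded)  # both print 2/13
```

The expanded form is the one usually quoted because $P(A)$ is often not given directly, while $P(A\mid B)$, $P(A\mid B^C)$ and $P(B)$ are.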