Generally, a theorem shows that you can derive extra information from some starting information in a nonobvious way, at least for the theorems we bother learning in textbooks. Put another way: I usually expect a theorem's utility to be that it lets you calculate something, or prove another theorem, that would otherwise be difficult.
So I'm confused why Bayes' theorem is a big deal, because AFAICT it tells you nothing the definition of conditional probability didn't already tell you, and it requires exactly the same information to be available in order to apply it. The typical definition of conditional probability is:
$$ \mathbb{P}(A \mid B) = \frac{\mathbb{P}(AB)}{\mathbb{P}(B)} $$
whereas Bayes' theorem is:
$$ \mathbb{P}(A \mid B) = \frac{\mathbb{P}(B \mid A)\mathbb{P}(A)}{\mathbb{P}(B)} $$
If you start with the first equation, you can derive the second just by applying the definition once to the numerator. Either way, you still need to know either the intersection probability or the two values from which the definition makes it trivial to calculate (the conditional probability times the prior).
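Spelling out that one substitution, with the same symbols as above: applying the definition with the roles of $A$ and $B$ swapped gives $\mathbb{P}(AB) = \mathbb{P}(B \mid A)\,\mathbb{P}(A)$, and substituting into the numerator yields

$$ \mathbb{P}(A \mid B) = \frac{\mathbb{P}(AB)}{\mathbb{P}(B)} = \frac{\mathbb{P}(B \mid A)\,\mathbb{P}(A)}{\mathbb{P}(B)}. $$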
Digging around on Wikipedia, I found this excerpt from Bayes' original essay:
> He states: "If there be two subsequent events, the probability of the second $b/N$ and the probability of both together $P/N$, and it being first discovered that the second event has also happened, from hence I guess that the first event has also happened, the probability I am right is $P/b$." Symbolically, this implies (see Stigler 1982): $$ \mathbb{P}(A \mid B) = \frac{\mathbb{P}(AB)}{\mathbb{P}(B)} $$
So is Bayes' real contribution just the definition of conditional probability? If so, why does everyone focus on Bayes' theorem?
I agree with you that Bayes' theorem is indeed a pretty trivial consequence of the basic facts that
$P(A \cap B) = P(A) \cdot P(B \mid A)$
and
$P(A \cap B) = P(B \cap A) = P(B) \cdot P(A \mid B)$.
Indeed, I don't remember Bayes' law by rote memory, but I do know how to quickly derive it from these two facts whenever I need it.
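To make the point concrete, here is a quick numeric sanity check (the probabilities are my own made-up toy values, not from the post): on a small joint distribution, the definition of conditional probability and Bayes' theorem produce the same answer, exactly as the derivation says they must.

```python
# Toy joint distribution for two events A and B (hypothetical values).
p_ab = 0.12   # P(A and B)
p_a = 0.30    # P(A)
p_b = 0.40    # P(B)

# Definition of conditional probability: P(A|B) = P(A and B) / P(B)
via_definition = p_ab / p_b

# Bayes' theorem: P(A|B) = P(B|A) * P(A) / P(B),
# where P(B|A) itself comes from the same definition.
p_b_given_a = p_ab / p_a
via_bayes = p_b_given_a * p_a / p_b

print(via_definition)
print(via_bayes)
```

Both computations need the same inputs (the joint probability, or the reverse conditional together with the prior), which is exactly the question's observation.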