Conditional probability is usually defined via:
$$P(A|B) := \frac{P(A\cap B)}{P(B)}$$
That's fine. $P(A|B)$ is then given the rather tendentious name of "the probability of $A$ given $B$". That in itself is also fine; it's just a name. But then — in the texts I have read, at least — $P(A|B)$ may actually be directly calculated in accordance with this name — that is, one assumes that $B$ has occurred, and then derives the probability that $A$ also occurs.
This to me seems like a leap which requires justification. Those two concepts are not the same, and yet they are being equated for the purposes of a substantive calculation.
Let me give the following basic example: An urn contains three red balls and two black balls. Two balls are sampled from the urn, without replacement. What is the probability that both balls are red?
This would typically be formalised with the sample space $\Omega = \left\{ (\text{ball }1, \text{ball } 2)\right \}$.
The requisite probability could then be calculated by counting permutations as $\frac{3\cdot 2}{ 5\cdot4}$.
But you could also calculate it more elegantly via conditional probabilities; one rearranges the conditional probability formula as
$$P(\text{ball 1 is red}\cap \text{ball 2 is red}) = P(\text{ball 2 is red}|\text{ball 1 is red})\cdot P(\text{ball 1 is red})$$
The latter term in the product is easily seen to be $\frac35$.
Crucially, the former term may also be easily determined if it is interpreted as "the probability that the second ball is red given that the first ball was red" — it is $\frac24$ (again giving $\frac{3\cdot 2}{5\cdot4}$ as the correct answer).
My problem is that this did not actually calculate conditional probability as defined by the formula — indeed, it can't have, because the entire purpose was to calculate $P(A\cap B)$, which is required to evaluate said formula. It instead used the fact that the outcome of this formula — that is, the conditional probability $P(A|B)$ — can instead be calculated by "pretending" that event $B$ has occurred, and then calculating the probability that $A$ also occurs.
My question is: surely this requires demonstration? If so, how is it demonstrated?
I note that this example was about classical probability. Does it also require separate demonstrations for frequentist or subjective probability? I feel like they are different, because I couldn't think of an example of a frequentist or subjective probability question where the equivalence of these two concepts actually has any substantive application, as it did in the example above.
I think I need some general clarification.
Apologies if the question is too vague. I will do my best to clarify if that is the case.
Following your sample space setup, there are $20$ outcomes (each an ordered pair): $$ S = \{(a, b)|a, b \in \{R_1, R_2, R_3, B_1, B_2\}, a \neq b\}$$ Simply take the power set as the sigma-algebra, and assign probability $1/20$ to each singleton.
Now, if you want to write everything out, the event that the first ball is red is $$ A = \{(a, b)|a \in \{R_1, R_2, R_3\}, b \in \{R_1, R_2, R_3, B_1, B_2\}, a \neq b\}$$
and the event that the second ball is red is
$$ B = \{(a, b)|a \in \{R_1, R_2, R_3, B_1, B_2\}, b \in \{R_1, R_2, R_3\}, a \neq b\}$$
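You can then proceed with the definition. Counting gives $|A| = 12$ and $|A \cap B| = 6$, so
$$P(B|A) = \frac{P(A\cap B)}{P(A)} = \frac{6/20}{12/20} = \frac12,$$
which agrees with the $\frac24$ obtained by assuming that the first ball was red.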
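As a sanity check, the sample space above is small enough to enumerate by brute force. The following is only a sketch: the string labels `"R1"`, ..., `"B2"` and the helper `prob` are names I made up for illustration. It computes $P(B|A)$ twice, once from the defining ratio and once by restricting attention to the outcomes in $A$ and counting.

```python
from fractions import Fraction
from itertools import permutations

# Hypothetical labels for the three red and two black balls.
balls = ["R1", "R2", "R3", "B1", "B2"]

# Sample space: all ordered pairs of distinct balls, each with probability 1/20.
S = list(permutations(balls, 2))

def prob(event):
    """Uniform probability of an event (a set of outcomes) in S."""
    return Fraction(len(event), len(S))

# A: first ball is red.  B: second ball is red.
A = {s for s in S if s[0].startswith("R")}
B = {s for s in S if s[1].startswith("R")}

# Conditional probability straight from the definition P(B|A) = P(A ∩ B) / P(A).
p_A = prob(A)
p_A_and_B = prob(A & B)
p_B_given_A = p_A_and_B / p_A

# "Pretend A has occurred": treat A as the new sample space and count.
restricted = Fraction(len(A & B), len(A))

print(p_A, p_A_and_B, p_B_given_A, restricted)  # 3/5 3/10 1/2 1/2
```

The two numbers agree because the factor $1/|S|$ cancels in the ratio: with equally likely outcomes, $P(A\cap B)/P(A) = |A\cap B|/|A|$, which is exactly the count within the restricted space.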