I am working through the book Statistical Inference by Casella and Berger. While I understand that most of probability theory is done heuristically, on a first pass through the material I like to be more formal just to get an idea of things.
In one of their examples on conditional probability, they discuss the three prisoners problem. It's supposed to show how conditional probabilities can be miscalculated, but I'm having trouble understanding and formalizing the correct version as opposed to the misguided one.
I put the pages below for reference.
In order to highlight my confusion, I'll introduce my 'naive' attempt to work through it. On a first pass, one could model this problem by considering the sample space $S$ to be $\{a, b, c\}$. Then $F$ is the power set, and $p$ is just the discrete probability function. Here, an outcome $e \in S$ would be interpreted as 'prisoner $e$ was pardoned'.
Now, with prisoner $e \in S$, we could determine various events. Let $L_e$ denote the event that $e$ is picked to be pardoned and live, and let $D_e$ denote the event that prisoner $e$ dies. Then with this formalism, $L_e = \{e\}$, and $D_e = S - \{e\} = \{f, g\}$ where $f$ and $g$ are the other generic prisoners.
If we let $A$ denote the event that prisoner $a$ lives, well then of course $A = L_a = \{a\}$ and $p(A) = 1/3$. Let $W$ denote the event that 'the warden says that $b$ dies'.
Now, one of the pitfalls the authors warn against is assuming that $W = D_b = \{a,c\}$. If one assumes this, then one ends up with the erroneous calculations mentioned in the text. I can't think of a way to use the same sample space to model the 'correct' way to do this. Part of the posing of the problem says that the warden is telling this information to prisoner $a$. How can one modify the space to reflect this intuition? I'm also not exactly sure how they get the calculation that $p(W \cap A) = 1/6$. How would $p(W \cap A)$ be different from $p(W \cap C)$? Thanks in advance for any assistance.


This problem is essentially the same as the well-known "Monty Hall problem". There are many explanations online under that name that you may find helpful in understanding what's going on here.
To determine the probabilities properly, it is important to define precisely how the warden behaves under the different possible circumstances. For example, if the warden's normal behavior is to say anything only to people who are about to die, then once he speaks to A, prisoner A's chances of being pardoned are not 1/3 but 0. Or suppose the warden's normal behavior is to always tell A the fate of B, whatever that may be. Then, once he says that B will die, A's chances are 1/2.
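To see how much the answer depends on the warden's behavior, here is a minimal sketch that applies Bayes' rule with exact fractions. It assumes each prisoner is pardoned with prior probability 1/3; the function name `posterior_a_pardoned` and the `likelihood` dictionaries are my own illustration, not anything from the book.

```python
from fractions import Fraction

PRISONERS = ("A", "B", "C")


def posterior_a_pardoned(likelihood):
    """Return P(A pardoned | what the warden was observed to do).

    `likelihood[x]` is the assumed probability that the warden behaves
    as observed, given that prisoner x is the one pardoned.
    """
    prior = Fraction(1, 3)  # assumption: each prisoner equally likely to be pardoned
    joint = {x: prior * Fraction(likelihood[x]) for x in PRISONERS}
    return joint["A"] / sum(joint.values())


# Warden only ever speaks to prisoners who will die; observed: he speaks to A.
speaks_only_to_dying = {"A": 0, "B": 1, "C": 1}
print(posterior_a_pardoned(speaks_only_to_dying))  # 0

# Warden always reports B's fate; observed: he says "B will die".
always_reports_b = {"A": 1, "B": 0, "C": 1}
print(posterior_a_pardoned(always_reports_b))  # 1/2
```

Only the likelihoods change between the two cases, yet the posterior for A moves from 0 to 1/2, which is why the warden's rule has to be pinned down before any calculation.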
What is happening here is that the warden is being asked to tell A the name of one prisoner other than A who will die. He obliges, and his behavior is as follows. If A and B are going to die, the warden says "B will die". If A and C are going to die, the warden says "C will die". If B and C are going to die (that is, A is pardoned), the warden picks one of them at random, each with probability 1/2, and says that prisoner will die.
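One way to formalize this, offered as a sketch in my own notation rather than the book's: enlarge the sample space so that an outcome records both who is pardoned and whom the warden names. Write $(x, y)$ for "prisoner $x$ is pardoned and the warden says $y$ will die". Assuming each prisoner is pardoned with probability $1/3$ and the warden follows the rule above,

$$S' = \{(a,b),\ (a,c),\ (b,c),\ (c,b)\}, \qquad p(a,b) = p(a,c) = \tfrac{1}{6}, \quad p(b,c) = p(c,b) = \tfrac{1}{3}.$$

In this space the events are $A = \{(a,b), (a,c)\}$ and $W = \{(a,b), (c,b)\}$, and, taking $C$ to mean "prisoner $c$ is pardoned", $C = \{(c,b)\}$. Then

$$p(W \cap A) = p(a,b) = \tfrac{1}{6}, \qquad p(W \cap C) = p(c,b) = \tfrac{1}{3}, \qquad p(W) = \tfrac{1}{6} + \tfrac{1}{3} = \tfrac{1}{2},$$

$$p(A \mid W) = \frac{1/6}{1/2} = \frac{1}{3}, \qquad p(C \mid W) = \frac{1/3}{1/2} = \frac{2}{3}.$$

The asymmetry between $p(W \cap A)$ and $p(W \cap C)$ comes from the warden's choice: when $a$ is pardoned he names $b$ only half the time, whereas when $c$ is pardoned he has to name $b$. The original space $\{a, b, c\}$ cannot express this, because $W$ is not a subset of it.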