The following is Problem 34 (page 198) of Bertsekas' Introduction to Probability, 2nd edition:
A defective coin minting machine produces coins whose probability of heads is a random variable $P$ with PDF:
$$f_P(p)= \begin{cases} pe^p & 0\le p\leq 1 \\ 0 & \text{otherwise} \end{cases} $$ A coin produced by this machine is selected and tossed repeatedly, with successive tosses assumed independent.
(a) Find the probability that a coin toss results in heads.
(b) Given that a coin toss resulted in heads, find the conditional PDF of $P$.
(c) Given that a first coin toss resulted in heads, find the conditional probability of heads on the next toss.
The statement of this problem is unclear to me. For example, what does he mean by "whose probability of heads is a random variable $P$"? My goal is not just to solve it (actually there is a solution here) but also to understand this problem in a rigorous way. So my question is: is there a way, using measure theory or axiomatic probability theory, to interpret this problem?
TL;DR Consider an uncountable set of people whose names $N_p$ are indexed by what they think the probability of heads is, i.e. person $N_p$ claims a probability of $p$, with a probability of being right (or credibility) $pe^p$. Then we have something like
$$\mathbb P(H) = \int_0^1 \mathbb P(\text{Heads} \mid N_p \ \text{is right})\, \mathbb P(N_p \ \text{is right})\, dp$$
$$= \int_0^1 \mathbb P(\text{Heads} \mid N_p \ \text{is right}) \cdot (\text{credibility of} \ N_p)\, dp$$
Let's say Alice tells you the probability of heads is $\frac13$ but Bob tells you the probability of heads is $\frac12$. If Alice is right one fourth of the time and Bob is right the rest of the time, then we 'expect' the probability of heads to be
$$\frac13\cdot\frac14 + \frac12\cdot\frac34 = \frac{11}{24}$$
This is based on the law of total probability (see here; following Wiki notation, the $B_n$'s can then be 'person $n$ is right'):
$$\mathbb P(H) = \mathbb P(H| \text{Alice is right})\mathbb P(\text{Alice is right}) + \mathbb P(H| \text{Bob is right})\mathbb P(\text{Bob is right})$$
Observe that we computed an expectation: what we 'expect' the probability to be is influenced by the credibilities of Alice and Bob. For example, if Alice has no credibility (I'm assuming the credibilities sum to $1$), there's nothing more to 'expect': the probability is Bob's claim, $\frac12$.
So while we still won't know what the next coin toss will be, we know that if we're playing a game where we get $+1$ for heads and $-1$ for tails, then we expect to get $(+1)(\frac12) + (-1)(\frac12) = 0$ on average, rather than being unsure whether we get $(+1)(\frac12) + (-1)(\frac12) = 0$ or $(+1)(\frac13) + (-1)(\frac23) = -\frac13$.
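The Alice/Bob arithmetic above is easy to check directly. The numbers below are the hypothetical inputs from the example (claims $\frac13$ and $\frac12$, credibilities $\frac14$ and $\frac34$):

```python
# Hypothetical inputs from the Alice/Bob example above.
p_alice, p_bob = 1/3, 1/2   # claimed probabilities of heads
c_alice, c_bob = 1/4, 3/4   # credibilities, assumed to sum to 1

# Law of total probability:
# P(H) = sum over people of P(H | person is right) * P(person is right)
p_heads = p_alice * c_alice + p_bob * c_bob
print(p_heads)  # 11/24 ≈ 0.4583

# Expected payoff of the +1 (heads) / -1 (tails) game under this belief
payoff = (+1) * p_heads + (-1) * (1 - p_heads)
print(payoff)   # -1/12 ≈ -0.0833
```

With these credibilities the mixture gives a single definite expected payoff, which is the point of the discussion above.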
Analogy: Alice and Bob may be two different analysts telling you the probability that a certain stock goes up: one says $\frac13$ while the other says $\frac12$. But what do you think? What you think the probability is may depend on the respective credibilities of the analysts, e.g. one analyst has a master's while the other has only a bachelor's, but the latter analyst has more work experience.
(Said probability may also depend on several factors such as historical data, financial ratios, inside information, etc., but I guess we may assume that a lot of those are already taken into account by the analysts, except hopefully the inside-information stuff and even some personal biases, such as the latter analyst having forgotten your birthday, or your being an Android user while the stock is Apple.)
Therefore, the credibility of an analyst is (or should be) reflected in what you believe is the probability that said analyst is right and ultimately what you believe is the probability a stock will go up.
(I think this may have something to do with credence or the Bayesian interpretation of probability)
Now what if instead of Alice and Bob you have more than two people? (I'm going to assume these people all give you different probabilities and are exhaustive, i.e. no one claims differently; the latter assumption may, I guess, be replaced with: anyone who claims differently has no credibility.)
Case 1. What if we have $n$ people whose names are $N_1, ..., N_n$? Then
$$\mathbb P(H) = \sum_{i=1}^{n} \mathbb P(H|N_i \ \text{is right})\mathbb P(N_i \ \text{is right})$$
Note that there may be someone else named $N_{n+1}$, but following the assumptions, $N_{n+1}$ either:

- has the same claim as someone in $\{N_1, ..., N_n\}$, i.e. $\{N_{n+1} \ \text{is right}\} = \{N_{i} \ \text{is right}\}$ for some $i = 1, ..., n$,
- has no credibility, i.e. $\mathbb P(N_{n+1} \ \text{is right}) = 0$, or
- makes no claims, in which case we vacuously have $\{N_{n+1} \ \text{is right}\} = \emptyset$.
Case 2. What if we have countably infinite people whose names are $\{N_i\}_{i \in \mathbb N}$? Then
$$\mathbb P(H) = \sum_{i=1}^{\infty} \mathbb P(H|N_i \ \text{is right})\mathbb P(N_i \ \text{is right})$$
Of course, we didn't have to use $\mathbb N$. We could have used any countably infinite set.
Similar reasoning to the above applies to a hypothetical person named, say, $N_{6.7}$.
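As a concrete (entirely made-up) instance of the countable case: suppose person $N_i$ claims probability $p_i = \frac{1}{i+1}$ and has credibility $2^{-i}$ for $i = 1, 2, \dots$, so the credibilities sum to $1$. The series for $\mathbb P(H)$ can then be evaluated numerically:

```python
import math

# Hypothetical countable family: N_i claims p_i = 1/(i+1) with
# credibility 2**(-i); the credibilities sum to 1 over i = 1, 2, ...
p_heads = sum((1 / (i + 1)) * 2 ** (-i) for i in range(1, 200))

# This particular series happens to have the closed form 2*ln(2) - 1.
print(p_heads, 2 * math.log(2) - 1)  # both ≈ 0.3863
```

Truncating at $i = 199$ is harmless here since the tail of the series is smaller than $2^{-199}$.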
Case 3. What if we have uncountably infinite people whose names are $\{N_i\}_{i \in I}$ where $I$ is uncountable? Then
$$\mathbb P(H) = \int_I \mathbb P(H\mid N_i \ \text{is right})\,\mathbb P(N_i \ \text{is right})\, di \quad \text{... or something like that}$$
The point is that in this case and in the problem, we use an integral.
Now let's go to the problem: we have an uncountably infinite set of people $\{N_p\}$ who are indexed/identified by their (distinct?) claims of the probability of heads, and it turns out that their credibility is related to their claimed probability: the person $N_p$ claims a probability of $p$ and has credibility $pe^p$.
Notes:
$pe^p$ is an increasing function, so in this particular case credibility increases with the claimed probability. That isn't realistic, of course: an analyst who claims a $90\%$ probability that a stock will go up isn't thereby more credible than one who claims $20\%$.
We may not have exhaustiveness: there may be people whose names are $N_p$ where $p \notin [0,1]$, but the probability that any of them is right is zero (Donald Trump included?), because we have defined them to have credibility $0$. For example, the person named $N_{17}$ says something about the probability of heads, namely $17$, but $N_{17}$ has no credibility.
$N_0$ has no credibility: $0e^0 = 0$
$N_1$ has the most credibility: $1e^1 = e$
Penultimately,
$$\mathbb P(H) = \int_{[0,1]} \mathbb P(H|N_p \ \text{is right})\mathbb P(N_p \ \text{is right}) dp$$
$$= \int_{[0,1]} p \mathbb P(N_p \ \text{is right}) dp$$
$$= \int_{[0,1]} p\,(pe^p)\, dp$$
$$= e - 2$$
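The value $e - 2 \approx 0.718$ can be double-checked numerically, both by integrating $p \cdot pe^p$ directly and by simulating the two-stage experiment (draw a coin, then toss it). This is just a sketch; the rejection-sampling bound works because $pe^p \le e$ on $[0,1]$:

```python
import math
import random

# 1) Midpoint-rule check of P(H) = ∫_0^1 p * (p e^p) dp = e - 2
n = 100_000
riemann = sum(((k + 0.5) / n) ** 2 * math.exp((k + 0.5) / n) for k in range(n)) / n
print(riemann, math.e - 2)  # both ≈ 0.7183

# 2) Monte Carlo: draw P from the density p * e^p (bounded by e on [0, 1])
#    by rejection sampling, then toss the sampled coin once.
random.seed(0)

def sample_P():
    while True:
        p = random.random()
        if random.random() * math.e <= p * math.exp(p):
            return p

trials = 200_000
heads = sum(random.random() < sample_P() for _ in range(trials))
print(heads / trials)  # ≈ 0.718
```

The simulation mirrors the problem's story exactly: the machine produces a coin with a random bias $P$, and the long-run frequency of heads over many machine-plus-toss experiments estimates $\mathbb P(H)$.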
Finally, what about in terms of the random variable $P$?
Let $P$ be a continuous random variable with the PDF given in the problem, i.e. $f_P(p) = pe^p$ for $0 \le p \le 1$ and $0$ otherwise. Then
$$\mathbb E [P] = \mathbb P (H)$$
Hopefully that's clear intuitively. As for rigour, I guess we could go back to just Alice and Bob:
By the law of total probability, we have
$$\mathbb P(H) = \mathbb P(H| \text{Alice is right})\mathbb P(\text{Alice is right}) + \mathbb P(H| \text{Bob is right})\mathbb P(\text{Bob is right})$$
By the law of total expectation (see here), we have
$$\mathbb E [P] = \mathbb E [P| \text{Alice is right}]\mathbb P(\text{Alice is right}) + \mathbb E [P| \text{Bob is right}]\mathbb P(\text{Bob is right})$$
where for a random variable $X$ and event $A$
$$\mathbb E[X \mid A] = \frac{\mathbb E[X 1_A]}{\mathbb P(A)}$$
where $1_A(\omega) = 1$ if $\omega \in A$ and $1_A(\omega) = 0$ if $\omega \in A^C$.
So convince yourself that
$$\mathbb E [P| \text{Alice is right}] = \mathbb P(H| \text{Alice is right})$$
Additional:
$$\mathbb P (P \in [0,1]) = \int_{[0,1]} pe^p dp = 1$$
$$\mathbb P (P \in \{p^*\}) = \int_{\{p^{*}\}} pe^p dp = 0$$
$$\mathbb P (P \in I) = \int_{I} pe^p dp$$
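These integrals are easy to check numerically; the interval $I = [0.25, 0.75]$ below is just an arbitrary example, not one from the problem:

```python
import math

# Midpoint-rule integration of the density f_P(p) = p * e^p
def integral(f, a, b, n=100_000):
    h = (b - a) / n
    return sum(f(a + (k + 0.5) * h) for k in range(n)) * h

f = lambda p: p * math.exp(p)

print(integral(f, 0.0, 1.0))    # total mass: ≈ 1.0
print(integral(f, 0.25, 0.75))  # P(P ∈ [0.25, 0.75]) ≈ 0.4338
```

The first line confirms that $pe^p$ really is a valid PDF on $[0,1]$, which is why no normalizing constant appears anywhere in the computation.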
For example, for an event of the form $A = \{P \in I\}$ with $I \subseteq [0,1]$:
$$\mathbb E[P \mid A] = \frac{\mathbb E[P 1_{A}]}{\mathbb P(A)}$$
$$= \frac{\mathbb E[P 1_{P \in I}]}{\mathbb P(P \in I)}$$
$$= \frac{\int_{[0,1]} p1_{p \in I} (pe^p) dp}{\int_I pe^p dp}$$
$$= \frac{\int_I p(pe^p) dp}{\int_I pe^p dp}$$
$$\mathbb E[P | \Omega] = \frac{\mathbb E[P 1_{\Omega}]}{\mathbb P(\Omega)}$$
$$= \frac{\mathbb E[P 1_{P \in [0,1]}]}{\mathbb P(P \in [0,1])}$$
$$= \frac{\int_{[0,1]} p(pe^p) dp}{\int_{[0,1]} pe^p dp}$$
$$= \int_{[0,1]} p(pe^p) dp$$
$$= e - 2$$
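As a final numeric sanity check, here is $\mathbb E[P \mid P \in I]$ for the arbitrary interval $I = [0.5, 1]$ (the answer must land inside $I$), together with the unconditional case recovering $e - 2$:

```python
import math

# Midpoint-rule integration
def integral(f, a, b, n=100_000):
    h = (b - a) / n
    return sum(f(a + (k + 0.5) * h) for k in range(n)) * h

density = lambda p: p * math.exp(p)  # f_P(p) = p * e^p on [0, 1]

def cond_mean(a, b):
    # E[P | P in [a, b]] = ∫ p f(p) dp / ∫ f(p) dp, both over [a, b]
    return integral(lambda p: p * density(p), a, b) / integral(density, a, b)

print(cond_mean(0.5, 1.0))  # ≈ 0.7974, inside [0.5, 1] as it must be
print(cond_mean(0.0, 1.0))  # unconditional case: e - 2 ≈ 0.7183
```

Conditioning on $\Omega$ (equivalently, on $\{P \in [0,1]\}$) divides by $1$, which is exactly the step taken in the last derivation above.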