Let $\varpi$ be a Dirichlet process on $[0,1]$ with concentration parameter $\varepsilon$ and base measure $\alpha$, where $\alpha$ is a Beta distribution with parameters $\alpha_0$ and $\alpha_1$. Given $\varpi$, let $\{p_n\}_{n=1}^\infty$ be i.i.d. with distribution $\varpi$. Given $\{p_n\}_{n=1}^\infty$, let $\{X_n(j):n,j\in\mathbb{N}\}$ be independent with the property that for each fixed $n$, the sequence $\{X_n(j)\}_{j=1}^\infty$ is i.i.d. with $P(X_n(j)=0)=1-P(X_n(j)=1)=p_n$.
Heuristically, we are modeling a box of bent coins. The $n$-th coin has probability $p_n$ of coming up heads, and $\{X_n(j)=0\}$ is the event that the $j$-th flip of the $n$-th coin lands heads. The overall distribution of $p$-values in the box is described by $\varpi$. This distribution, $\varpi$, is unknown to us and we are modeling it with a Dirichlet process. That is, $\varpi\sim DP(\alpha,\varepsilon)$, where $\alpha$ is the probability measure on $[0,1]$ with $\alpha(dx)\propto x^{\alpha_0-1}(1-x)^{\alpha_1-1}\,dx$.
My question is this: What is the conditional distribution of $p=(p_1,\ldots,p_N)$ given $D=\{X_n(j)=i_{n,j}:n\le N, j\le T_n\}$? In other words, if we observe $T_1$ flips of the 1st coin, $T_2$ flips of the 2nd, and so on down to $T_N$, what can we say about the underlying $p$-values, $p_1,\ldots,p_N$? According to my back-of-the-envelope calculations, if $\beta^n_0=|\{j:i_{n,j}=0\}|$ and $\beta^n_1=T_n-\beta^n_0$, then \[ P(p_n\in A_n,\forall n\le N \mid D) = \frac{E\left[{\prod_{n=1}^N \int_{A_n} x_n^{\beta^n_0}(1-x_n)^{\beta^n_1}\,\varpi(dx_n)}\right]} {E\left[{\prod_{n=1}^N \int_{[0,1]} x_n^{\beta^n_0}(1-x_n)^{\beta^n_1}\,\varpi(dx_n)}\right]}. \] I was hoping for a formula in terms of $\alpha$, and I know that $E[\int\varphi(x)\,\varpi(dx)]=\int\varphi(x)\,\alpha(dx)$, but unfortunately, I do not think the above expectations and products commute. Does anyone have any further information or references related to this model, and in particular, does anyone have a simpler expression for the conditional distribution of $p$ given $D$?
Edit: A little more heuristic reasoning leads me to conjecture something along these lines: \[ P(X_{n_0}(T_{n_0}+1)=0 \mid D) = \sum_{\substack{J\subset\{1,\ldots,N\},\\ n_0\in J}} P(p_n = p_{n_0}\text{ iff }n\in J \mid D) \frac{\alpha_0 + \sum_{n\in J}\beta^n_0} {\alpha_0 + \alpha_1 + \sum_{n\in J} T_n}. \] In other words, the formula is using the usual posterior probability of heads given the $\text{Beta}(\alpha_0,\alpha_1)$ prior, but instead of using only the flips of the $n_0$-th coin, it uses all the flips of subsets of coins, weighted by the probability that the coins in that subset share the same $p$-value. Has anyone seen anything like this?
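For what it's worth, a brute-force way to test conjectures like this numerically is to simulate the generative model directly and condition on $D$ by rejection. Here is a quick sketch of my own (truncated stick-breaking; the truncation level `K` and the trial count are arbitrary choices, and all names are just placeholders) estimating $P(X_1(T_1+1)=0\mid D)$ for small data sets:

```python
import random

def sample_dp_beta(rng, eps, a0, a1, K=20):
    # Truncated stick-breaking draw of varpi ~ DP(alpha, eps):
    # atoms are i.i.d. Beta(a0, a1), weights come from Beta(1, eps) sticks.
    atoms = [rng.betavariate(a0, a1) for _ in range(K)]
    ws, stick = [], 1.0
    for _ in range(K):
        v = rng.betavariate(1.0, eps)
        ws.append(stick * v)
        stick *= 1.0 - v
    ws[-1] += stick  # dump the leftover stick mass on the last atom
    return atoms, ws

def predictive_mc(data, eps, a0, a1, trials=20000, seed=0):
    # data[n] = observed 0/1 flips of coin n (0 = heads, P(X=0) = p_n).
    # Estimates P(X_1(T_1+1) = 0 | D) by rejection sampling -- only
    # viable when the total number of observed flips is small.
    rng = random.Random(seed)
    N = len(data)
    hits = heads = 0
    for _ in range(trials):
        atoms, ws = sample_dp_beta(rng, eps, a0, a1)
        ps = rng.choices(atoms, weights=ws, k=N)  # p_n i.i.d. from varpi
        ok = True
        for n in range(N):
            for x in data[n]:
                flip = 0 if rng.random() < ps[n] else 1
                if flip != x:
                    ok = False
                    break
            if not ok:
                break
        if ok:  # run matches D: record one more flip of coin 1
            hits += 1
            if rng.random() < ps[0]:
                heads += 1
    return heads / hits if hits else float("nan")
```

With a single coin the DP marginal of $p_1$ is just the base measure, so the estimate should reproduce the plain Beta posterior predictive, which gives an easy sanity check.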
The short answer is that you are essentially on the right track, and that this model has been studied extensively in the statistics and machine learning communities I'm familiar with (and, I would guess, even more so in the probability literature). See here and here for examples of it being worked out in the Gaussian-Gaussian setting (rather than Beta-Bernoulli). It is also related to the theory of distributions on partitions, which is of interest to biologists; the Dirichlet process in particular is related to the Ewens distribution.
What you are looking at is a Dirichlet process mixture; this particular one (Beta-Bernoulli) is featured in this paper by Liu. The posterior is intractable in the sense that, as you noted, it is a mixture over all possible partitions of the data. Using the relationship between the Dirichlet process and the Chinese restaurant process, the probability of a given partition $\mathcal P$ is given by (e.g. in the papers by Liu and Lo) $$ f(\mathcal P) = \frac{\Gamma(\epsilon)\epsilon^{|\mathcal P|}}{\Gamma(\epsilon + N)} \prod_{b \in \mathcal P} \Gamma(|b|), $$ while, given the partition $\mathcal P$, the data has joint probability $$ f(X \mid \mathcal P) = \prod_{b \in \mathcal P} \int_0^1 \mbox{Beta}(p \mid \alpha_0, \alpha_1) \cdot p^{\sum_{n \in b, j \le T_n} (1 - X_n(j))} \cdot (1 - p)^{\sum_{n \in b, j \le T_n} X_n(j)} \, dp $$ (exponents written to match your convention that $p_n$ is the probability of the outcome $0$), which can be calculated in closed form: in your notation, each factor is the ratio of Beta functions $B\big(\alpha_0 + \sum_{n \in b} \beta^n_0,\ \alpha_1 + \sum_{n \in b} \beta^n_1\big) / B(\alpha_0, \alpha_1)$.
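Both pieces are cheap to evaluate on a log scale. Here is a minimal sketch (my own code, not from the cited papers; function names are just placeholders) computing $\log f(\mathcal P)$ and $\log f(X \mid \mathcal P)$ from the per-coin counts $(\beta^n_0, \beta^n_1)$:

```python
from math import lgamma, log

def log_beta(a, b):
    # log of the Beta function B(a, b)
    return lgamma(a) + lgamma(b) - lgamma(a + b)

def log_crp_prior(partition, eps, N):
    # log f(P) = log Gamma(eps) + |P| log eps - log Gamma(eps + N)
    #            + sum over blocks b of log Gamma(|b|)
    lp = lgamma(eps) + len(partition) * log(eps) - lgamma(eps + N)
    for block in partition:
        lp += lgamma(len(block))
    return lp

def log_marginal_given_partition(partition, counts, a0, a1):
    # counts[n] = (beta_0^n, beta_1^n): heads/tails counts for coin n.
    # Each block contributes B(a0 + sum beta_0, a1 + sum beta_1) / B(a0, a1).
    ll = 0.0
    for block in partition:
        h = sum(counts[n][0] for n in block)
        t = sum(counts[n][1] for n in block)
        ll += log_beta(a0 + h, a1 + t) - log_beta(a0, a1)
    return ll
```

As a check, the CRP probabilities of all partitions of a fixed $N$ should sum to one.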
The marginal posterior of $p_1, \ldots, p_N$ is the mixture of all of the posteriors you would get if you knew the underlying partition, weighted by (1) the prior probability of that partition and (2) the coherence of the partition (i.e. how similar the $X_n$'s that are grouped together are, as measured by $f(X \mid \mathcal P)$): $$ f(p \mid X) = \sum_{\mathcal P} \frac{f(\mathcal P)\,f(X \mid \mathcal P)}{\sum_{\mathcal Q} f(\mathcal Q)\,f(X \mid \mathcal Q)}\, f(p \mid X, \mathcal P), $$ where the final term $f(p \mid X, \mathcal P)$ is, by conjugacy, a product of updated Beta distributions. If you are interested in a particular $p_n$, you can try summing out the remaining $p_j$'s to get the expression you conjectured, but people typically stop here. It isn't obvious to me that the weights in your mixture will be any nicer than the weights in $f(p \mid X)$.
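For small $N$ (the number of partitions is the Bell number $B_N$, so this blows up quickly), the mixture can be evaluated exactly by brute force. A self-contained sketch, again with made-up names, computing the exact posterior mean of each $p_n$ as the weighted mixture of Beta posterior means:

```python
from math import lgamma, log, exp

def log_beta(a, b):
    return lgamma(a) + lgamma(b) - lgamma(a + b)

def set_partitions(items):
    # enumerate all set partitions of `items` (Bell(N) of them)
    if not items:
        yield []
        return
    head, rest = items[0], items[1:]
    for part in set_partitions(rest):
        for i in range(len(part)):
            yield part[:i] + [[head] + part[i]] + part[i + 1:]
        yield [[head]] + part

def posterior_means(counts, eps=1.0, a0=1.0, a1=1.0):
    # counts[n] = (beta_0^n, beta_1^n): heads/tails counts for coin n.
    # Returns E[p_n | D]: a partition-weighted average of Beta means.
    N = len(counts)
    logws, parts = [], []
    for part in set_partitions(list(range(N))):
        lw = lgamma(eps) + len(part) * log(eps) - lgamma(eps + N)  # CRP prior
        for b in part:
            lw += lgamma(len(b))
            h = sum(counts[n][0] for n in b)
            t = sum(counts[n][1] for n in b)
            lw += log_beta(a0 + h, a1 + t) - log_beta(a0, a1)      # f(X | P)
        logws.append(lw)
        parts.append(part)
    m = max(logws)
    ws = [exp(l - m) for l in logws]
    Z = sum(ws)
    means = [0.0] * N
    for w, part in zip(ws, parts):
        for b in part:
            h = sum(counts[n][0] for n in b)
            t = sum(counts[n][1] for n in b)
            for n in b:
                means[n] += (w / Z) * (a0 + h) / (a0 + a1 + h + t)
    return means
```

Coins with identical counts should get identical posterior means, and a coin with no flips should revert to the prior mean $\alpha_0/(\alpha_0+\alpha_1)$.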
Granted, this isn't a statistics forum, but if you are actually interested in conducting inference in this setting: since the posterior is intractable, people typically use MCMC or variational methods in practice. The paper by Liu uses SIS, which is not typically used for this problem nowadays, but there is a vast literature on how to do inference here. This is a good review paper on MCMC methods.
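As a concrete illustration of the MCMC route (a sketch under the same conventions as above, not production code): a collapsed Gibbs sampler in the style of Neal's Algorithm 3 for this Beta-Bernoulli mixture, which integrates out both $\varpi$ and the $p_n$'s and samples only the partition:

```python
import random
from math import lgamma, exp, log

def log_beta(a, b):
    return lgamma(a) + lgamma(b) - lgamma(a + b)

def gibbs_dp_coins(counts, eps=1.0, a0=1.0, a1=1.0, iters=300, seed=0):
    # counts[n] = (beta_0^n, beta_1^n); returns the per-sweep average of
    # E[p_n | partition] as a crude posterior-mean estimate.
    rng = random.Random(seed)
    N = len(counts)
    z = list(range(N))                               # start: all singletons
    stats = {n: [1, counts[n][0], counts[n][1]] for n in range(N)}
    pmean = [0.0] * N
    for _ in range(iters):
        for n in range(N):
            h, t = counts[n]
            c = z[n]                                 # remove coin n
            stats[c][0] -= 1; stats[c][1] -= h; stats[c][2] -= t
            if stats[c][0] == 0:
                del stats[c]
            cand, logw = [], []
            for cid, (cnt, ho, to) in stats.items():  # join existing block
                cand.append(cid)
                logw.append(log(cnt) + log_beta(a0 + ho + h, a1 + to + t)
                            - log_beta(a0 + ho, a1 + to))
            cand.append(max(stats, default=-1) + 1)   # open a new block
            logw.append(log(eps) + log_beta(a0 + h, a1 + t)
                        - log_beta(a0, a1))
            mx = max(logw)
            w = [exp(l - mx) for l in logw]
            u = rng.random() * sum(w)
            acc = 0.0
            for cid, wi in zip(cand, w):              # sample new block
                acc += wi
                if u <= acc:
                    choice = cid
                    break
            z[n] = choice
            if choice in stats:
                stats[choice][0] += 1
                stats[choice][1] += h
                stats[choice][2] += t
            else:
                stats[choice] = [1, h, t]
        for n in range(N):                            # accumulate Beta means
            _, ho, to = stats[z[n]]
            pmean[n] += (a0 + ho) / (a0 + a1 + ho + to)
    return [s / iters for s in pmean]
```

Each per-coin conditional uses exactly the block marginals above, so the sampler scales with the number of occupied blocks rather than the Bell number.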