Suppose we model a distribution over $27$ alphabet symbols with the random variable $\mathbf{X} $, a vector which takes a multinomial distribution, parameterized by $\mathbf{\theta}=(\theta_1, ..., \theta_{27})$ where $\sum_{i=1}^{27} \theta_i = 1$. Let $\mathbf{X}\textbf{[i]} = (X_1 \textbf{[i]}, X_2\textbf{[i]}, ... X_{27}\textbf{[i]})$ denote the $i$th sample observed.
We use a Dirichlet prior over $\theta$, with parameter $\mathbf{\alpha}= (\alpha_1, ..., \alpha_{27})$, where each $\alpha_i = 10$
$$ \xi (\theta) = \frac{\Gamma(\sum_{i=1}^{27} \alpha_i)}{\Pi_{i=1}^{27}\Gamma(\alpha_i)} \Pi_{i=1}^{27}\theta_i^{\alpha_i-1} $$
After observing $2000$ samples, of which $100$ are "a" and $87$ are "P", I have that the posterior is a Dirichlet distribution over $\mathbf{\theta}$ with parameters $\small \mathbf{b} = \mathbf{\alpha} + \mathbf{X} = (b_1 = \alpha_1 + 100, b_2 = \alpha_2 + X_2, ..., b_{15} = \alpha_{15} + 87, ... b_{27} = \alpha_{27} + X_{27})$.
That is, $ P(\mathbf{\theta}|\mathbf{b}\textbf{[1]},\mathbf{b}\textbf{[2]}, ... \mathbf{b}\textbf{[2000]}) $
$$=\int_{\theta} \frac{\Gamma(\sum_{i=1}^{27} \alpha_i + 2000 )}{\Gamma(b_1)\Gamma(b_{15})\Gamma(\sum_{i=1}^{27} \alpha_i + 2000- (b_1 + b_{15}))} \theta_{1}^{b_1-1}\theta_{15}^{b_{15}-1}...\theta_{27}^{b_{27-1}} \mathrm d\theta $$
$$=\int_{\theta} \frac{\Gamma(2270 )}{\Gamma(110)\Gamma(97)\Gamma(2063)} \theta_{1}^{110-1}\theta_{15}^{97-1}...\theta_{27}^{b_{27}-1} \mathrm d\theta $$
My question is how to find the probability that the next two samples ($i=2001, 2002$) are the symbols "a" and "p" , given $\mathbf{X}\textbf{[1]}...\mathbf{X}\textbf{[2000]}$.
i.e., how to find $P(\mathbf{X_1}\textbf{[2002]} = a, \mathbf{X_{15}}\textbf{ [2001]} = p \mid \mathbf{X}\textbf{[1]}...\mathbf{X}\textbf{[2000]})$.
Assuming $\mathbf{X}\textbf{[i]}$ is conditionally independent of $ \mathbf{X}\textbf{[i+1]}$ given $\theta$, is the following headed toward the solution?
$$P(\mathbf{X_1}\textbf{[2002]} = a, \mathbf{X_{15}}\textbf{ [2001]} = p \mid \mathbf{X}\textbf{[1]}...\mathbf{X}\textbf{[2000]}) $$ $$= \int_{\theta} P(\mathbf{X_1}\textbf{[2002]} = a, \mathbf{X_{15}}\textbf{ [2001]} = p \mid \theta)P(\mathbf{\theta}|\mathbf{X}\textbf{[1]},\mathbf{X}\textbf{[2]}, ... \mathbf{X}\textbf{[2000]}) \mathrm d\theta $$ $$= \int_{\theta} P(\mathbf{X_1}\textbf{[2002]} = a \mid \theta)P( \mathbf{X_{15}}\textbf{ [2001]} = p \mid \theta)P(\mathbf{\theta}|\mathbf{X}\textbf{[1]},\mathbf{X}\textbf{[2]}, ... \mathbf{X}\textbf{[2000]}) \mathrm d\theta $$
$$= \int_{\theta} \theta_{1} \theta_{15} P(\mathbf{\theta}|\mathbf{X}\textbf{[1]},\mathbf{X}\textbf{[2]}, ... \mathbf{X}\textbf{[2000]}) \mathrm d\theta $$
$$=\int_{\theta} \theta_{1} \theta_{15} \frac{\Gamma(2270 )}{\Gamma(110)\Gamma(97)\Gamma(2063)} \theta_{1}^{110}\theta_{15}^{97}...\theta_{27}^{b_{27}} \mathrm d\theta $$
Here is how to solve it. $\mathbf{X}_{2001}$ may be conditionally independent of $\mathbf{X}_{2002}$ given $\theta$, but $\mathbf{X}_{2001},\mathbf{X}_{2002}$ are not conditionally independent given $\mathbf{X}$.