I would like to compute the probability of some subset $\omega \subset \Omega$ of events, i.e. $P(\omega) = \sum_{x \in \omega} P(x)$, where I know $P(x)$ for all $x \in \Omega$. Both $\omega$ and $\Omega$ are exponentially large sets.
For what it's worth, I already have a Markov Chain Monte Carlo simulation set up for events within $\omega$, but computing $P(\omega)$ then boils down to computing the normalization constant of that chain's stationary distribution, which is far from straightforward.
I am thinking that being able to evaluate every probability $P(x)$ should simplify this problem significantly. It seems like a problem that must be quite ubiquitous, but I could not find any references. The most naive solution would be to uniformly sample points $x_i$ in $\Omega$ and then sum $\sum_{i:x_i \in \omega} P(x_i)$. But this sum is clearly always $< P(\omega)$ and only converges to the exact value once all of $\Omega$ has been sampled. I am having a hard time mapping this problem onto the usual MC setting because we are not strictly computing an expectation value here (i.e. we are not taking a mean with respect to some distribution). Would appreciate any insights. Thank you!
EDIT: For clarification: in my problem I have bitstrings of a certain length $n$ (they constitute $\Omega$), and with each string we associate a probability $p$ that can be computed at low cost. So I can uniformly sample the space of all bitstrings $\Omega$ and compute the associated probabilities. I have also constructed a Markov chain for events in $\omega$ alone. How do I compute $P(\omega)$?
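To make the setup concrete, here is a minimal Python sketch. The product distribution over independent bits (with an illustrative bit probability `p1 = 0.3`) is purely a hypothetical stand-in for the actual cheap-to-evaluate $P(x)$; any normalized distribution over $\{0,1\}^n$ would play the same role:

```python
import itertools
import random

random.seed(0)
n = 10  # bitstring length; kept tiny here so the toy P can be checked exactly

# Hypothetical stand-in for the problem's cheap-to-evaluate P(x):
# each bit is 1 with probability p1, independently.
p1 = 0.3

def P(x):
    prob = 1.0
    for b in x:
        prob *= p1 if b else (1.0 - p1)
    return prob

def sample_Omega():
    # Uniform sample from Omega = {0,1}^n
    return tuple(random.randint(0, 1) for _ in range(n))

x = sample_Omega()
px = P(x)

# Sanity check (feasible only at toy sizes): P sums to 1 over all of Omega.
total = sum(P(y) for y in itertools.product((0, 1), repeat=n))
```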
For future reference, here are two approaches based on the comments, both of which seem to work. Note that (2) seems much more efficient.
1. Build a Markov chain for all events in $\Omega$ and compute the expectation value $\langle I_\omega \rangle$ of the indicator function $I_\omega(X) = 1$ if $X \in \omega$, and $I_\omega(X) = 0$ otherwise. The disadvantage is that this takes a long time to converge, because we need to sample all of $\Omega$, and we make no use of the fact that we already know the normalized probabilities.
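A hedged sketch of this MCMC-over-$\Omega$ approach: a Metropolis chain with single-bit-flip proposals, averaging the indicator along the chain. The independent-bit product distribution is a toy stand-in for the real $P(x)$, and $\omega$ = "strings with at least 4 ones" is an arbitrary example set:

```python
import random

random.seed(0)
n = 10
p1 = 0.3

def P(x):
    # Toy stand-in for the real P(x): independent bits, each 1 w.p. p1.
    prob = 1.0
    for b in x:
        prob *= p1 if b else (1.0 - p1)
    return prob

def in_omega(x):
    # Toy definition of omega: bitstrings with at least 4 ones.
    return sum(x) >= 4

# Metropolis chain over all of Omega with single-bit-flip proposals.
x = tuple(random.randint(0, 1) for _ in range(n))
steps = 100_000
hits = 0
for _ in range(steps):
    i = random.randrange(n)
    y = x[:i] + (1 - x[i],) + x[i + 1:]
    # Symmetric proposal, so the acceptance ratio is just P(y)/P(x).
    if random.random() < min(1.0, P(y) / P(x)):
        x = y
    hits += in_omega(x)

estimate = hits / steps  # estimate of <I_omega> = P(omega)
```

For this toy target the exact answer is the binomial tail $P(\mathrm{Bin}(10, 0.3) \ge 4) \approx 0.3504$, which the chain should approach, though slowly compared to approach (2).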
2. I had actually looked into this earlier, but must have implemented it wrong on the first try -- we can use sampling over $\omega$ alone: $$ P(\omega) = \sum_{x \in \omega} P(x) = \sum_{x\in \omega} \frac{P(x)}{Q(x)} Q(x) \equiv \left\langle \frac{P(x)}{Q(x)} \right\rangle_Q $$ where $Q(x)$ is some other probability distribution on $\omega$. Because it is cheap for me to uniformly sample $\omega$, I can pick $Q(x) \equiv 1/|\omega|$. Ideally I would use a more optimal $Q(x)$ (ideally $Q(x) \approx P(x)$), but I think the only way to sample from such a $Q(x)$ directly is to use a Markov chain algorithm, which however does not give me access to $Q(x)$.
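With the uniform choice $Q(x) \equiv 1/|\omega|$, the estimator reduces to $P(\omega) \approx \frac{|\omega|}{N} \sum_{i=1}^{N} P(x_i)$ with the $x_i$ drawn uniformly from $\omega$. A sketch, again with a toy product distribution standing in for $P(x)$ and an $\omega$ (strings whose first two bits are both 1) chosen purely so that uniform sampling and the exact answer are trivial:

```python
import random

random.seed(0)
n = 10
p1 = 0.3

def P(x):
    # Toy stand-in for the real P(x): independent bits, each 1 w.p. p1.
    prob = 1.0
    for b in x:
        prob *= p1 if b else (1.0 - p1)
    return prob

# Toy omega: strings whose first two bits are both 1, so |omega| = 2**(n-2)
# and uniform sampling is trivial.  Exact P(omega) = p1**2 = 0.09 here.
omega_size = 2 ** (n - 2)

def sample_omega():
    return (1, 1) + tuple(random.randint(0, 1) for _ in range(n - 2))

# Importance sampling with Q(x) = 1/|omega|:
#   P(omega) = <P(x)/Q(x)>_Q = |omega| * mean of P over uniform draws from omega.
N = 50_000
total = sum(P(sample_omega()) for _ in range(N))
estimate = omega_size * total / N
```

Note that the variance of this estimator is driven by how much $P(x)\,|\omega|$ fluctuates over $\omega$, which is why a $Q(x) \approx P(x)$ would be preferable when one can both sample it and evaluate it.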