Estimating Conditional Probability With I.I.D Samples


I have a sequence of 0's and 1's that are produced in order. I would like to estimate the probability of observing a 1 (event $A$) given that there were at least two 1's in the previous 4 observations (event $B$). That is, I want $P(A|B)$. (Note that I cannot compute this exactly and must estimate it, since the underlying data-generating process is unknown.)

Currently, I estimate this probability by counting the windows of 4 consecutive bits that contain at least two 1's ($m$), and the number of those windows that are immediately followed by a 1 ($n$). The estimate is then $\hat{P}(A|B) = \frac{n}{m}$.
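To make the counting procedure concrete, here is a minimal sketch in Python. The function name `estimate_conditional` and its parameters `window` and `min_ones` are hypothetical names chosen for illustration. It assumes the data is a finite list of 0/1 integers and that only windows with all 4 bits and the following bit observed are counted.

```python
from typing import Optional, Sequence, Tuple


def estimate_conditional(
    bits: Sequence[int], window: int = 4, min_ones: int = 2
) -> Tuple[int, int, Optional[float]]:
    """Return (n, m, n/m) for the sliding-window count described above.

    m: number of fully observed windows of `window` consecutive bits
       that contain at least `min_ones` ones and have an observed successor.
    n: number of those windows whose successor bit is 1.
    The ratio is None when no window satisfies the condition.
    """
    m = 0
    n = 0
    # i indexes the bit that follows the window bits[i - window : i]
    for i in range(window, len(bits)):
        if sum(bits[i - window:i]) >= min_ones:
            m += 1
            n += bits[i]
    return n, m, (n / m if m else None)
```

Consecutive windows share 3 of their 4 bits, so the indicator values this loop accumulates are generally not independent of one another, which is exactly the overlap I am asking about.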

Is there anything wrong with my methodology? Could there be an issue with the I.I.D.-ness of the samples, given that consecutive windows overlap in all but one bit during the counting process?

Let me further elaborate with an example. Assume we are given the following snapshot from an infinite sequence:

$\ldots,1,1,1,1,0,0,0,1,1,1,0,1,\ldots$

Then, out of the 11 candidate windows (..,1,1), (.,1,1,1), (1,1,1,1), (1,1,1,0), (1,1,0,0), (1,0,0,0), (0,0,0,1), (0,0,1,1), (0,1,1,1), (1,1,1,0), (1,1,0,1), 9 satisfy the condition (all except (1,0,0,0) and (0,0,0,1)), and 4 of those are followed by a 1.

So $m = 9$ and $n = 4$, and hence the estimated probability is $\frac{4}{9}$.
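As a cross-check on the hand count above, here is a small Python sketch that runs the same rule on only the 12 visible bits. The names `snapshot` and `count_windows` are hypothetical. It assumes the bits outside the snapshot are unknown, so it skips the two partially visible windows on the left and the final window (1,1,0,1), whose successor is not shown.

```python
from typing import Sequence, Tuple

# The 12 visible bits of the snapshot, in order.
snapshot = [1, 1, 1, 1, 0, 0, 0, 1, 1, 1, 0, 1]


def count_windows(bits: Sequence[int]) -> Tuple[int, int]:
    """Return (n, m) using only fully observed 4-bit windows with a known successor."""
    m = 0  # windows with at least two 1's
    n = 0  # of those, windows followed by a 1
    for i in range(4, len(bits)):
        if sum(bits[i - 4:i]) >= 2:
            m += 1
            n += bits[i]
    return n, m


n, m = count_windows(snapshot)
print(n, m, n / m)  # n = 2, m = 6 on the visible bits alone
```

Under that restriction the count is $n = 2$, $m = 6$. The difference from $m = 9$, $n = 4$ comes entirely from the three windows at the edges of the snapshot: the two partial windows on the left (both followed by a 1) and the last window, which has no visible successor.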

Is this way of estimating the probability correct? Am I violating the I.I.D.-ness of the samples?