Probability of two events occurring in same time period across time-series data

I haven't studied probability mathematics for a long time. However, some time-series data came my way and I'm trying to ascertain the likelihood of two events happening during the same time intervals. For example:

  • 100 time series units (days)
  • Event A happens on 70 of the days
  • Event B happens on 5 of the days
  • If an event happens on a given day, it won't happen again that same day. So exactly 70 of the 100 days have an event A and exactly 5 have an event B; in other words, neither event can double up on a day.
  • Assume each event's days are chosen uniformly at random among the 100 days, and that the two events are independent

Q: What is the probability that all event B occurrences are on days where event A also happens?

I think this is where I want to take the complement and look at 1 - P(at least one B falls on a day where A does not happen), i.e., on one of the other 30 days.

I apologize if this is trivial. The examples I've found are all of the "drawing with/without replacement" kind, and I don't know whether that applies here. I did find this here but I'm wary of it being overly complicated compared to my case. Again, for the purposes of explanation, I'm assuming the events are independent, though in reality there's likely a positive correlation between them.

There are 2 best solutions below

1
On BEST ANSWER

There are $\binom{100}5$ sets of 5 days that could be the event B days. Of these, $\binom{70}5$ are subsets of the event A days. If all $\binom{100}5$ possible sets are equally likely, then the desired probability is $\binom{70}5/{\binom{100}5}$.

This is equal to $\frac{70}{100}\frac{69}{99}\frac{68}{98}\frac{67}{97}\frac{66}{96}$, which is approximated by Ethan's answer $\left(\frac{70}{100}\right)^5$.
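As a quick numerical check, here is a minimal Python sketch that evaluates the binomial ratio, the equivalent product of fractions, and the $\left(\frac{70}{100}\right)^5$ approximation. The variable names are just illustrative choices, and `math.comb` and `math.prod` need Python 3.8 or later.

```python
from math import comb, prod

N_DAYS = 100  # total days in the series
A_DAYS = 70   # days on which event A happens
B_DAYS = 5    # days on which event B happens

# Exact: choose all 5 B days from the 70 A days, out of all ways to choose 5 of 100 days.
p_exact = comb(A_DAYS, B_DAYS) / comb(N_DAYS, B_DAYS)

# Same value written as the product 70/100 * 69/99 * ... * 66/96.
p_product = prod((A_DAYS - i) / (N_DAYS - i) for i in range(B_DAYS))

# Approximation that ignores the "no two Bs on one day" constraint.
p_approx = (A_DAYS / N_DAYS) ** B_DAYS

print(f"exact   = {p_exact:.5f}")
print(f"product = {p_product:.5f}")
print(f"approx  = {p_approx:.5f}")
```

The first two numbers should agree up to floating-point rounding, and the third should come out slightly larger.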

1
On

If everything is as independent as you claim then the probability that any particular $B$ occurs when there's an $A$ is $70/100$. Then the probability that they all occur on $A$ days is $0.7^5 = 0.16807$.

That assumes that multiple $A$s can happen on the same day, with a similar assumption for the $B$s - "sampling with replacement". The calculation with the opposite assumption is a little trickier. The answer is probably not much different.

Edit in response to comment.

OK so the $A$s come on $70$ of the $100$ days - it doesn't matter which. Then the probability that the first $B$ is on an $A$ day is $70/100$. Now that day can't have another $B$, so the probability that the second $B$ is on an $A$ day is $69/99$. And so on. The probability that all the $B$s are on $A$ days is $$ \frac{70}{100} \frac{69}{99} \frac{68}{98} \frac{67}{97} \frac{66}{96} \approx 0.16076 . $$
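If you want to convince yourself empirically, a rough Monte Carlo sketch along these lines should land close to the value above. It assumes the $B$ days are a uniformly random set of 5 distinct days and, since it doesn't matter which days are $A$ days, it simply fixes them as the first 70. The function name, trial count and seed are arbitrary choices for illustration.

```python
import random

def simulate(n_days=100, a_days=70, b_days=5, trials=100_000, seed=0):
    """Estimate P(all B days are A days) by random sampling."""
    rng = random.Random(seed)
    # By symmetry it does not matter which days are A days, so use days 0..69.
    a_set = set(range(a_days))
    hits = 0
    for _ in range(trials):
        # Draw 5 distinct days for B (sampling without replacement).
        b = rng.sample(range(n_days), b_days)
        if all(day in a_set for day in b):
            hits += 1
    return hits / trials

estimate = simulate()
print(f"simulated probability = {estimate:.4f}")
```

With this many trials the estimate should typically sit within a few thousandths of the exact figure; more trials tighten it further.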