How to untangle multiple, mixed Bernoulii distributions

Question

How to untangle multiple, mixed Bernoulii distributions

89 Views Asked by Bumbble Comm At 29 Mar 2026 - 4:09

In my universe there are a infinite number of opaque boxes that are either:

empty
has a red marble
has a blue marble
has both a red and a blue marble

p(r) gives probability that there is a red marble in the box p(b) gives probability that there is a blue marble in the box

These probabilities are independent. In addition, I have two keys, K(r) (red key), and K(b), (blue key). A key of a given color may open the box ONLY if marble of that color is in the box. What is tricky is that keys are not reliable at opening the boxes, on a given attempt

p(Kr) gives the probability that red key will open the box if red marble present
p(Kb) gives the probability that blue key will open the box if blue marble present
when a box is opened w/ either red or blue key we CAN NOT tell if the other color marble is present (meaning the other key also would have eventually worked); ability to learn about presence or absence of other color marble is lost once the box is opened.

Again, these key probabilities are independent. My universe is constructed from these 4 probabilities p(r), p(b), p(Kr), p(Kb) but I don't know these values. These are the values that I am trying to determine by repeatedly trying my 2 keys on (infinite) supply of boxes that I do have. My strategy:

Pick a box
repeatedly try red/blue keys randomly till either the box opens and I record which color key opened it.
give up after 20 (or some other number) of attempts and pick a new box
if successful or not I would have full record of exact sequence of keys that I have attempted

Based on a large number of observations I would like to know if I can correctly determine my 4 probability values.

I wrote a simulation and attempted a few things. When I have a success I count the number of failed key attempts of the successful key before its success. This gives an estimate of P(Kx) (where x is r or b), but estimates seem biased and tend to be too high. This is I believe because I discard information regarding failed attempts w/ the other key before a given key does succeed. But I don't know how to use the information regarding failures.

Any hints welcome, I have tried to read up on "bernoulli mixture models", but have not found a solution.

Below I share some results from a simulation that I have written and played around with.

p(r)=p(b)=0.7
p(Kr)=p(Kb)=0.5

Attempted 10,000 boxes with max of 20 attempts on each box.

overall success rate 0.9102 (expected 0.91)
red key successful 4446 times, blue key 4656 times
Overall red key failures, 17,998. Blue key failures 17,575
Key attempts in failed cases (giving up after 20 events)
- red key attempted 9091 times, blue attempted 8869 times
data on red key successes
- 4446 successes (noted above)
- 3294 red key failures before success was seen
- 5225 blue key failures before red key has success
data on blue key successes
- 4656 blue key successes (noted above)
- 3481 blue key failures before success was seen
- 5613 red key failures before blue key has success

From these I can ATTEMPT to determine p(Kr) and p(Kb). I get:

p(Kr)=4446./(4446 + 3294) = 0.5744186

similarly

p(Kb)=4656./(4656+3481) = 0.5722010

both of which are higher than 0.5 that I had expected. And I am not close to guessing p(r) and p(b) yet. All hints welcome.

Original Q&A

There are 1 best solutions below

**Bumbble Comm** · Answer 1 · 2016-11-07 22:12:58

Based on a large number of observations I would like to know if I can correctly determine my 4 probability values.

You can, with the following algorithm.

Draw a box, $i$
Try the blue key $Q$ times or until it opens the box.
If it opens the box, store the number of trials it took, $X_{i,b}$, and that the box contained a blue marble, $B_{i, b} = 1$
If it does not open the box after $Q$ attempts, no blue marble is in it with high probability (larger Q makes it more likely, but smaller key success probability makes it less likely) . Either a red marble is in it, or none.
Try the red key $Q$ times or until it opens the box.
If it opens the box, store the number of trials it took, $X_r$, and that the box contained a red marble $B_{i, r} = 1 $.
If it doesn't it's empty (with high probability).

Repeat this process $S$ times. Compute the estimators for the 4 probabilities as follows:

$\widehat{\Pr (b)} = S^{-1} \sum_i^S B_{i,b}$

$\widehat{\Pr (r)} = S^{-1} \sum_i^S B_{i,r}$

For the key success probabilities, use the maximum likelihood estimator for the parameter of a geometric distribution:

$\widehat{\Pr (Kb)} = \frac{n_b}{ \sum_i X_{i,b} }$

$\widehat{\Pr (Kr)} = \frac{n_r}{ \sum_i X_{i,r} }$

Where $n_b$ and $n_r$ are the number of data points you have, ie the number of times you repeatedly attempted to open the box with each key until finally succeeding. The $X_{i, color}$ are the number of trials until success it took for each of those.

If the key success probabilities are likely small, $Q$ needs to be large to avoid upwards biasing their estimators and downwards biasing the marble colour estimators. In your example these are large so $Q=20$ as you have is enough. But it obviously won't be when the success probability is eg $1/1000$

How to untangle multiple, mixed Bernoulii distributions

There are 1 best solutions below

Related Questions in PROBABILITY

Related Questions in PROBABILITY-DISTRIBUTIONS

Related Questions in PARAMETER-ESTIMATION

Trending Questions

Popular # Hahtags

Popular Questions