White/black toys are placed into boxes; at least 75% of the toys are more likely white (the rest more likely black). Is the probability that a majority of boxes hold more white than black toys at least 1/2?


I am an economic theorist working on deterministic models almost exclusively. I have an undergrad degree in mathematics and as a hobby (due to time constraints, not as much as I would like to) I try to read/watch material on pure mathematics.

In a model I was working on a few years back, I stumbled upon a probability question that I could not solve. Sadly I don't have any Probability Theorists in my close network so as a last resort I wanted to try my chance here and ask for reading material which might help me progress towards at least a partial solution to the following problem.

Two more things to note: I think the problem in the general form stated is quite hard and would require serious effort to solve. I would also appreciate suggestions on which program(s) can be used to run simulations. (They need to be as simple as possible, since the only programming experience I have comes from a course I took on Matlab.)

Apologies if there is a category within Probability Theory this question belongs to that I am not aware of.

The question is the following:

  • We receive an odd number of boxes (let $2c+1$ denote the number of boxes).
  • Each box contains the same number of toys (let $2d+1$ denote the number of toys in each box). Let $D_{i}$ denote the set of toys in box $i\in \{1,\ldots ,2c+1\}$. For any two boxes $i,j\in \{1,\ldots ,2c+1\}$, $\left\vert D_{i}\right\vert =\left\vert D_{j}\right\vert =2d+1$. So $2d+1$ is the number of toys in any given box and the total number of toys in the boxes is $\left( 2c+1\right) \left( 2d+1\right) $.

  • Out of the $\left( 2c+1\right) \left( 2d+1\right) $ toys, some are made in country $A$, and the rest are made in country $B$. Let $N^{A}$ denote the set of toys made in country $A$, and similarly let $N^{B}$ denote the set of toys made in country $B$. $\left\vert N^{A}\right\vert +\left\vert N^{B}\right\vert =\left( 2c+1\right) \left( 2d+1\right) $.

  • We know the fraction of toys made in country $A$, but we do not know how the toys are distributed among the boxes (so the distribution into boxes is arbitrary but fixed). Specifically, we are given the information that: $\frac{\left\vert N^{A}\right\vert }{\left\vert N^{A}\right\vert +\left\vert N^{B}\right\vert }\geq \frac{3}{4}$

  • Each toy is either black or white. Given any probability $p\in (\frac{1}{2},1]$, we have the following information: each toy made in country $A$ is white with probability $p$ and black with probability $1-p$. Symmetrically, each toy made in country $B$ is white with probability $1-p$ and black with probability $p$. So toys made in country $A$ are more likely to be white than black, and the reverse holds for toys made in country $B$.

  • The question I am interested in (I think this is a hard problem and all I am looking for is references to read): Prove that for any $p\in(\frac{1}{2},1]$, no matter how the toys are distributed into boxes, the probability that a majority of the boxes are white boxes (a white box is a box which contains more white toys than black) is at least $\frac{1}{2}$. (Thanks to @Joroki for the comment asking for clarification.)

To clarify the question further:

Given a box $i\in \{1,\ldots ,2c+1\}$, $\left\vert D_{i}^{A}\right\vert $ denotes the number of toys in box $i$ that are made in country $A$, and similarly $\left\vert D_{i}^{B}\right\vert $ denotes the number of toys in box $i$ that are made in country $B$.

Let $\Phi \left( \left\vert D_{i}^{A}\right\vert ,\left\vert D_{i}^{B}\right\vert ,p\right) $ denote the probability that a majority of toys in box $i$ are white, given the number of toys in box $i$ that are made in country $A$, $\left\vert D_{i}^{A}\right\vert $, the number of toys in box $i$ that are made in country $B$, $\left\vert D_{i}^{B}\right\vert $, and the probability that a toy made in country $A$ is white, $p\in (\frac{1}{2},1]$.

We have:

$\Phi \left( \left\vert D_{i}^{A}\right\vert ,\left\vert D_{i}^{B}\right\vert ,p\right) =\sum\limits_{m=d+1}^{2d+1}\left( \sum\limits_{\substack{ k+l=m \\ 0\leq k\leq \left\vert D_{i}^{A}\right\vert \\ 0\leq l\leq \left\vert D_{i}^{B}\right\vert }}\binom{\left\vert D_{i}^{A}\right\vert }{k}\cdot \binom{\left\vert D_{i}^{B}\right\vert }{l}\cdot p^{k}\cdot \left( 1-p\right) ^{\left\vert D_{i}^{A}\right\vert -k}\cdot \left( 1-p\right) ^{l}\cdot p^{\left\vert D_{i}^{B}\right\vert -l}\right) $
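For readers who want to check values numerically, here is a minimal Python sketch of the formula for $\Phi$. The function name `phi` and the argument names `a` and `b` (standing for $\left\vert D_{i}^{A}\right\vert$ and $\left\vert D_{i}^{B}\right\vert$) are illustrative choices, and the sketch assumes $a+b=2d+1$ is odd.

```python
from math import comb

def phi(a, b, p):
    """Probability that a box with a country-A toys and b country-B toys
    holds more white toys than black ones (a + b is assumed to be odd)."""
    need = (a + b) // 2 + 1        # d + 1 white toys make a majority
    total = 0.0
    for k in range(a + 1):         # k = number of white toys among the A toys
        pa = comb(a, k) * p**k * (1 - p)**(a - k)
        for l in range(b + 1):     # l = number of white toys among the B toys
            if k + l >= need:
                total += pa * comb(b, l) * (1 - p)**l * p**(b - l)
    return total

if __name__ == "__main__":
    print(phi(2, 1, 0.6))  # box with two A toys and one B toy
```

One sanity check: swapping the roles of the two countries swaps the roles of the two colours, so `phi(a, b, p) + phi(b, a, p)` should equal 1 whenever `a + b` is odd.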

Let $\Psi \left( \left\vert D_{1}^{A}\right\vert ,\left\vert D_{1}^{B}\right\vert ,\left\vert D_{2}^{A}\right\vert ,\left\vert D_{2}^{B}\right\vert ,\ldots ,\left\vert D_{2c+1}^{A}\right\vert ,\left\vert D_{2c+1}^{B}\right\vert ,p\right) $ denote the probability that a majority of the boxes each contain a majority of white toys, given the distribution of toys made in each country into boxes and the probability that a toy made in country $A$ is white, $p\in (\frac{1}{2},1]$.

I would like suggestions on how to attempt the following problem:

[$\frac{\left\vert N^{A}\right\vert }{\left\vert N^{A}\right\vert +\left\vert N^{B}\right\vert }\geq \frac{3}{4}$ and $p\in (\frac{1}{2},1]$ ] $\Longrightarrow $ For any $ \left\vert D_{1}^{A}\right\vert ,\left\vert D_{1}^{B}\right\vert ,\left\vert D_{2}^{A}\right\vert ,\left\vert D_{2}^{B}\right\vert ,\ldots ,\left\vert D_{2c+1}^{A}\right\vert ,\left\vert D_{2c+1}^{B}\right\vert$:

$\Psi \left( \left\vert D_{1}^{A}\right\vert ,\left\vert D_{1}^{B}\right\vert ,\left\vert D_{2}^{A}\right\vert ,\left\vert D_{2}^{B}\right\vert ,\ldots ,\left\vert D_{2c+1}^{A}\right\vert ,\left\vert D_{2c+1}^{B}\right\vert ,p\right) \geq \frac{1}{2}$
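Since simulations were requested, here is a rough Python sketch that computes $\Psi$ both exactly and by Monte Carlo. It assumes that toy colours are independent of one another, so that the boxes are independent and the number of white boxes is a sum of independent Bernoulli variables with success probabilities $\Phi(\cdot)$. The names `phi`, `psi`, `psi_mc`, `a_counts`, `trials` and `seed` are all illustrative; `a_counts[i]` stands for $\left\vert D_{i+1}^{A}\right\vert$.

```python
import random
from math import comb

def phi(a, b, p):
    """P(box with a A-toys and b B-toys has a white majority), a + b odd."""
    need = (a + b) // 2 + 1
    return sum(comb(a, k) * p**k * (1 - p)**(a - k)
               * comb(b, l) * (1 - p)**l * p**(b - l)
               for k in range(a + 1) for l in range(b + 1) if k + l >= need)

def psi(a_counts, d, p):
    """Exact P(majority of boxes are white boxes); each box has 2d+1 toys."""
    n = 2 * d + 1
    dist = [1.0]                       # dist[j] = P(j white boxes so far)
    for a in a_counts:
        q = phi(a, n - a, p)
        new = [0.0] * (len(dist) + 1)
        for j, w in enumerate(dist):
            new[j] += w * (1 - q)      # this box is not a white box
            new[j + 1] += w * q        # this box is a white box
        dist = new
    need = len(a_counts) // 2 + 1      # c + 1 white boxes make a majority
    return sum(dist[need:])

def psi_mc(a_counts, d, p, trials=20000, seed=0):
    """Monte Carlo estimate of the same probability."""
    rng = random.Random(seed)
    n = 2 * d + 1
    need = len(a_counts) // 2 + 1
    hits = 0
    for _ in range(trials):
        white_boxes = 0
        for a in a_counts:
            whites = sum(rng.random() < p for _ in range(a))            # A toys
            whites += sum(rng.random() < 1 - p for _ in range(n - a))   # B toys
            white_boxes += whites > d
        hits += white_boxes >= need
    return hits / trials

if __name__ == "__main__":
    boxes = [3, 2, 2]                  # c = 1, d = 1, seven of nine toys from A
    print(psi(boxes, 1, 0.7), psi_mc(boxes, 1, 0.7))
```

The exact routine is cheap enough for moderate $c$ and $d$; the Monte Carlo version is only there as an independent cross-check of the formula.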

I don't know if $\Phi$ or $\Psi$'s distribution has a special name. If there is a different way to address the same problem, I would really appreciate that as well.

SOME SIMPLE OBSERVATIONS:

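  • By the Pigeonhole Principle, a majority of boxes will have more country $A$ toys than country $B$ ones: otherwise at least $c+1$ boxes would each hold at most $d$ country $A$ toys, leaving at most $c(2d+1)+(c+1)d<\frac{3}{4}(2c+1)(2d+1)$ country $A$ toys in total. Thus the $p=1$ case is trivial.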
  • I conjecture this to be true because I have shown the following (note that by symmetry it suffices to consider only $\frac{1}{2}\leq \frac{|N^{A}|}{|N^{A}|+|N^{B}|}$): if \begin{equation*} \frac{1}{2}\leq \frac{|N^{A}|}{|N^{A}|+|N^{B}|}<\frac{3c+1}{2\left( 2c+1\right) } \end{equation*}

then there exist $p\in \left( \frac{1}{2},1\right] $ and a distribution of toys into boxes such that $\Psi \left( |D_{1}^{A}|,|D_{1}^{B}|,|D_{2}^{A}|,|D_{2}^{B}|,\cdots ,|D_{2c+1}^{A}|,|D_{2c+1}^{B}|\text{,}p\right) <\frac{1}{2}$. To see why, take $p=1$ and consider the distribution where you fill the last box with $A$ toys only, then the second-to-last box, and so on, until $k$ boxes are filled this way. Choose the minimum $k$ such that, among the $\left(2c+1-k\right)\left(2d+1\right)$ toys left for the remaining $2c+1-k$ boxes, the majority are $B$ toys. Then distribute these toys among the $2c+1-k$ empty boxes as uniformly as possible, and check that each of these boxes will have a majority of $B$ toys (for our purposes do not worry about divisibility; assume $d$ is large enough). Since the bound on the ratio gives $k\leq c$, these boxes form a majority, and $\Psi=0$.

  • I tried fixing a $p$ and looking for the distribution(s) that would minimize $\Psi$, but could not find articles on that.

  • I tried learning more about Chernoff bounds and getting some partial results from there, but I did not get anything noteworthy.

  • I would be happy with a partial result, for example that the above conclusion holds whenever $p>\frac{3}{5}$.
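For small $c$ and $d$, the search for a minimizing distribution at a fixed $p$ mentioned in the observations above can be carried out by brute force. The following Python sketch enumerates every multiset of per-box country $A$ counts satisfying $\frac{|N^{A}|}{|N^{A}|+|N^{B}|}\geq \frac{3}{4}$ and reports the one with the smallest $\Psi$. It assumes independent toy colours, and the names `phi`, `psi` and `worst_distribution` are hypothetical helpers, not part of any library. It is only an exploratory tool and proves nothing about general $c$ and $d$.

```python
from itertools import combinations_with_replacement
from math import comb

def phi(a, b, p):
    """P(box with a A-toys and b B-toys has a white majority), a + b odd."""
    need = (a + b) // 2 + 1
    return sum(comb(a, k) * p**k * (1 - p)**(a - k)
               * comb(b, l) * (1 - p)**l * p**(b - l)
               for k in range(a + 1) for l in range(b + 1) if k + l >= need)

def psi(a_counts, n, p):
    """Exact P(majority of boxes are white boxes); each box has n toys."""
    dist = [1.0]                        # dist[j] = P(j white boxes so far)
    for a in a_counts:
        q = phi(a, n - a, p)
        new = [0.0] * (len(dist) + 1)
        for j, w in enumerate(dist):
            new[j] += w * (1 - q)
            new[j + 1] += w * q
        dist = new
    return sum(dist[len(a_counts) // 2 + 1:])

def worst_distribution(c, d, p):
    """Return (min Psi, A-counts per box) over all admissible distributions."""
    n, boxes = 2 * d + 1, 2 * c + 1
    total = n * boxes
    best = None
    # Box order does not matter, so sorted tuples (multisets) suffice.
    for a_counts in combinations_with_replacement(range(n + 1), boxes):
        if 4 * sum(a_counts) < 3 * total:   # fewer than 3/4 of toys from A
            continue
        value = psi(a_counts, n, p)
        if best is None or value < best[0]:
            best = (value, a_counts)
    return best

if __name__ == "__main__":
    for p in (0.55, 0.6, 0.75, 0.9):
        print(p, worst_distribution(2, 2, p))
```

The number of multisets grows quickly with $c$ and $d$, so this is only practical for a handful of boxes and toys, but it may suggest what the extremal distributions look like.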

P.S. Please do not hesitate to ask any clarifying questions. Thank you for reading.