Question on Chernoff bound type probability argument.

77 Views Asked by At

The following result was given in the research article (claim 6) and no justification was provided for the proof. I have presented the claim in simple terms below. Basically, I want to understand what theorems were used and how to prove the following two (To prove) parts.


We have two devices $\mathscr{A}$ and $\mathscr{B}$ far apart. $\mathscr{A}$ takes input $x_i \in \{0,1,2\}$ and $\mathscr{B}$ takes input $y_i \in \{0,1\}$. They both give random outputs $a_i, b_i \in \{0,1\}$ respectively for $i$th input.

Uniformly $m$ random input pairs are selected $$I=\{ (x_i,y_i) \mid x_i \in \{0,1,2\}, y_i \in \{0,1\} \text{ for } 1\leq i \leq m\}$$ and outputs are collected in the set $\mathbb{O}$.

Let $C$ denotes the set of inputs such that $(x_i,y_i) = (2,1)$ i.e., $C=\{ (x_i,y_i) \mid x_i = 2, y_i = 1 \}$.
At random for $0 < \gamma <1$, a total of $\gamma m$ $\in \mathbb{N}$ inputs are selected from $I$. Lets denote this set as $B$.

To Prove 1: The randomly chosen set $B$ contains a fraction of at least $\gamma /2$ fractions from the elements of the set $C$, except with probability at most $e^{- \gamma |C| /8}$.

Let $C_B$ denotes the elements of $C$ in the set $B$. If for $C_B$ the outputs satisfies $a_i \neq b_i$ for at most $\eta$ fractions of times. Then

To Prove 2: With probability at least $1-e^{- \gamma |C| /200}$ the total fraction of rounds in $C$ such that $a_i \neq b_i$ is at most $1.1 \eta$.

1

There are 1 best solutions below

0
On BEST ANSWER

See this answer for the inequalities we'll need: if $X$ is either binomial or hypergeometric with mean $\mu$, then:

  • for any $\delta \in [0,1]$, we have $\Pr[X \ge (1+\delta)\mu] \le e^{-\frac13 \delta^2\mu}$;
  • for any $\delta>0$, we have $\Pr[X \le (1-\delta)\mu] \le e^{-\frac12 \delta^2\mu}$.

The first bound

This matters, because in the first question, $|B \cap C|$ is a hypergeometric random variable: we are taking $\gamma m$ samples from $I$ without replacement, and the probability of success on an individual trial is $p = \frac{|C|}{m}$. (If we were sampling with replacement, the result would be binomial, and the Chernoff-type results for binomials are much more well-known.) The inequality above tells us $$ \Pr[|B\cap C| \le (1-\delta)\mu] \le e^{-\frac12 \delta^2 \mu}. $$ We are taking $\delta = \frac12$ here, and $\mu = \gamma |C|$, so the upper bound is $e^{-\gamma|C|/8}$.

The second bound

I don't believe $e^{-\gamma |C|/200}$ is possible with the given assumptions, but here's what we can do instead.

Let $D = \{i : a_i \ne b_i\}$. The case we want to avoid is that $|C \cap D| \ge 1.1\eta |C|$ (there are many rounds in $C$ such that $a_i \ne b_i$) but $|B \cap C \cap D| \le \eta |B\cap C|$ (the sample $B$ doesn't notice).

Without further assumptions, we can't say how large $C \cap D$ is, and it's possible that $\Pr[i \in D] > 1.1\eta$, in which case $|C\cap D| \ge 1.1\eta|C|$ is likely. However, we can say that if $|C\cap D| \ge 1.1\eta |C|$, then $|B \cap C \cap D|$ is very likely to also be large.

Let $\gamma'$ be the constant such that $|B\cap C| = \gamma'|C|$; given the event we've already bounded, $\gamma' \ge \frac12\gamma$. Then $B\cap C$ is equally likely to be any subset of $C$ of size $\gamma'|C|$: we are taking $\gamma'|C|$ samples from $C$ without replacement. If a "success" is picking an element of $C \cap D$, then the probability of success is $\ge 1.1\eta$, so the expected number of successes is at least $1.1\eta \cdot \gamma'|C|$. By the Chernoff bound we have, $$ \Pr[|B \cap C \cap D| \le \eta \gamma'|C|] \le e^{-\frac12 \cdot (\frac1{11})^2 \cdot 1.1\eta \cdot \gamma'|C|} $$ which is at most $e^{-\eta \gamma |C|/440}$.

This is worse than the given bound $e^{-\gamma|C|/200}$ in two ways. First, there is a factor of $2$ coming from the bound $\gamma' \ge \frac12\gamma$. But that comes from our first argument, which has a much stronger error term $e^{-\gamma|C|/8}$. The right factor to pick could be more like $\gamma' \ge 0.9\gamma$, which will make the first error term worse, but the second error term better.

Second, there is a dependency on $\eta$ in addition to $\gamma$, which I don't think can be eliminated. I've already said this in the comments, but let me repeat the example: suppose $\Pr[a_i \ne b_i]$ and $\eta$ are both very small, say $\Pr[a_i \ne b_i] = \eta = \frac1{|C|}$. Then as $|C| \to \infty$, $|C\cap D|$ approaches a $\text{Poisson}(1)$ distribution; for example, $|C\cap D| = 2$ with probability about $\frac 2e$. However, $B$ is very likely to miss both elements of $C \cap D|$: the probability of this is roughly $(1-\gamma)^2$. For large $|C|$, $\frac2e \cdot (1-\gamma)^2$ is much larger than $e^{-\gamma|C|/200}$ (but not larger than $e^{-\eta \gamma|C|/200}$).