The following result was given in the research article (claim 6) and no justification was provided for the proof. I have presented the claim in simple terms below. Basically, I want to understand what theorems were used and how to prove the following two (To prove) parts.
We have two devices $\mathscr{A}$ and $\mathscr{B}$ far apart. $\mathscr{A}$ takes input $x_i \in \{0,1,2\}$ and $\mathscr{B}$ takes input $y_i \in \{0,1\}$. They both give random outputs $a_i, b_i \in \{0,1\}$ respectively for $i$th input.
Uniformly $m$ random input pairs are selected $$I=\{ (x_i,y_i) \mid x_i \in \{0,1,2\}, y_i \in \{0,1\} \text{ for } 1\leq i \leq m\}$$ and outputs are collected in the set $\mathbb{O}$.
Let $C$ denotes the set of inputs such that $(x_i,y_i) = (2,1)$ i.e., $C=\{ (x_i,y_i) \mid x_i = 2, y_i = 1 \}$.
At random for $0 < \gamma <1$, a total of $\gamma m$ $\in \mathbb{N}$ inputs are selected from $I$. Lets denote this set as $B$.
To Prove 1: The randomly chosen set $B$ contains a fraction of at least $\gamma /2$ fractions from the elements of the set $C$, except with probability at most $e^{- \gamma |C| /8}$.
Let $C_B$ denotes the elements of $C$ in the set $B$. If for $C_B$ the outputs satisfies $a_i \neq b_i$ for at most $\eta$ fractions of times. Then
To Prove 2: With probability at least $1-e^{- \gamma |C| /200}$ the total fraction of rounds in $C$ such that $a_i \neq b_i$ is at most $1.1 \eta$.
See this answer for the inequalities we'll need: if $X$ is either binomial or hypergeometric with mean $\mu$, then:
The first bound
This matters, because in the first question, $|B \cap C|$ is a hypergeometric random variable: we are taking $\gamma m$ samples from $I$ without replacement, and the probability of success on an individual trial is $p = \frac{|C|}{m}$. (If we were sampling with replacement, the result would be binomial, and the Chernoff-type results for binomials are much more well-known.) The inequality above tells us $$ \Pr[|B\cap C| \le (1-\delta)\mu] \le e^{-\frac12 \delta^2 \mu}. $$ We are taking $\delta = \frac12$ here, and $\mu = \gamma |C|$, so the upper bound is $e^{-\gamma|C|/8}$.
The second bound
I don't believe $e^{-\gamma |C|/200}$ is possible with the given assumptions, but here's what we can do instead.
Let $D = \{i : a_i \ne b_i\}$. The case we want to avoid is that $|C \cap D| \ge 1.1\eta |C|$ (there are many rounds in $C$ such that $a_i \ne b_i$) but $|B \cap C \cap D| \le \eta |B\cap C|$ (the sample $B$ doesn't notice).
Without further assumptions, we can't say how large $C \cap D$ is, and it's possible that $\Pr[i \in D] > 1.1\eta$, in which case $|C\cap D| \ge 1.1\eta|C|$ is likely. However, we can say that if $|C\cap D| \ge 1.1\eta |C|$, then $|B \cap C \cap D|$ is very likely to also be large.
Let $\gamma'$ be the constant such that $|B\cap C| = \gamma'|C|$; given the event we've already bounded, $\gamma' \ge \frac12\gamma$. Then $B\cap C$ is equally likely to be any subset of $C$ of size $\gamma'|C|$: we are taking $\gamma'|C|$ samples from $C$ without replacement. If a "success" is picking an element of $C \cap D$, then the probability of success is $\ge 1.1\eta$, so the expected number of successes is at least $1.1\eta \cdot \gamma'|C|$. By the Chernoff bound we have, $$ \Pr[|B \cap C \cap D| \le \eta \gamma'|C|] \le e^{-\frac12 \cdot (\frac1{11})^2 \cdot 1.1\eta \cdot \gamma'|C|} $$ which is at most $e^{-\eta \gamma |C|/440}$.
This is worse than the given bound $e^{-\gamma|C|/200}$ in two ways. First, there is a factor of $2$ coming from the bound $\gamma' \ge \frac12\gamma$. But that comes from our first argument, which has a much stronger error term $e^{-\gamma|C|/8}$. The right factor to pick could be more like $\gamma' \ge 0.9\gamma$, which will make the first error term worse, but the second error term better.
Second, there is a dependency on $\eta$ in addition to $\gamma$, which I don't think can be eliminated. I've already said this in the comments, but let me repeat the example: suppose $\Pr[a_i \ne b_i]$ and $\eta$ are both very small, say $\Pr[a_i \ne b_i] = \eta = \frac1{|C|}$. Then as $|C| \to \infty$, $|C\cap D|$ approaches a $\text{Poisson}(1)$ distribution; for example, $|C\cap D| = 2$ with probability about $\frac 2e$. However, $B$ is very likely to miss both elements of $C \cap D|$: the probability of this is roughly $(1-\gamma)^2$. For large $|C|$, $\frac2e \cdot (1-\gamma)^2$ is much larger than $e^{-\gamma|C|/200}$ (but not larger than $e^{-\eta \gamma|C|/200}$).