The check associativity problem is:
Input. A binary operator $*$ on a set $X$ of size $n$.
Output. "Yes" if $*$ is associative (i.e., $\forall{i, j, k \in X}, (i*j)*k = i*(j*k)$); "No" otherwise.
The randomized algorithm given below is due to [RS96].
- Uniformly at random, pick 3 sets $R, S, T$ from the powerset $2^X$.
- if $(R * S) * T \neq R * (S * T)$: output "No"
- else output "Yes" (with one-sided error).
If the algorithm outputs "No", it is always correct. We want to bound the probability that the algorithm makes a mistake if $X$ is non-associative.
If $X$ is non-associative then there must exist at least one triple $i', j', k'$ such that $(i'*j')*k' \neq i'*(j'*k')$.
We want to bound the probability of the event that $X$ is non-associative but we output "Yes". Formally, we want to prove the following claim:
Claim. If $*$ is not associative, then at least 1/8 of the possible triples $R$, $S$, $T$ are witnesses, i.e., $$\Pr[(R * S) * T = R * (S * T)] \leq \frac{7}{8}.$$
I am writing out each claim of the analysis so it is easier to identify my mistake:
We make a mistake if we happen to pick triples $R$, $S$, and $T$ such that $i' \notin R$, $j' \notin S$, OR $k' \notin T$.
If we pick a set u.a.r. from $2^X$ then the probability that any fixed element is in that set is 1/2. (Why? This is because each element is in exactly half the sets in $2^X$.)
$\Pr[(R * S) * T = R * (S * T)] \leq \Pr[i' \notin R \; \lor \; j' \notin S \; \lor \; k' \notin T]$. (Inequality is because there could be other non-associative triples.)
Then: $$ \begin{align} \Pr[i' \notin R \; \lor \; j' \notin S \; \lor \; k' \notin T] &= 1 - \Pr[i' \in R, j' \in S, k' \in T] \\ &= 1 - \Pr[i' \in R] \cdot \Pr[j' \in S] \cdot \Pr[k' \in T] \\ &= 1 - \frac{1}{2} \cdot \frac{1}{2} \cdot \frac{1}{2} \\ &= \frac{7}{8} \end{align} $$ where the second step is because $R$, $S$, and $T$ are selected independently.
The more standard analysis is given here. Can anyone please help me understand where I went wrong?
[RS96] S. Rajagopalan and L. Schulman, "Verifying identities." Proceedings of 37th IEEE Symposium on Foundations of Computer Science (FOCS), 1996, pp. 612–616.