Simpson's Paradox: Inequality Equivalence


I have a question regarding a paper dealing with Simpson's paradox. The article can be viewed here:

Copositive Matrices and Simpson's Paradox

In the article it is hinted that inequality (1) is equivalent to inequality (4).

I fail to understand why these two inequalities are the same.

To my understanding inequality (4) states that:

$\sum_{i=1}^{n} P(AB\cap C_{i}) \cdot \sum_{i=1}^{n} P(\bar{A}\bar{B}\cap C_{i})\leq \sum_{i=1}^{n} P(A\bar{B}\cap C_{i}) \cdot \sum_{i=1}^{n} P(\bar{A}B\cap C_{i})$

While inequality (1) states that:

$P(A|B)\leq P(\bar{A}|B)$

Inequality (1) compares conditional probabilities, whereas inequality (4) compares products of probabilities of intersections.

My question: are inequalities (1) and (4) identical or equivalent? If so, why? If not, why not?

Thank you!

Best answer

I strongly suspect there are transcription errors in the paper's inequalities $(1)$ and $(2)$, which were probably intended to be
\begin{align}
(1')\hspace{3em}P(\,B\,|\,A\,)&\le P\big(\,B\,\big|\,\bar{A}\,\big)\\
(2')\hspace{2em}P\big(\,B\,\big|\,A C_i\,\big)&>P\big(\,B\,\big|\,\bar{A} C_i\,\big) \ \text{ for }\ i=1,2,\dots,n\ .
\end{align}
As given, the paper's inequalities $(1)$ and $(2)$ are impossible to satisfy simultaneously, and aren't proper conditions for any version of Simpson's paradox. Inequalities $(1')$ and $(2')$, on the other hand, are a statement of one version of Simpson's paradox, and inequality $(1')$ is equivalent to the paper's inequality $(4)$ (at least whenever $\ 0<P(A)<1\ $, which is a necessary condition for the conditional probabilities in the inequalities to be meaningful).

Here's a proof of the equivalence:
\begin{align}
&\frac{P(BA)}{P(A)}=P(\,B\,\big|\,A\,)\le P\big(\,B\,\big|\,\bar{A}\,\big)=\frac{P\big(\,B\,\bar{A}\,\big)}{P\big(\bar{A}\,\big)}\\
\iff&P(BA)P\big(\bar{A}\,\big)\le P\big(\,B\,\bar{A}\,\big)P(A)\\
\iff&P(BA)\big(P\big(B\bar{A}\,\big)+P\big(\bar{B}\bar{A}\,\big)\big)\le P\big(\,B\,\bar{A}\,\big)\big(P(BA)+P\big(\bar{B}A\,\big)\big)\\
\iff&P(BA)P\big(\bar{B}\bar{A}\,\big)\le P\big(\,B\,\bar{A}\,\big)P\big(\bar{B}A\,\big)\ ,
\end{align}
where the last step cancels the common term $\ P(BA)P\big(B\bar{A}\,\big)\ $ from both sides.

The final inequality here is just a simplified version of your expression for the inequality $(4)$, since $\ \sum\limits_{i=1}^nP\big(X\cap C_i\big)=P(X)\ $ for any event $\ X\ $.
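As a sanity check on the chain of equivalences above, here is a minimal Python sketch that compares the truth values of $(1')$ and the simplified form of $(4)$ on randomly generated joint distributions. The function names and the way the random distributions are drawn are my own illustrative choices, not anything from the paper. Every cell gets positive probability, so $0<P(A)<1$ holds automatically.

```python
from fractions import Fraction
import random


def random_joint(rng):
    """Random exact probabilities for the cells AB, A~B, ~AB, ~A~B (all positive)."""
    weights = [rng.randint(1, 20) for _ in range(4)]
    total = sum(weights)
    return tuple(Fraction(w, total) for w in weights)


def ineq_1_prime(ab, a_nb, na_b, na_nb):
    """(1'): P(B | A) <= P(B | not A)."""
    p_a = ab + a_nb
    p_not_a = na_b + na_nb
    return ab / p_a <= na_b / p_not_a


def ineq_4(ab, a_nb, na_b, na_nb):
    """Simplified (4): P(AB) P(~A~B) <= P(~AB) P(A~B)."""
    return ab * na_nb <= na_b * a_nb


def check_equivalence(trials=10_000, seed=0):
    """True if (1') and (4) agree on every sampled distribution."""
    rng = random.Random(seed)
    for _ in range(trials):
        cells = random_joint(rng)
        if ineq_1_prime(*cells) != ineq_4(*cells):
            return False
    return True


print(check_equivalence())
```

Exact `Fraction` arithmetic is used so that boundary cases (where both sides are equal) are not disturbed by floating-point rounding. A random search like this can only fail to find a counterexample, of course; the algebraic proof is what establishes the equivalence.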

On the other hand, if you multiply the inequality $(1)$, as given in the paper, by $\ P(B)\ $ you get $$ (1'')\hspace{3em}P(AB)\le P\big(\bar{A}B\big)\ . $$ For the conditional probabilities $\ P\big(A\ \big|\,BC_i\,\big)\ $ and $\ P\big(\bar{A}\ \big|\,BC_i\,\big)\ $ appearing in the paper's inequalities $(2)$ to be meaningful, the inequalities $\ P\big(BC_i\big)>0\ $ must be satisfied for all $\ i\ $. Therefore
\begin{align}
&P\big(A\ \big|\,BC_i\,\big)>P\big(\bar{A}\ \big|\,BC_i\,\big) \ \text{ for }\ i=1,2,\dots,n\\
\Rightarrow&P(AB)=\sum_{i=1}^nP\big(A\ \big|\,BC_i\,\big)P\big(BC_i\big)\\
&\hspace{2.6em}>\sum_{i=1}^nP\big(\bar{A}\ \big|\,BC_i\,\big)P\big(BC_i\big)\\
&\hspace{2.6em}=P\big(\bar{A}B\big)\ ,
\end{align}
which contradicts $(1'')$, and hence $(1)$.
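The key step in the contradiction is that a strict inequality holding in every stratum survives when the strata are combined with positive weights $P(BC_i)$. The following Python sketch illustrates just that step. The helper names and the randomly generated strata are hypothetical, chosen only for illustration.

```python
from fractions import Fraction
import random


def weighted_sums(strata):
    """strata: list of (P(B C_i), P(A | B C_i)) pairs.

    Returns (sum_i P(A | B C_i) P(B C_i), sum_i P(~A | B C_i) P(B C_i)).
    """
    lhs = sum(w * p for w, p in strata)
    rhs = sum(w * (1 - p) for w, p in strata)
    return lhs, rhs


def random_strata(rng, n):
    """n strata with positive weight and P(A | B C_i) > 1/2, so (2) holds in each."""
    strata = []
    for _ in range(n):
        weight = Fraction(rng.randint(1, 10), 100)   # P(B C_i) > 0
        p_a = Fraction(rng.randint(51, 99), 100)     # P(A | B C_i) > P(~A | B C_i)
        strata.append((weight, p_a))
    return strata


rng = random.Random(1)
for n in range(1, 6):
    lhs, rhs = weighted_sums(random_strata(rng, n))
    print(n, lhs > rhs)
```

Each line prints `True`: whenever the paper's $(2)$ holds in every stratum, the first weighted sum strictly exceeds the second, so no choice of strata can produce the reversed aggregate inequality.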

I suspect that inequalities $(1')$ and $(2')$ are what were in an original correct version of the paper, but somewhere during the process of publication, the locations of the arguments $\ A,\bar{A}\ $ got swapped inadvertently with those of $\ B\ $.

For what it's worth, the inequality $(4)$ is certainly not equivalent to the inequality $(1)$ given in the paper, although that's largely irrelevant for understanding what seems to have gone wrong. It's easy to assign probabilities to $\ AB\ $, $\ \bar{A}B\ $, $\ A\bar{B}\ $ and $\ \bar{A}\bar{B}\ $ so that inequality $(1'')$ is satisfied but the supposedly equivalent inequality $$ P(BA)P\big(\bar{B}\bar{A}\,\big)\le P\big(\,B\,\bar{A}\,\big)P\big(\bar{B}A\,\big) $$ is not. The assignment $$ P(BA)=\frac{3}{32},\ P\big(B\bar{A}\big)=\frac{5}{32},\ P\big(\bar{B}A\big)=\frac{1}{8},\ P\big(\bar{B}\bar{A}\big)=\frac{5}{8}\ , $$ for example, achieves this: here $\ P(A\,|\,B)=\frac{3}{8}\le\frac{5}{8}=P\big(\bar{A}\,\big|\,B\big)\ $, while $\ P(BA)P\big(\bar{B}\bar{A}\,\big)=\frac{15}{256}>\frac{5}{256}=P\big(B\bar{A}\,\big)P\big(\bar{B}A\,\big)\ $.
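The counterexample can be checked mechanically. The short Python sketch below plugs the four cell probabilities into the paper's inequality $(1)$, as quoted in the question, and into the simplified form of $(4)$. The dictionary keys and function names are just labels I chose for this illustration.

```python
from fractions import Fraction

# Cell probabilities from the counterexample ("n" prefix means complement).
cells = {
    "AB": Fraction(3, 32),
    "nA_B": Fraction(5, 32),
    "A_nB": Fraction(1, 8),
    "nA_nB": Fraction(5, 8),
}


def paper_inequality_1(c):
    """Paper's (1): P(A | B) <= P(~A | B)."""
    p_b = c["AB"] + c["nA_B"]
    return c["AB"] / p_b <= c["nA_B"] / p_b


def product_inequality_4(c):
    """Simplified (4): P(AB) P(~A~B) <= P(~AB) P(A~B)."""
    return c["AB"] * c["nA_nB"] <= c["nA_B"] * c["A_nB"]


print(sum(cells.values()))          # 1
print(paper_inequality_1(cells))    # True
print(product_inequality_4(cells))  # False
```

The probabilities sum to 1, inequality $(1)$ holds, and the product inequality fails, so the two cannot be equivalent.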