Why does Simpsons Paradox Happen in Real Life?

99 Views Asked by Bumbble Comm At 26 Mar 2026 - 2:19

Here is a famous example of Simpsons Paradox about Kidney Stone treatments (https://en.wikipedia.org/wiki/Simpson%27s_paradox):

     StoneSize    TreatmentA    TreatmentB
 Small stones   93% (81/87) 87% (234/270)
 Large stones 73% (192/263)   69% (55/80)
         Both 78% (273/350) 83% (289/350)

As we can see here, Treatment A works better than Treatment B for both small stones and large stones, yet Treatment B has a higher overall average compared to Treatment A.

I am interested in knowing if there is a mathematical proof to understand why this paradox happens and what kinds of situations results in this paradox happening more than often and more strongly (i.e. Treatment A dominates Treatment B by larger margins on the individual level, yet Treatment B dominates A by an equally large margin).

Here is a Python simulation in which simulates random clinical trials in which the effectiveness of Treatment A and Treatment B are tested on large and small kidney stones, and the prevalence of Simpsons Paradox is tested in each set of simulations.

In each simulation, the probability of treatment working is Uniform~(0,1), thus in n people, np will have successful treatments:

import numpy as np
import matplotlib.pyplot as plt

np.random.seed(123)

sample_sizes = np.array([87, 270, 263, 80])


percentage_simpsons_paradox = []

# Run the simulation 
for i in range(100):
    
    simpsons_paradox_count = 0

  
    for j in range(100):
       
        numerators_A = np.random.binomial(sample_sizes[:2], np.random.rand(2))
        numerators_B = np.random.binomial(sample_sizes[2:], np.random.rand(2))

      
        success_rate_A = numerators_A / sample_sizes[:2]
        success_rate_B = numerators_B / sample_sizes[2:]

        
        if all(success_rate_A > success_rate_B) and (np.sum(numerators_B) / np.sum(sample_sizes[2:])) > (np.sum(numerators_A) / np.sum(sample_sizes[:2])):
            simpsons_paradox_count += 1

    
    percentage_simpsons_paradox.append(simpsons_paradox_count / 100 * 100)


plt.plot(range(1, 101), percentage_simpsons_paradox)
plt.title("Percentage of Simpson's Paradox Occurrences in Each Iteration")
plt.xlabel("Iteration Number")
plt.ylabel("Percentage of Simpson's Paradox Occurrences")
plt.show()

As we can see here, under these simulation condition/assumptions, Simpsons Paradox happens roughly 2.5% of the time.

But is there some mathematical proof which can show how frequently we can expect Simpsons Paradox to occur in a given simulation situation (i.e. simulation with certain conditions, parameters of success having certain probability distribution) ... and which types of simulation situations will produce more extreme and/or more frequent manifestations of Simpsons Paradox?

Original Q&A

Why does Simpsons Paradox Happen in Real Life?

Related Questions in PROBABILITY

Related Questions in PARADOXES

Trending Questions

Popular # Hahtags

Popular Questions