Does the Number of Matching Socks Follow a Poisson Distribution?

A large number of distinct pairs of socks are in a drawer, all mixed up. A small number of individual socks are removed. Explain in general terms why it might be plausible to assume that the number of pairs among the socks removed follows a Poisson distribution.

I don't know how to relate the Poisson distribution to this particular problem, because as I understand it a Poisson distribution counts events that occur randomly in a given interval of time.

Could someone help me with this?

Thanks.

There are 2 best solutions below

3
On BEST ANSWER

This will have to be a rough approximation. The Poisson distribution assumes each event is independent of others. To minimize the chances of 'interference' among events (matching pairs), let's suppose there are $N = 100$ pairs of socks in the drawer (200 individual socks). Also, suppose we withdraw $k = 10$ individual socks.

Among $10$ socks there are ${10 \choose 2} = 45$ possible pairings. And for any two socks drawn there is 1 chance in 199 of a match, because the first sock's mate is exactly one of the other 199 socks. So, sweeping lots of little difficulties under the rug, the average number of pairs among 10 socks should be about $\frac{45}{199} = 0.226.$ Thus if the model is to be Poisson it should have mean $\lambda \approx 0.226.$
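As a sketch of why the mean comes out this way, one can write $X$ as a sum of indicator variables, one per pairing, and use linearity of expectation. The indexing of the drawn socks by $i$ and $j$ is notation introduced here, not part of the argument above. Linearity does not require the pairings to be independent, so the dependence between matches affects the shape of the distribution but not this mean:

$$E(X) = \sum_{1 \le i < j \le k} P(\text{socks } i \text{ and } j \text{ match}) = {k \choose 2}\,\frac{1}{2N-1} = \frac{45}{199} \approx 0.226.$$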

An important potential difficulty with a Poisson model is that probabilities change as socks get 'used up' in sampling without replacement. So 'matches' are not exactly independent of each other. Also, the model probably works best for even numbers $k$ of socks taken out of the drawer.
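To see why the Poisson shape is plausible in general terms, here is a minimal sketch that pretends the 45 pairings are independent trials, each succeeding with probability 1/199. That independence is an assumption made only for illustration, since real matches are slightly dependent. Under it the count of matches would be binomial, and a binomial with many trials and a small success probability is close to a Poisson with the same mean. The variable names are hypothetical.

```r
# Illustration only: treat the pairings as independent trials (an assumption;
# in the real experiment they are only approximately independent).
N <- 100; k <- 10
n_pairings <- choose(k, 2)       # 45 possible pairings among the drawn socks
p_match    <- 1 / (2 * N - 1)    # chance that one given pairing is a match
j <- 0:3                         # number of matching pairs

tab <- cbind(pairs    = j,
             binomial = dbinom(j, n_pairings, p_match),
             poisson  = dpois(j, n_pairings * p_match))
print(round(tab, 4))
```

The two probability columns should differ only in the third or fourth decimal place, which is the "many rare, nearly independent events" heuristic behind the Poisson model.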

Notes: You did not say anything about checking whether a Poisson model is realistic. But I am an applied statistician and I can't resist the idea of checking.

(a) If you are interested in checking exactly how well a Poisson model works, you can search this site for 'socks pairs drawer' and maybe you will find exact solutions for various numbers of matches. I will leave this to you in case you want to try it.
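For readers who want to try the exact route, here is a hedged sketch based on a standard counting argument. To get exactly $j$ complete pairs, choose which $j$ of the $N$ pairs are complete, then choose which $k - 2j$ of the remaining $N - j$ pairs contribute a single sock (two choices each), and divide by the ${2N \choose k}$ equally likely draws. The function name pairs_pmf is hypothetical and introduced only for this illustration.

```r
# Exact pmf of X = number of complete pairs among k socks drawn without
# replacement from N distinct pairs (counting argument described above).
pairs_pmf <- function(j, N, k) {
  choose(N, j) * choose(N - j, k - 2 * j) * 2^(k - 2 * j) / choose(2 * N, k)
}

N <- 100; k <- 10
j <- 0:(k %/% 2)                       # at most k/2 complete pairs
exact   <- pairs_pmf(j, N, k)
ex_mean <- sum(j * exact)              # exact E(X)
ex_var  <- sum(j^2 * exact) - ex_mean^2  # exact Var(X)

# Exact probabilities next to Poisson probabilities with the same mean
print(round(cbind(pairs = j, exact = exact, poisson = dpois(j, ex_mean)), 5))
print(c(mean = ex_mean, var = ex_var))
```

Comparing the two probability columns gives a direct check of the Poisson approximation without any simulation error.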

(b) Another way of judging whether a Poisson model is reasonable is to simulate what actually happens in practice. Below I show a simulation to illustrate that a search for specific exact probabilities is not entirely futile. I used R statistical software to simulate a million sock drawing experiments with $N = 100$ pairs of socks in a drawer of which $k = 10$ individual socks are selected at random without replacement. The vector drawer in the R code contains numbers $(1,1, 2,2, 3,3, \dots, 100,100)\;$ for a hundred numbered pairs of socks. On each passage through the loop the $i$th element of vector x records the number of matching pairs in the $i$th performance of the experiment.

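N = 100;  k = 10
drawer = rep(1:N, each=2)        # socks labelled 1,1, 2,2, ..., N,N
m = 10^6;  x = numeric(m)        # m simulated experiments
for(i in 1:m) {
  out = sample(drawer, k)        # draw k socks without replacement
  x[i] = k - length(unique(out)) # each repeated label is one matching pair
}
mean(x);  var(x)
## 0.226054    # approx. E(X)
## 0.2069158   # approx. Var(X)

# Histogram with dots for Poisson probabilities
lw = min(x)-.5;  up = max(x)+.5
hist(x, prob=TRUE, breaks=lw:up, col="skyblue2")
y = 0:max(x);  pdf = dpois(y, mean(x))   # Poisson pmf with the simulated mean
points(y, pdf, pch=19)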

The fact that the simulation approximated $E(X) = 0.226$ is promising. Also, for Poisson $X$ one has $V(X)$ numerically equal to $E(X).$ And the simulation approximated $V(X) = 0.207$ (slightly less reliably). The near equality of these approximations is promising. Moreover, the black dots on the histogram below represent probabilities in the distribution $\mathsf{Pois}(\lambda = 0.226),$ and they match the histogram bars pretty well.

[Figure: histogram of the simulated number of matching pairs, with dots marking the corresponding Poisson probabilities]
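The fact that the simulated variance falls a little below the mean can also be checked by hand. This is a sketch using the same setup of $N = 100$ pairs and $k = 10$ socks; the counting of pairings is added here and is not part of the simulation. Two pairings that share a sock cannot both be matches, since a sock has only one mate, while two disjoint pairings are both matches with probability $\frac{1}{(2N-1)(2N-3)}$. There are ${k \choose 2}{k-2 \choose 2} = 45 \cdot 28 = 1260$ ordered pairs of disjoint pairings, so

$$E[X(X-1)] = {k \choose 2}{k-2 \choose 2}\,\frac{1}{(2N-1)(2N-3)} = \frac{1260}{199 \cdot 197} \approx 0.0321,$$

$$V(X) = E[X(X-1)] + E(X) - E(X)^2 \approx 0.0321 + 0.2261 - 0.0511 = 0.2071.$$

This agrees with the simulated value of about $0.207$, and the shortfall relative to $E(X) \approx 0.226$ reflects the slight negative dependence among matches.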

6
On

The Poisson distribution gives the probability that an event occurs a specified number of times in a given time interval, or over a region of space that is not too large.

It is well known that the Poisson distribution can be used to model the number of typos on a randomly chosen page of a book. It is easy to observe that a large number of pages (say, $f_0$) have no typing errors, a much smaller number of pages (say, $f_1$, compared to $f_0$) have just one typing error, and so on.

In the present example, a small number of individual socks are removed and we are interested in the number of pairs among the socks removed. I hope you can now see why the number of pairs among the socks removed (analogous to the number of typing errors) in a small number of individual socks removed (analogous to a randomly chosen page of a book) might follow a Poisson distribution.