Sample size calculator for a Poisson distribution?


This question is related to research for Alzheimer's disease. I am a Professor of Neuroscience: while I know how to make genetically modified mice, I have limited knowledge of the underlying statistics.

Transgenic mice overexpressing the A-beta protein develop on average about 8,000 "plaques" in their brains. The plaques are a surrogate marker of the intellectual deterioration occurring in Alzheimer's disease. I have a method that allows me to count all plaques present in the brain of a mouse (which is a technical feat, believe me!).

Now I am testing a few treatment options. Specifically, I wish to find out whether my treatment significantly reduces the number of plaques. I do not know the inter-individual variability of this number yet, but I will find out soon. Let's assume it's 5% S.E.M. The goal is to reduce the plaque load by at least 40%.

How can I use a calculator, or a Python or R script, to estimate how many mice I need in the treated and control groups in order to establish non-futility of the treatment at a 5%, 1%, or 0.1% significance level?

Best answer

In my note, I made it clear that I may not understand the problem, but if you are expecting a 40% reduction in plaques, it should not take many mice in each group to get a strong result.
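As a rough back-of-envelope check of that claim, here is a minimal sketch in Python (the question allows Python or R) using the usual normal-approximation formula for comparing two means, n = (z_alpha + z_power)^2 (var1 + var2) / delta^2. The function name `mice_per_group`, the `cv` parameter (between-mouse SD as a fraction of the group mean), the 80% default power, and the floor of 2 mice per group are all assumptions made for illustration; in particular, "5% S.E.M." is read here as roughly 5% between-animal variation, which may not be what the question means.

```python
import math
from statistics import NormalDist


def mice_per_group(mean_control, mean_treated, cv, alpha=0.05, power=0.80):
    """Approximate mice per group for a two-sided comparison of two means.

    Assumption: each count has Poisson noise (variance = mean) plus
    between-mouse variation with SD = cv * mean.
    """
    var_control = mean_control + (cv * mean_control) ** 2
    var_treated = mean_treated + (cv * mean_treated) ** 2
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided critical value
    z_power = NormalDist().inv_cdf(power)
    delta = mean_control - mean_treated
    n = (z_alpha + z_power) ** 2 * (var_control + var_treated) / delta ** 2
    # A two-sample test needs at least 2 animals per group (assumed floor).
    return max(2, math.ceil(n))


# 8000 plaques in controls, 40% reduction (4800) in treated mice,
# for several hypothetical levels of between-mouse variation.
for cv in (0.05, 0.10, 0.20, 0.30):
    print(cv, mice_per_group(8000, 4800, cv))
```

Under these assumptions the formula asks for fewer than one mouse per group at 5% or 10% variation, so the floor of 2 applies; it reaches 3 at 20% and 7 at 30%. The normal approximation is optimistic for groups this small, so the output is better treated as a lower bound to be confirmed by simulation.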

Below is a simulation in R with 1000 fake datasets. Each has two groups of 5 mice. The group means are 8000 and 5000 (a 37.5% reduction, close to the 40% target). I have induced considerable variation beyond the variation inherent in Poisson counts by letting each mouse's Poisson mean vary normally around its group mean, with SD 400 and 300 respectively. Then I did a Welch (separate-variances) t test on each dataset. Essentially all of the P-values were very low. At the end, I printed out the last of the 1000 simulated datasets.

Please understand this is an outrageously speculative first try. Comments needed.

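m = 1000; n = 5            # number of simulated datasets; mice per group
p.val = numeric(m)         # storage for the m P-values
for (i in 1:m) {
  # per-mouse Poisson means vary around the group mean (extra-Poisson variation)
  lam1 = 8000 + rnorm(n, 0, 400); x1 = rpois(n, lam1)   # control group
  lam2 = 5000 + rnorm(n, 0, 300); x2 = rpois(n, lam2)   # treated group
  p.val[i] = t.test(x1, x2)$p.value                     # Welch t test (R's default)
}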

mean(p.val < .05)
[1] 1
mean(p.val < .01)
[1] 1
mean(p.val)
[1] 3.089001e-05


cbind(x1, x2)
       x1   x2
[1,] 9073 5159
[2,] 7677 5406
[3,] 7895 4839
[4,] 8813 4798
[5,] 7536 4731

Notice that all of the numbers in group 2 are smaller than any of the numbers in group 1. Is that even remotely realistic? If your actual data turn out anything like this, five mice in each group would certainly suffice. There might be tests other than a 2-sample t that are theoretically better, but with such datasets as these, the details won't matter.
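One such alternative, sketched in Python with only the standard library, is an exact permutation test on the difference in group means, applied to the dataset printed above. The function name `exact_permutation_p` is hypothetical, and the choice of statistic (two-sided difference in means) is an assumption for illustration, not something this answer specifies.

```python
from itertools import combinations
from math import comb


def exact_permutation_p(x1, x2):
    """Two-sided exact permutation p-value for a difference in group means."""
    pooled = list(x1) + list(x2)
    n1, n2 = len(x1), len(x2)
    total = sum(pooled)
    observed = abs(sum(x1) / n1 - sum(x2) / n2)
    extreme = 0
    # Enumerate every way of assigning n1 of the pooled values to group 1.
    for idx in combinations(range(len(pooled)), n1):
        s1 = sum(pooled[i] for i in idx)
        diff = abs(s1 / n1 - (total - s1) / n2)
        if diff >= observed - 1e-9:  # small tolerance for float rounding
            extreme += 1
    return extreme / comb(len(pooled), n1)


# The last simulated dataset shown above.
x1 = [9073, 7677, 7895, 8813, 7536]
x2 = [5159, 5406, 4839, 4798, 4731]
p = exact_permutation_p(x1, x2)
print(p)  # 2/252, about 0.0079
```

Because the two groups do not overlap, only 2 of the C(10,5) = 252 possible splits are as extreme as the observed one, so the p-value is 2/252, about 0.008. That is also the smallest two-sided p-value this kind of test can give with 5 mice per group: it clears the 5% and 1% levels but not 0.1%, which would need at least 7 per group (2/3432, about 0.0006).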