The 'Physical Therapy Review' article "Intention to treat analysis, compliance, drop-outs and how to deal with missing data in clinical research: a review" (Susan Armijo-Olivo, Sharon Warren and David Magee, Faculty of Rehabilitation Medicine, University of Alberta, Edmonton, Canada) suggests a number of ways to account for drop-outs in trials, but remarks that none is beyond criticism.
I wondered whether the following method would be reasonable, and I'm interested in critiques of it:

Suppose an experiment has D valid data points and O drop-outs, and the D valid data points have mean M and standard deviation σ. Then:
You can, but shouldn't, assume that M and σ would be similar for the O drop-outs: there is presumably a reason for the drop-outs, and that reason might make them significantly different.
If we look at one extreme, with D = 0 and O = 1, we have M = 0 and σ = 0, but the probability R of the result being accurate, valid or reliable is 0. To reflect this, we could say that M = 0 and σ(corrected) is some very large number (a massive variance).
At the other extreme, with D = N (a very large number, say > 10000) and O = 0, we have M = m1 and σ = s1, with R → 1 (say 0.95 or so). We can fairly safely say that σ(corrected) = s1.
A real study lies somewhere in between. The mid-point of a distribution of studies would be:

D = N/2, O = N/2, M = m2, σ = s2. It seems reasonable to say that R ≈ 0.5. What can we say about σ(corrected)? It is very unlikely to be s2, but the difference is not likely to be as massive as in the first case. Would it be reasonable to set σ(corrected) = σ + (σ / 2m), the reasoning being that the unreliability is likely to be on the order of σ relative to the mean, weighted by the drop-out fraction?
If so, then for arbitrary D and O,
σ(corrected) = σ + ( O / (D+O))(σ / m)
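To make the proposal concrete, here is a minimal Python sketch of the formula above. The function name and the guards against division by zero are my additions for illustration; this is just the heuristic from the question, not an established correction method:

```python
def corrected_sigma(d, o, sigma, mean):
    """Drop-out-corrected standard deviation, per the heuristic above:
    inflate sigma by the drop-out fraction times sigma relative to the mean.
    (Function name and zero guards are illustrative additions.)"""
    if d + o == 0:
        raise ValueError("need at least one participant")
    if mean == 0:
        raise ValueError("correction is undefined for a zero mean")
    dropout_fraction = o / (d + o)
    return sigma + dropout_fraction * (sigma / mean)

# With no drop-outs the correction vanishes and sigma is returned unchanged,
# matching the D = N, O = 0 extreme:
print(corrected_sigma(10000, 0, 1.5, 5.0))      # 1.5
# With 10% drop-outs, sigma = 2, mean = 10: 2 + 0.1 * (2 / 10) ≈ 2.02
print(corrected_sigma(9000, 1000, 2.0, 10.0))
```

Note that at the O = 0 extreme the formula reduces to σ, as desired.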