Statistically quantifying a variable with a limited number of samples

Question

Asked by Bumbble Comm at 30 Mar 2026 - 2:05

Regarding a variable which at any time can have one of two values, but which we only have a limited number of samples for, I'd like to be able to make a statement along the lines of:

With 95% certainty, x is 1 80% of the time

The properties of x are that it can change state at any time. Rapid sampling is not an option and edge detection is not an option. Periodic sampling may give a misleading view if the signal itself is in any way periodic, so the correct thing to do seems to me to be random sampling. Which is what prompted me to post this question.

Given an infinite number of samples, the confidence in the estimate of the distribution between the two states becomes 100%. Given a single sample, the confidence will obviously be close to (or actually) zero. What is the function that equates number of samples to level of confidence?

I have N samples, therefore the confidence is x%

I want x% confidence, therefore I need N samples

I'm aware that similar sounding questions have been asked, but the replies very quickly ascend into terminology and notation that is beyond my faded statistics knowledge.


**Bumbble Comm** · Answer 1 · 2016-02-23 21:19:41

Let me try to answer without "ascending" too far.

Call one state 'Success' and the other 'Failure'. Then, under random sampling, the probability of 'Success' on any one of the $n$ sampling occasions is $\theta.$

If we observe $X$ successes in $n$ trials, then the estimate of $\theta$ is $\hat \theta = X/n$. For large $n$, the traditional 95% confidence interval for $\theta$ is $$\hat \theta \pm 1.96\sqrt{\hat \theta(1-\hat\theta)/n}.$$ From there, one can find the $n$ required for the 'margin of error' $E = 1.96\sqrt{\hat \theta(1-\hat\theta)/n}$ to be of a desired size. For example, if $\hat \theta \approx 1/2$ then $E \approx 1/\sqrt{n},$ so it takes a sample size of $n \approx 1100$ to get $E \approx 0.03 = 3\%.$ Pollsters often use the approximation $E \approx 1/\sqrt{n}$ for the margin of sampling error in a public opinion poll, even for $0.3 < \hat \theta < 0.7.$
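As a rough sketch of the two directions asked about ("I have $n$ samples, what is my margin of error?" and "I want this margin of error, how many samples do I need?"), the formula for $E$ can be solved for $n$, giving $n = 1.96^2\,\hat\theta(1-\hat\theta)/E^2$. The function names below are illustrative only, and the default guess $\hat\theta = 1/2$ is an assumption that gives the most conservative (largest) sample size.

```python
import math

Z_95 = 1.96  # normal quantile for a two-sided 95% interval


def margin_of_error(theta_hat: float, n: int, z: float = Z_95) -> float:
    """Margin of error E = z * sqrt(theta_hat * (1 - theta_hat) / n)."""
    return z * math.sqrt(theta_hat * (1.0 - theta_hat) / n)


def required_sample_size(margin: float, theta_guess: float = 0.5,
                         z: float = Z_95) -> int:
    """Smallest n whose margin of error is at most `margin`.

    theta_guess = 0.5 is an assumed worst case; a better prior guess
    for theta would give a smaller n.
    """
    return math.ceil(z ** 2 * theta_guess * (1.0 - theta_guess) / margin ** 2)


if __name__ == "__main__":
    print(required_sample_size(0.03))            # exact formula with theta = 1/2
    print(round(margin_of_error(0.5, 1100), 4))  # margin of error at n = 1100
```

With $E = 0.03$ the exact formula gives a somewhat smaller $n$ than the rule of thumb $E \approx 1/\sqrt{n}$, because the rule of thumb rounds 1.96 up to 2.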

For example, if $n = 300$ and $X = 134,$ then $\hat \theta = 134/300 = 0.447 = 44.7\%$. You cannot say $\theta = 44.7\%$ exactly, but a 95% CI for $\theta$ is $\hat \theta \pm E,$ which computes to the interval $(39.0\%, 50.3\%).$ A larger sample size $n$ would tend to give a shorter interval.
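The arithmetic in this example can be checked with a few lines of Python. This is only a minimal sketch of the traditional large-sample interval; the function name `wald_interval` is chosen here for illustration and is not from the answer.

```python
import math


def wald_interval(successes: int, n: int, z: float = 1.96) -> tuple[float, float]:
    """Traditional large-sample CI: theta_hat +/- z * sqrt(theta_hat(1 - theta_hat)/n)."""
    theta_hat = successes / n
    half_width = z * math.sqrt(theta_hat * (1.0 - theta_hat) / n)
    return theta_hat - half_width, theta_hat + half_width


if __name__ == "__main__":
    low, high = wald_interval(134, 300)
    print(f"{low:.1%} to {high:.1%}")  # prints "39.0% to 50.3%"
```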

For smaller sample sizes (say, fewer than a few hundred), more accurate 95% CIs can be obtained by using the adjusted values $\hat \theta^+ = (X + 2)/(n + 4)$ and $n^+ = n + 4$ in place of $\hat \theta$ and $n$ in the formula displayed above.

For example, if $n = 50$ and $X = 23$ then $\hat \theta^+ = 25/54 = 46.3\%$ and the (adjusted) CI is the relatively wide interval $(33.0\%, 59.6\%)$ on account of the smaller sample size.
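The adjusted interval (often called the "plus four" or Agresti-Coull style adjustment) can be sketched the same way. The function name is illustrative, and clipping the endpoints to $[0, 1]$ is an extra assumption made here, not something the answer specifies.

```python
import math


def plus_four_interval(successes: int, n: int, z: float = 1.96) -> tuple[float, float]:
    """Adjusted 95% CI: add 2 successes and 2 failures, then apply the usual formula."""
    n_adj = n + 4
    theta_adj = (successes + 2) / n_adj
    half_width = z * math.sqrt(theta_adj * (1.0 - theta_adj) / n_adj)
    # Clipping to [0, 1] is an assumption added for this sketch.
    return max(0.0, theta_adj - half_width), min(1.0, theta_adj + half_width)


if __name__ == "__main__":
    low, high = plus_four_interval(23, 50)
    print(f"{low:.1%} to {high:.1%}")  # prints "33.0% to 59.6%"
```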

Note: I stress that the discussion above applies to random sampling from a stable population or process. You are wondering whether the value of $\theta$ may fluctuate according to some unspecified process, perhaps periodically. With such a vague specification of the problem, there is no guarantee that this answer (or any other one) can be reliably accurate.
