Minimum number of samples

Question

290 Views Asked by Bumbble Comm At 07 Apr 2026 - 7:28

First, I'd like to apologize in case the question is too simple or not well explained. Also, I'm not looking for a specific answer (as I understand the problem isn't well defined), just some information to start my research.

I have a set of data. It contains different numerical records from people that we want to split into different groups. Each group aims to be more specific than the previous one and to bring together people with similar characteristics.

GENERAL DATA
=============
P1 400
P2 355
P3 255
...
PN 650
=============
SAMPLES: 5000
AVG: 322

FILTERED DATA (only some records can be inside this group)
=============
P1 400
P3 255
...
PX 122
=============
SAMPLES: 455
AVG: 245

If we have 5000 samples in our data, one of these segmentations could have just 60. What I'd like to know is the minimum number of samples we need to be sure that the samples in the group represent reality in some way (I mean, that I have enough samples to have some statistical power).

I understand that I have to find work related to 'Sample size determination', but I don't really understand what kind of statistical distribution I'm working with.

What I really need to know is whether the average in the group is representative.

Thanks a lot!

Original Q&A

**Bumbble Comm** · Accepted Answer

In statistics, if you want to make inferences about a characteristic of the population ($N=5000$) from a smaller sample ($n=455$), two things matter: the sample size, which is what you are worried about, and the variance of the data. If there is very little variance in the data, a very small sample is enough to be confident that the sample average is representative of the population. If your data has a lot of variance, you will need a larger sample.

This isn't very helpful from a practical point of view, but it is the main trade-off. If you are familiar with significance/confidence levels, you might want to compute confidence intervals. These give you a lower and an upper bound around your sample mean $\bar{x}=\frac{1}{n}\sum_{i=1}^{n} P_i$, i.e., an interval in which the population mean likely lies, based on the data in your sample. For example, if your sample of $n=455$ has a mean of $\bar{x}=245$, then a 95% confidence interval might be $[245-30,245+30]$ (the actual width depends on the data). The narrower the interval, the more accurately your sample tells you something about the population. The width of the confidence interval depends, as mentioned above, on the sample size (the larger the sample, the narrower the interval) and on the variance of the data (the larger the variance, the wider the interval).
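As a rough illustration of the interval described above, here is a minimal Python sketch that uses the normal approximation (mean plus or minus $z$ times the standard error). The function name `mean_confidence_interval`, the default `z=1.96` for a 95% level, and the simulated group values are assumptions made for this example; they do not come from your data.

```python
import math
import random
import statistics


def mean_confidence_interval(values, z=1.96):
    """Return (low, high) for an approximate confidence interval of the mean.

    Assumes the values behave like a random sample and that n is large
    enough for the normal approximation; z=1.96 corresponds to about 95%.
    """
    n = len(values)
    mean = statistics.fmean(values)
    sd = statistics.stdev(values)          # sample standard deviation (n - 1)
    half_width = z * sd / math.sqrt(n)     # z times the standard error
    return mean - half_width, mean + half_width


# Hypothetical group of 455 records, simulated only to have something to run.
random.seed(0)
group = [random.gauss(245, 120) for _ in range(455)]

low, high = mean_confidence_interval(group)
print(f"mean = {statistics.fmean(group):.1f}, 95% CI = [{low:.1f}, {high:.1f}]")
```

Running the same function on a group of 60 records instead of 455 would give a visibly wider interval for the same spread, which is the trade-off in practice.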

If you are not familiar with confidence intervals, you will have to rely on rules of thumb. Sampling 10% of the population is usually good enough, provided that you draw your sample randomly; random sampling is the most crucial point in this endeavor. Other than that, more is better.
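If a rule of thumb feels too coarse, the dependence on sample size and variance can be turned into a rough minimum sample size. The following is only a sketch under the normal approximation. The symbols $s$ (sample standard deviation), $z$ (critical value, about $1.96$ for a 95% level) and $E$ (the margin of error you are willing to accept) are introduced here as assumptions and do not appear in the question.

$$
\mathrm{SE}(\bar{x}) = \frac{s}{\sqrt{n}}, \qquad E = z\,\frac{s}{\sqrt{n}} \quad\Longrightarrow\quad n \ge \left(\frac{z\,s}{E}\right)^{2}
$$

For instance, with a purely hypothetical $s=100$ and a target margin of $E=30$ at the 95% level, this gives $n \ge (1.96\cdot 100/30)^{2} \approx 42.7$, so about 43 records. The sketch ignores the finite-population correction and assumes the records in the group behave like a random sample.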
