I wrote some code to generate random weights of males (with low defined as 50 kg and high as 100 kg), then drew 100 samples of 100 measurements each, i.e. 100 samples, each containing 100 weights.
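Roughly, the generation step looks like this (a simplified sketch assuming NumPy's `randint` for the weights; the exact random function is an assumption on my part, so the numbers won't be reproduced exactly):

```python
import numpy as np

# Simplified sketch of the generation step: "low" and "high" are the bounds
# mentioned above, and each row is one sample of 100 weights.
low, high = 50, 100                 # kg
n_samples, sample_size = 100, 100

samples = np.random.randint(low, high, size=(n_samples, sample_size))

# One mean per sample -> the 100 sample means, rounded like the output below
sample_means = np.round(samples.mean(axis=1), 2)
print(list(sample_means))
```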
The mean weight of each of the 100 samples is:
Output:
[73.43, 73.96, 74.3, 76.68, 77.3, 75.43, 75.72, 73.27, 76.36, 73.72, 74.63, 74.2, 76.43, 71.48, 76.47, 75.74, 75.71, 74.14, 72.48, 74.53, 73.17, 74.75, 73.79, 75.03, 74.54, 74.65, 73.16, 74.84, 75.86, 73.69, 74.76, 74.83, 73.53, 73.95, 73.14, 76.37, 75.36, 74.62, 73.38, 76.84, 74.88, 73.58, 71.71, 75.24, 77.96, 75.38, 77.56, 73.85, 73.25, 75.09, 74.24, 75.12, 72.09, 74.63, 75.35, 72.76, 73.44, 74.82, 74.08, 72.91, 74.99, 75.56, 75.37, 75.27, 72.16, 74.5, 73.65, 73.1, 70.72, 74.92, 72.61, 72.7, 75.01, 72.75, 77.05, 73.29, 73.02, 74.9, 75.8, 75.88, 74.42, 76.83, 73.85, 72.9, 74.45, 73.86, 70.95, 75.05, 74.04, 73.84, 75.52, 73.52, 73.01, 75.01, 73.49, 73.74, 77.05, 74.87, 74.99, 74.17]
For this set of 100 sample means I then calculate:
- Mean: 74.4299
- Standard deviation: 1.4121603981134718
- Standard error: 0.14121603981134717
- Confidence_Interval_0.95: (74.1482851277224, 74.7115148722776)
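Those statistics come from something along these lines (a sketch assuming NumPy and SciPy; the exact interval function and `ddof` settings I used are not shown here, which can shift the numbers slightly):

```python
import numpy as np
from scipy import stats

# Re-create the 100 sample means (same sketch as above; the exact numbers will
# differ from my pasted output because the data are random).
sample_means = np.random.randint(50, 100, size=(100, 100)).mean(axis=1)

mean = sample_means.mean()
std = sample_means.std()                 # standard deviation of the 100 means
sem = std / np.sqrt(sample_means.size)   # standard error = std / sqrt(100) = std / 10

# One common way to get a 95% interval for the mean (scipy also offers
# stats.t.interval and stats.sem, whose defaults differ slightly)
ci_95 = stats.norm.interval(0.95, loc=mean, scale=sem)

print(mean, std, sem, ci_95)
```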
I am trying to understand what these numbers (standard deviation, standard error, and Confidence_Interval_0.95) represent in a practical way.
The standard error is the standard deviation divided by 10 (the square root of the sample size). But what exactly does it represent? The standard deviation shows how spread out the means are, but what does the standard error show? What is the difference between them?
The 95% rule for a normal distribution says that 95% of the values lie within 2 standard deviations of the mean, but:
Mean + 2*(standard deviation) is NOT equal to the upper bound of Confidence_Interval_0.95 (74.7115148722776).
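To make the mismatch concrete, here is the check I am referring to, using only the numbers from the output above:

```python
import math

# Plugging the numbers from my output into the "95% rule"
mean = 74.4299
std = 1.4121603981134718               # standard deviation of the 100 sample means
sem = std / math.sqrt(100)             # 0.1412..., i.e. std / 10
ci_upper = 74.7115148722776            # upper bound of Confidence_Interval_0.95

print(mean + 2 * std)                  # about 77.25
print(ci_upper)                        # 74.71... -> not the same number
```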
So what exactly is the relationship between these statistics?
Thanks