Finding the standard deviation from data grouped by intervals

1k Views Asked by user52950 At 02 Apr 2026 - 4:47

Starting Monthly Salary	Number of Graduates
1,001 - 1,400	1
1,401 - 1,800	11
1,801 - 2,200	14
2,201 - 2,600	38
2,601 - 3,000	36
Total	100

What I have done so far is take the midpoint of each interval in the Monthly Salary column, with the plan of calculating the deviations from the mean at the end.

Calculating all 100 of these values is tedious:

(1001 + 1400) / 2 = 1200.50

(1401 + 1800) / 2 = 1600.50

Continuing like this for all 100 values, in order to calculate the deviation for each...

Then I would calculate the standard deviation with $\sigma=\sqrt{\frac{\sum (x-\mu)^2}{N}}$. Once everything has been entered, this formula will look really nasty. Is there an easier (or cleaner) way to find the (approximate) standard deviation of the monthly starting salaries above?

Bumbble Comm On 27 Dec 2020 - 11:17

$$\sigma=\sqrt{\sum_i x_i^2\, p(x_i)-\Big(\sum_i x_i\, p(x_i)\Big)^2}$$

$x_i$	$p(x_i)$	$x_i p(x_i)$	$x_i^2 p(x_i)$
1,200.5	0.01	12.01	14,412.00
1,600.5	0.11	176.06	281,776.03
2,000.5	0.14	280.07	560,280.04
2,400.5	0.38	912.19	2,189,712.10
2,800.5	0.36	1,008.18	2,823,408.09
Total	1.00	2,388.50	5,869,588.25

$$\sigma=\sqrt{5{,}869{,}588.25-2388.5^2}=405.78$$
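
As a rough check, the table can be reproduced in a few lines of R (the language used in the other answer). The variable names `x`, `f`, `p`, `mu`, and `sigma` are chosen here purely for illustration; this is only a sketch of the divide-by-$n$ (population) formula, using the midpoints 1,200.5, ..., 2,800.5 and the frequencies from the question.

```r
# Class midpoints and frequencies taken from the salary table
x <- c(1200.5, 1600.5, 2000.5, 2400.5, 2800.5)
f <- c(1, 11, 14, 38, 36)

# Relative frequencies p(x_i)
p <- f / sum(f)

# Grouped mean and population standard deviation (divisor n)
mu    <- sum(x * p)
sigma <- sqrt(sum(x^2 * p) - mu^2)

round(mu, 2)     # 2388.5
round(sigma, 2)  # 405.78
```

Only five products are needed, one per interval, rather than one per graduate.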

**Bumbble Comm** · Accepted Answer

Your grouped data have midpoints $m_i$ with respective frequencies $f_i.$

m = c(12,16,20,24,28)*100  # interval midpoints, rounded to the nearest hundred
f = c(1,11,14,38,36)       # number of graduates in each interval

The sample mean is approximately $A =\bar X = \frac 1n\sum_{i=1}^5 f_im_i =2388,$ where $n = \sum_{i=1}^5 f_i = 100.$ Using R as a calculator:

n = sum(f); n        # [1] 100
a = sum(f*m)/n; a    # [1] 2388

The sample variance is $S^2 \approx \frac{1}{n-1}\sum_{i=1}^5 f_i(m_i-\bar X)^2 =166\,319.2$ and the sample standard deviation is $S =\sqrt{S^2} = 407.82.$

v = sum(f*(m-a)^2)/(n-1);  v    # [1] 166319.2
s = sqrt(v);  s                 # [1] 407.8225
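
The value 407.82 differs slightly from the 405.78 in the other answer only because of the divisor: dividing the same sum of squared deviations by $n$ instead of $n-1$ gives the population version. As a quick check (the symbol $\sigma$ for the divide-by-$n$ result is borrowed from that answer):

$$\sum_{i=1}^5 f_i(m_i-\bar X)^2 = 16\,465\,600, \qquad \sigma=\sqrt{\frac{16\,465\,600}{100}}\approx 405.78, \qquad S=\sqrt{\frac{16\,465\,600}{99}}\approx 407.82.$$

Shifting every midpoint by 0.5 (1,200.5 instead of 1,200) changes the mean but not the deviations from it, so it has no effect on either result.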

If you have the original $n = 100$ salaries in some kind of spreadsheet, there is probably a built-in function for finding the mean and variance of a column of numbers. If so, you could find exact values of the sample mean and standard deviation. (Some information is lost when data are put into groups and summarized.) Without the original values, "data" reconstructed from the midpoints and frequencies simply reproduce the grouped results:

x = rep(m, times=f)  # 'data' reconstructed from m & f
mean(x);  sd(x)
# [1] 2388
# [1] 407.8225

cutp = seq(1000, 3000, by=400)  # interval boundaries
hist(x, breaks=cutp, ylim=c(0,45),
     col="skyblue2", labels=TRUE)  # labels=TRUE prints the count above each bar
