Calculate sample variance from tallied data

Hello friends, I am studying statistics on my own and am looking for someone to prompt me toward how to approach questions of this nature. I am used to frequency tables, but not this kind. The table below shows the demand for a product over a span of 310 days.

+----------------+----+----+----+-----+----+----+
| Product demand |  5 |  6 |  7 |   8 |  9 | 10 |
+----------------+----+----+----+-----+----+----+
| No. of days    | 20 | 60 | 80 | 120 | 20 | 10 |
+----------------+----+----+----+-----+----+----+

How can I calculate the sample variance of the above product demand data?

Best answer

I take it that the demand value 5 occurs on 20 days; demand value 6 on 60 days; and so on.

Then the frequencies $f_1 = 20,\; f_2 = 60,$ and so on apply to the values $v_1 = 5,\; v_2 = 6,$ and so on. In that case, the sample size is $n = \sum_{i=1}^6 f_i,$ the sample mean is $\bar X = \frac{1}{n} \sum_{i=1}^6 f_i v_i,$ and the sample variance is $$s^2 = \frac{1}{n-1}\sum_{i=1}^6 f_i(v_i - \bar X)^2.$$
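
As a sketch of the arithmetic for the table in the question (rounded to four or five significant figures, so the last digits are approximate):

$$n = 20 + 60 + 80 + 120 + 20 + 10 = 310,$$

$$\sum_{i=1}^6 f_i v_i = 100 + 360 + 560 + 960 + 180 + 100 = 2260, \qquad \bar X = \frac{2260}{310} \approx 7.2903,$$

$$\sum_{i=1}^6 f_i (v_i - \bar X)^2 \approx 403.871, \qquad s^2 \approx \frac{403.871}{309} \approx 1.3070.$$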

You may find similar formulas for approximating $\bar X$ and $s^2$ from a frequency histogram, where $f_i$ is the height of the $i$th bar and $m_i = v_i$ is the midpoint of the $i$th bar. Thus the assumption behind the approximation is that all observations in the interval for the $i$th bar are located at the midpoint of that interval.
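
Written out, the histogram versions are the same formulas with midpoints in place of exact values. The symbol $k$ for the number of bars is notation assumed here for illustration; it is not named in the text above.

$$n = \sum_{i=1}^{k} f_i, \qquad \bar X \approx \frac{1}{n}\sum_{i=1}^{k} f_i m_i, \qquad s^2 \approx \frac{1}{n-1}\sum_{i=1}^{k} f_i (m_i - \bar X)^2.$$

For the demand data each value is its own bin, so $m_i = v_i$ and these expressions are exact rather than approximate.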

As a check, you might try using your summarized data to make a 6-bin histogram. Then use the formula above to find $\bar X.$ Does it seem that the histogram 'balances' at $\bar X ?$
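
One possible way to carry out this check in R is sketched below. The bin edges at 4.5, 5.5, ..., 10.5 are an assumption chosen so that each demand value sits at the midpoint of its own bin; the 'balance' idea corresponds to the frequency-weighted deviations from $\bar X$ summing to zero.

```r
# Demand values and the number of days on which each was observed
v <- c(5, 6, 7, 8, 9, 10)
f <- c(20, 60, 80, 120, 20, 10)

# Expand the tallies into the 310 individual observations
x <- rep(v, times = f)

# 6-bin histogram; the break points are an assumed choice that centers
# each bin on one demand value
h <- hist(x, breaks = seq(4.5, 10.5, by = 1), plot = FALSE)
h$mids     # bin midpoints, equal to v
h$counts   # bar heights, equal to f

# Mean computed from the histogram summary
x.bar <- sum(h$counts * h$mids) / sum(h$counts)

# Balance check: weighted deviations about the mean sum to (numerically) zero
balance <- sum(h$counts * (h$mids - x.bar))
balance

# Draw the histogram and mark the balance point
plot(h, main = "Product demand", xlab = "Demand")
abline(v = x.bar, lty = 2)
```

The dashed vertical line marks $\bar X \approx 7.29$; if the bars were weights on a beam, this is where the beam would balance.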

The following brief session in R statistical software illustrates some of these concepts.

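 v = c(5, 6, 7, 8, 9, 10)          # demand values
 f = c(20, 60, 80, 120, 20, 10)    # number of days at each value
 n = sum(f);  n                    # sample size
 ## 310
 x.bar = sum(f*v)/n;  x.bar        # sample mean
 ## 7.290323
 s2 = sum(f*(v-x.bar)^2)/(n-1);  s2   # sample variance (named s2 so it does not mask var())
 ## 1.307026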

 x = rep(v, times=f);  x
    [1]  5  5  5  5  5  5  5  5  5  5  5  5  5  5  5  5  5  5  5  5  6  6  6  6  6
   [26]  6  6  6  6  6  6  6  6  6  6  6  6  6  6  6  6  6  6  6  6  6  6  6  6  6
   [51]  6  6  6  6  6  6  6  6  6  6  6  6  6  6  6  6  6  6  6  6  6  6  6  6  6
   [76]  6  6  6  6  6  7  7  7  7  7  7  7  7  7  7  7  7  7  7  7  7  7  7  7  7
  [101]  7  7  7  7  7  7  7  7  7  7  7  7  7  7  7  7  7  7  7  7  7  7  7  7  7
  [126]  7  7  7  7  7  7  7  7  7  7  7  7  7  7  7  7  7  7  7  7  7  7  7  7  7
  [151]  7  7  7  7  7  7  7  7  7  7  8  8  8  8  8  8  8  8  8  8  8  8  8  8  8
  [176]  8  8  8  8  8  8  8  8  8  8  8  8  8  8  8  8  8  8  8  8  8  8  8  8  8
  [201]  8  8  8  8  8  8  8  8  8  8  8  8  8  8  8  8  8  8  8  8  8  8  8  8  8
  [226]  8  8  8  8  8  8  8  8  8  8  8  8  8  8  8  8  8  8  8  8  8  8  8  8  8
  [251]  8  8  8  8  8  8  8  8  8  8  8  8  8  8  8  8  8  8  8  8  8  8  8  8  8
  [276]  8  8  8  8  8  9  9  9  9  9  9  9  9  9  9  9  9  9  9  9  9  9  9  9  9
  [301] 10 10 10 10 10 10 10 10 10 10
 mean(x)
 ## 7.290323
 var(x)
 ## 1.307026
 table(x)
 x
   5   6   7   8   9  10 
  20  60  80 120  20  10 
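
If the same calculation is needed for other tallied data, the steps of the session can be wrapped in a small function. This is only a sketch: `freq_stats` is a hypothetical helper name, not a function from base R or from the answer above.

```r
# Hypothetical helper: sample size, mean, and sample variance
# computed from values v and their frequencies f
freq_stats <- function(v, f) {
  stopifnot(length(v) == length(f), all(f >= 0), sum(f) > 1)
  n <- sum(f)
  x_bar <- sum(f * v) / n
  s2 <- sum(f * (v - x_bar)^2) / (n - 1)
  list(n = n, mean = x_bar, var = s2)
}

v <- c(5, 6, 7, 8, 9, 10)
f <- c(20, 60, 80, 120, 20, 10)
res <- freq_stats(v, f)
print(res)

# Cross-check against the expanded sample, as in the session above
x <- rep(v, times = f)
all.equal(res$var, var(x))
```

The cross-check at the end compares the frequency-weighted result with `var()` applied to the 310 expanded observations.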
