Standard Deviation of a Set of Data


When setting up a graph based on the standard deviation, most people nowadays just use Excel, but I was wondering whether there is a way to calculate the distribution curve manually.

For clarification, I know how to calculate the standard deviation, but if I wanted to draw the bell-shaped curve corresponding to it, how would I do so?


I'm not exactly sure what you're looking for, but this might help. Suppose I generate 500 fake observations from $\mathsf{Norm}(\mu = 100, \sigma=15)$ and round them to the nearest integer. In R, it might look like this:

set.seed(1234)  # use same 'set.seed' statement to get same data
x = round(rnorm(500, 100, 15))
a = mean(x);  s = sd(x);  a;  s
## 100.016
## 15.53326

The sample mean is $\bar X = 100.02$ (not a bad estimate of $\mu$) and the sample standard deviation is $S = 15.53$ (not a bad estimate of $\sigma$). Without a calculator or software, getting $\bar X$ and $S$ would be a bit tedious. (I did not print out the 500 data values because they would take up a lot of space here.)
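For reference, the quantities that mean and sd return in the code above are the usual sample formulas. Note the $n-1$ divisor in the standard deviation, which is what R's sd uses:

$$\bar X = \frac{1}{n}\sum_{i=1}^{n} X_i, \qquad S = \sqrt{\frac{1}{n-1}\sum_{i=1}^{n}\left(X_i - \bar X\right)^2}.$$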

Then you can use the equation of the density function $f(x)$ of $\mathsf{Norm}(\hat\mu = 100.02,\, \hat\sigma = 15.53),$ with sample estimates taking the place of the unknown population parameters, to find a few points of $$f(x|\hat\mu,\hat\sigma) = \frac{1}{\hat\sigma\sqrt{2\pi}}\exp\left[-\frac{1}{2}\left(\frac{x-\hat\mu}{\hat\sigma}\right)^2\right].$$ Perhaps the $x$-values would be something like $x = 60, 80, 100, 120, 140$. Then connect the dots with a smooth curve that should look a lot like a normal density curve.
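As a rough illustration of what one such hand calculation involves, here is the point at $x = 80$, using the rounded estimates $\hat\mu = 100.02$ and $\hat\sigma = 15.53$ (intermediate values are rounded, so the last digit may differ slightly from software output):

$$z = \frac{80 - 100.02}{15.53} \approx -1.289, \qquad -\tfrac{1}{2}z^2 \approx -0.831, \qquad e^{-0.831} \approx 0.436,$$

$$\hat\sigma\sqrt{2\pi} \approx 15.53 \times 2.5066 \approx 38.93, \qquad f(80) \approx \frac{0.436}{38.93} \approx 0.0112.$$

At the peak, $x = 100$ is almost exactly $\hat\mu$, so the exponential factor is essentially 1 and $f(100) \approx 1/38.93 \approx 0.0257$. By symmetry about $\hat\mu$, the values at $x = 120$ and $x = 140$ nearly mirror those at $x = 80$ and $x = 60$.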

In the figure below, I used R to make a histogram for reference and to show the individual data values as tick marks below the axis (some of them overplotted because of rounding to integers). Then I plotted the density function of $\mathsf{Norm}(\hat\mu = 100.02, \hat\sigma = 15.53)$ on top of the histogram. (The suggested plotting points are shown as heavy dots.)

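[Figure: histogram of the 500 observations with tick marks for the individual values, the fitted normal density curve in red, and the five plotting points as heavy dots]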

With only $n = 500$ points, you can't expect the histogram to match the approximating normal curve closely. That's why you get a better normal curve by computing several plotting points from the equation of the density function than by tracing the histogram.

So, to answer your question: it is possible to plot the curve by hand as you suggest. Before the computer age, people often did just that. But it is a bit of work, and nowadays most people prefer to use software. Even so, it is a good idea to understand what the software is doing.

In case you are interested in the R code used to make the figure, I'm including it below. The function rnorm generates normal data, and dnorm is the normal density function.

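# density-scale histogram with tick marks for the individual values
hist(x, prob=TRUE, col="skyblue2");  rug(x)
# overlay the fitted normal density curve
curve(dnorm(x, 100.02, 15.53), lwd=2, col="red", add=TRUE)
# suggested plotting points and their density values
xx = c(60, 80, 100, 120, 140)
yy = dnorm(xx, 100.02, 15.53);  round(yy, 3)
## 0.001 0.011 0.026 0.011 0.001
points(xx, yy, pch=19)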
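
To see what dnorm is doing, here is a minimal sketch that evaluates the density formula directly and compares it with the built-in function. The helper name normal_density is my own, not part of R, and the estimates are the rounded values used above.

```r
# Hand-coded normal density, written to match the formula in the answer.
# 'normal_density' is a hypothetical helper name, not a base R function.
normal_density <- function(x, mu, sigma) {
  z <- (x - mu) / sigma                      # standardize
  exp(-0.5 * z^2) / (sigma * sqrt(2 * pi))   # density value
}

mu_hat    <- 100.02   # rounded sample mean from above
sigma_hat <- 15.53    # rounded sample standard deviation from above
xx <- c(60, 80, 100, 120, 140)

yy_manual  <- normal_density(xx, mu_hat, sigma_hat)
yy_builtin <- dnorm(xx, mu_hat, sigma_hat)

round(yy_manual, 3)               # compare with the rounded values listed above
all.equal(yy_manual, yy_builtin)  # checks agreement up to numerical tolerance
```

Each entry of yy_manual is the kind of value you would compute by hand for one plotting point, so the same five numbers can be plotted on graph paper and joined with a smooth curve.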