Normally distribute dataset


I have the following dataset of values 0 to 318 that I'm looking to transform into a normal distribution curve. The data currently represents an inverse curve. What is the best technique to normalize this data?

Sample dataset:

{1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,100,118,120,145,200,231,318}

There is 1 solution below


Comment (continued): Trying out the lognormal idea in R, pending more solid information.

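# Data from the question
x <- c(1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,
       100,118,120,145,200,231,318)
y <- log(x)                  # if x were lognormal, y would be normal
a <- mean(y);  s <- sd(y);  a;  s
## 2.908318
## 1.433811
par(mfrow = c(2, 1))         # two panels, stacked
  hist(y, prob = TRUE, col = "skyblue2");  rug(y)
     # fitted normal density; here x is curve()'s plotting variable, not the data
     curve(dnorm(x, a, s), lwd = 2, col = "maroon", add = TRUE)
  qqnorm(y)                  # normal Q-Q plot of the logged data
par(mfrow = c(1, 1))         # restore the single-panel layout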

[Figure: histogram of the logged data with the fitted normal density, above a normal Q-Q plot of the logged data]

There is a huge gap in the original data between 23 and 100. With so few observations, it is unrealistic to expect a good fit to a normal density, but even allowing for that, the normal Q-Q plot is strange. Have you given me a partial data set and left out the $\dots$?
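One way to put rough numbers on these impressions is a Shapiro-Wilk test on the logged data, plus a check of where the largest gap falls on the log scale. This is only a sketch and was not part of the session above; `shapiro.test` is in base R's stats package, and the names `sw`, `gaps`, and `i` are purely illustrative.

```r
x <- c(1:23, 100, 118, 120, 145, 200, 231, 318)   # data from the question
y <- log(x)

# Shapiro-Wilk test of normality on the logged values
sw <- shapiro.test(y)
sw$statistic   # W close to 1 is consistent with normality
sw$p.value     # a small p-value is evidence against normality

# Spacing between consecutive sorted log values
gaps <- diff(sort(y))
i <- which.max(gaps)           # position of the largest gap
exp(sort(y)[c(i, i + 1)])      # original-scale values on either side of it
```

With only 30 observations the test has limited power, so it is best read alongside the plots rather than instead of them.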

Various Q-Q plots from Minitab 17 (c11 has logged data, Q-Q plot axes reversed from those of R):

[Figure: Minitab 17 Q-Q plots]
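To compare R's plot with an axes-reversed layout, `qqnorm` accepts a `datax` argument that puts the data on the horizontal axis. A minimal sketch, assuming the axis swap is the main difference in orientation (Minitab's scaling and reference bands are not reproduced here):

```r
x <- c(1:23, 100, 118, 120, 145, 200, 231, 318)   # data from the question
y <- log(x)

# Normal Q-Q plot with the data on the horizontal axis
qq <- qqnorm(y, datax = TRUE)
qqline(y, datax = TRUE, col = "maroon")   # reference line through the quartiles
```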