Given are particle sizes and corresponding percentages of the total volume. Data:
size vol%
0.594 0.03
0.675 0.11
0.872 0.25
0.991 0.34
1.28 0.55
1.45 0.65
1.88 0.89
2.13 1.04
2.75 1.44
3.12 1.66
4.03 2.16
4.58 2.43
5.92 3.03
6.72 3.35
8.68 4.05
9.86 4.41
12.7 5.09
14.5 5.38
18.7 5.7
21.2 5.67
27.4 5.15
31.1 4.66
40.1 3.38
45.6 2.68
58.9 1.42
66.9 0.92
86.4 0.18
98.1 0.03
Plot using Excel:
Reading up on particle size distribution on WP, I found that this kind of data usually follows a lognormal or Weibull distribution. So I followed some YT tutorials on checking whether that is the case and arrived at this:
So it's not perfect (2nd is Weibull) but I want to follow through with it if possible. However, when I extract the $\mu$ and $\sigma$ from the equation of the regression line, I don't get the original distribution (plot using WA):
Is my approach/are $\mu$ and $\sigma$ correct?
Could I just take $\frac{1}{n}\sum_1^n x\cdot f(x)$ for $\mu$, and from that then calculate $\sigma$ as I would for a discrete distribution using $\sqrt{(\frac{1}{n}\sum_1^n (\mu-x)^2)}$?





It's difficult to say whether you are doing the estimation correctly based on what you have provided. Lognormal model fitting is very sensitive to outliers. The usual approach is to transform the data to the log scale and fit a normal distribution to the transformed data. The resulting values of $\mu$ and $\sigma$ are then used to calculate the lognormal mean and variance. If $Y = e^X$ is lognormal, where $X$ is normal with mean $\mu$ and variance $\sigma^2$, then
$$\operatorname{E}[Y] = e^{\mu + \sigma^2/2}, \quad \operatorname{Var}[Y] = (e^{\sigma^2} - 1)e^{2 \mu + \sigma^2}.$$
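As a rough illustration of the log-scale approach, here is a minimal Python sketch. It rests on assumptions that are mine, not from your post: each vol% value is treated as the weight of its size class, the weights are renormalized by their sum (the listed values need not add up to 100), and the function name `lognormal_moments` is just a placeholder. It estimates $\mu$ and $\sigma$ as the weighted mean and standard deviation of $\ln(\text{size})$ and then plugs them into the two formulas above.

```python
import math

# (size, vol%) pairs copied from the table in the question
data = [
    (0.594, 0.03), (0.675, 0.11), (0.872, 0.25), (0.991, 0.34),
    (1.28, 0.55), (1.45, 0.65), (1.88, 0.89), (2.13, 1.04),
    (2.75, 1.44), (3.12, 1.66), (4.03, 2.16), (4.58, 2.43),
    (5.92, 3.03), (6.72, 3.35), (8.68, 4.05), (9.86, 4.41),
    (12.7, 5.09), (14.5, 5.38), (18.7, 5.7), (21.2, 5.67),
    (27.4, 5.15), (31.1, 4.66), (40.1, 3.38), (45.6, 2.68),
    (58.9, 1.42), (66.9, 0.92), (86.4, 0.18), (98.1, 0.03),
]


def lognormal_moments(pairs):
    """Weighted log-scale estimates of mu and sigma, plus E[Y] and Var[Y].

    Assumption: each weight is the share of volume at that size, so the
    weights are normalized by their sum before use.
    """
    total = sum(w for _, w in pairs)
    # Weighted mean and variance of ln(size)
    mu = sum(w * math.log(x) for x, w in pairs) / total
    var = sum(w * (math.log(x) - mu) ** 2 for x, w in pairs) / total
    sigma = math.sqrt(var)
    # Lognormal mean and variance from the formulas above
    mean_y = math.exp(mu + var / 2)
    var_y = (math.exp(var) - 1) * math.exp(2 * mu + var)
    return mu, sigma, mean_y, var_y


if __name__ == "__main__":
    mu, sigma, mean_y, var_y = lognormal_moments(data)
    print(f"mu={mu:.4f} sigma={sigma:.4f} E[Y]={mean_y:.4f} Var[Y]={var_y:.4f}")
```

Note that the weighting is by $\ln x$, not by $x$: averaging the raw sizes gives an estimate of $\operatorname{E}[Y]$, not of $\mu$.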
A fit based on a small amount of data is likely to be very poor. Based on the regression plot you have included, I would say that the data is not lognormally distributed. If you include a Weibull plot, I would be able to see whether the Weibull fit is actually better.
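For comparing the two candidate models numerically, the following Python sketch builds both probability plots from the table and reports the $R^2$ of each regression line. The assumptions here are mine: the cumulative fraction is the running sum of vol% divided by the total, each cumulative value is assigned to the listed size (whether that size is a bin edge or a bin center is not stated in the question), and the last point is dropped because its cumulative fraction is 1, where both transforms diverge. The helper names are placeholders.

```python
import math
from statistics import NormalDist

# Sizes and vol% copied from the table in the question
sizes = [0.594, 0.675, 0.872, 0.991, 1.28, 1.45, 1.88, 2.13, 2.75, 3.12,
         4.03, 4.58, 5.92, 6.72, 8.68, 9.86, 12.7, 14.5, 18.7, 21.2,
         27.4, 31.1, 40.1, 45.6, 58.9, 66.9, 86.4, 98.1]
vol = [0.03, 0.11, 0.25, 0.34, 0.55, 0.65, 0.89, 1.04, 1.44, 1.66,
       2.16, 2.43, 3.03, 3.35, 4.05, 4.41, 5.09, 5.38, 5.7, 5.67,
       5.15, 4.66, 3.38, 2.68, 1.42, 0.92, 0.18, 0.03]


def cumulative_fractions(weights):
    """Running sum of the weights, normalized so the last value is 1."""
    total = sum(weights)
    running, out = 0.0, []
    for w in weights:
        running += w
        out.append(running / total)
    return out


def linear_fit(xs, ys):
    """Ordinary least squares line; returns (slope, intercept, r_squared)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, my - slope * mx, sxy ** 2 / (sxx * syy)


def probability_plot_fits(sizes, weights):
    """Fit straight lines to the lognormal and Weibull probability plots."""
    cdf = cumulative_fractions(weights)[:-1]   # drop the point where F = 1
    log_x = [math.log(x) for x in sizes[:-1]]
    # Lognormal plot: z = (ln x - mu) / sigma, so slope = 1/sigma
    z = [NormalDist().inv_cdf(p) for p in cdf]
    a, b, r2_ln = linear_fit(log_x, z)
    # Weibull plot: ln(-ln(1 - F)) = k ln x - k ln(lambda), so slope = k
    w = [math.log(-math.log(1 - p)) for p in cdf]
    k, c, r2_wb = linear_fit(log_x, w)
    return {
        "lognormal": {"mu": -b / a, "sigma": 1 / a, "r2": r2_ln},
        "weibull": {"shape": k, "scale": math.exp(-c / k), "r2": r2_wb},
    }


if __name__ == "__main__":
    for name, params in probability_plot_fits(sizes, vol).items():
        print(name, {key: round(val, 4) for key, val in params.items()})
```

A higher $R^2$ on one plot only suggests that model lines up better with these points; looking at where the points bend away from the line is still more informative than the single number.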