Comment on the plots of two fitted densities on a histogram

What comments can I make about the following plot? It shows two fitted densities, one with parameters estimated by MLE and the other by MME, both computed from a data set that follows a gamma distribution. Will "both methods give almost the same plot" be enough as a description?

There are 1 best solutions below

Comment continued:

When the sample size is large, any consistent estimator of the parameters should give a good estimate of the population density.

Below we use R to sample $n = 100\,000$ observations at random from $\mathsf{Gamma}(\mathrm{shape} = \alpha = 4, \mathrm{rate}= \lambda = 0.1).$ For this distribution, $\mu = \alpha/\lambda$ and $\sigma^2 = \alpha/\lambda^2.$ Equating these to the sample mean $\bar X$ and sample variance $S^2$ and solving gives the method-of-moments estimators (MMEs) $\tilde \alpha = \bar X^2/S^2$ and $\tilde\lambda = \bar X/ S^2.$
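For readers who want the algebra, the moment-matching step can be written out as follows. This is only a short derivation from the two moment formulas just stated, with $\bar X$ and $S^2$ standing in for $\mu$ and $\sigma^2$:

$$
\begin{aligned}
\bar X = \frac{\tilde\alpha}{\tilde\lambda}, \quad S^2 = \frac{\tilde\alpha}{\tilde\lambda^2}
&\;\Longrightarrow\; \frac{\bar X}{S^2} = \frac{\tilde\alpha/\tilde\lambda}{\tilde\alpha/\tilde\lambda^2} = \tilde\lambda, \\
\tilde\alpha = \tilde\lambda\,\bar X
&\;\Longrightarrow\; \tilde\alpha = \frac{\bar X^2}{S^2}.
\end{aligned}
$$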

In particular, for our sample $\tilde \alpha = 3.9845,$ which is very close to the population value $\alpha = 4,$ and $\tilde \lambda = 0.09938,$ which is very close to $\lambda = 0.1.$

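set.seed(519)
x = rgamma(10^5, 4, .1)   # n = 100,000 draws; shape = 4, rate = 0.1
mean(x);  var(x)          # sample mean and sample variance
[1] 40.09246
[1] 403.4163
shape = mean(x)^2/var(x); shape   # MME of the shape
[1] 3.984484
rate = mean(x)/var(x);  rate      # MME of the rate
[1] 0.09938236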

Therefore, we should expect the density function (solid orange curve) of $\mathsf{Gamma}(\tilde \alpha, \tilde \lambda)$ to fit the histogram of the data quite well, which it does. [The fit would look better if I had used more bins to make the histogram, but then the tops of the bars would obscure the density curve.]

However, the default kernel density estimate (KDE) from R's procedure density (dashed purple) is almost identical to the density curve based on the MMEs, within the resolution of the graph.

hist(x, prob=TRUE, col="skyblue2")                       # histogram on the density scale
curve(dgamma(x, shape, rate), add=TRUE, col="orange")    # gamma density at the MMEs
lines(density(x), lwd=2, col="purple", lty="dashed")     # default KDE

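[Figure: histogram of the 100,000 observations with the MME-based gamma density (solid orange) and the default KDE (dashed purple)]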

This demonstration is not to denigrate MMEs (much less MLEs), but to show that KDEs based on large samples can provide extremely good estimates of density functions.
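Since the question also asks about MLEs, here is a minimal sketch of how they might be computed for the same simulated sample and set beside the MMEs. It assumes the standard gamma likelihood equations: the shape MLE solves $\log\alpha - \psi(\alpha) = \log \bar X - \overline{\log X}$ (with $\psi$ the digamma function), and the rate MLE is then the shape MLE divided by $\bar X$. The names `s`, `score`, `shape_mle`, `rate_mle`, `shape_mme` and `rate_mme` are illustrative only and not from the original answer, and the root-finding interval is an arbitrary choice.

```r
# Sketch: maximum likelihood estimates for the simulated gamma sample.
set.seed(519)
x <- rgamma(10^5, shape = 4, rate = 0.1)   # same call as above, with named arguments

# s = log(mean) - mean(log); positive unless all observations are equal
s <- log(mean(x)) - mean(log(x))

# The MLE of the shape is the root of log(a) - digamma(a) - s
score <- function(a) log(a) - digamma(a) - s
shape_mle <- uniroot(score, interval = c(1e-3, 1e3), tol = 1e-10)$root
rate_mle  <- shape_mle / mean(x)           # MLE of the rate, given the shape

# Method-of-moments estimates, for comparison
shape_mme <- mean(x)^2 / var(x)
rate_mme  <- mean(x) / var(x)

c(shape_mle = shape_mle, rate_mle = rate_mle,
  shape_mme = shape_mme, rate_mme = rate_mme)
```

With a sample this large, both pairs of estimates should land close to the population values, which is why the two fitted curves in the question are hard to tell apart.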


Note: With only $n = 500$ observations, we have $\tilde\alpha = 3.93, \tilde \lambda = 0.097,$ and the plot is as below. The dotted black line is for the true population density (known here only because this is a simulation experiment). Tick marks along the horizontal axis show locations of data values.

The density based on MMEs and the KDE are presented as in the plot above. For samples of small or moderate size, as here, the KDE is often nearer to the shape of the histogram and the curve based on parameter estimation is often nearer to the actual population density.

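[Figure: histogram of the 500 observations with the MME-based density (solid orange), the KDE (dashed purple), the true population density (dotted black), and tick marks at the data values]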
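A plot of this kind could be produced roughly as follows. This is only a sketch: the seed used for the $n = 500$ sample is not given, so the seed below is an arbitrary assumption and the estimates will not match the values quoted in the note. The use of `rug` for the tick marks and the names `y`, `shape_y` and `rate_y` are likewise assumptions made for illustration.

```r
# Sketch of the small-sample comparison; the seed is arbitrary (an assumption).
set.seed(2024)
y <- rgamma(500, shape = 4, rate = 0.1)

shape_y <- mean(y)^2 / var(y)   # MME of the shape
rate_y  <- mean(y) / var(y)     # MME of the rate

hist(y, freq = FALSE, col = "skyblue2", main = "n = 500")
rug(y)                                                         # tick marks at the data values
curve(dgamma(x, shape_y, rate_y), add = TRUE, col = "orange")  # density based on the MMEs
lines(density(y), lwd = 2, col = "purple", lty = "dashed")     # default KDE
curve(dgamma(x, 4, 0.1), add = TRUE, lty = "dotted")           # true population density
```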