This is my first question here. Please understand even if my question is not very clear.
I am trying to calculate skewness and kurtosis directly from a probability density function (PDF), without access to the original data.
I have many data sets. I built a PDF from each data set and then averaged these into one PDF.
My goal is to find the skewness and kurtosis of this averaged PDF. I first tried this in Python, but I realized that this is a mathematical problem rather than a programming problem.
I know that it may be very difficult or impossible to get the moments without original data set.
Is there any possible solution for this issue such as back calculation from PDF to original data?
Any idea or help would be really appreciated.
Thank you,
Hoonill
Even without a closed-form PDF, you can calculate the skewness from your data alone, without estimating the PDF at all (an estimated PDF is likely to introduce a high level of error).
http://en.wikipedia.org/wiki/Skewness#Sample_skewness
Using a method of moments estimator of skewness you can calculate:
$\hat\gamma = \frac{\frac{1}{n}\sum_{i=1}^n (x_i - \overline{x})^3}{\left[\frac{1}{n-1}\sum_{i=1}^n (x_i - \overline{x})^2\right]^{\frac{3}{2}}}$
Where $\overline{x}$ is your sample mean.
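As a minimal sketch of this estimator in plain Python (the function name `sample_skewness` is just an illustrative choice, and `xs` is assumed to be a list of raw observations):

```python
def sample_skewness(xs):
    """Method-of-moments skewness: m3 / s^3, with s^2 using the 1/(n-1) factor."""
    n = len(xs)
    if n < 2:
        raise ValueError("need at least two observations")
    mean = sum(xs) / n
    # Third central moment, divided by n
    m3 = sum((x - mean) ** 3 for x in xs) / n
    # Sample variance, divided by n - 1
    s2 = sum((x - mean) ** 2 for x in xs) / (n - 1)
    if s2 == 0:
        raise ValueError("skewness is undefined for constant data")
    return m3 / s2 ** 1.5
```

For example, `sample_skewness([1, 1, 1, 10])` is positive because the long tail is on the right, while perfectly symmetric data gives zero.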
To answer your other question:
You could generate a large sample from your newly calculated PDF, and calculate the skewness using the metric above, but you would be adding a lot of variability and error.
Similarly, for the kurtosis, you can use the following statistic:
http://en.wikipedia.org/wiki/Kurtosis#Sample_kurtosis
$\hat\beta = \frac{\frac{1}{n}\sum_{i=1}^n (x_i - \overline{x})^4}{\left[\frac{1}{n-1}\sum_{i=1}^n (x_i - \overline{x})^2\right]^{2}} - 3$
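The kurtosis statistic can be sketched the same way in plain Python. The name `sample_excess_kurtosis` is illustrative, and the variance here uses the 1/(n-1) factor, as in the statistic above; other conventions divide by n instead.

```python
def sample_excess_kurtosis(xs):
    """Excess kurtosis: m4 / s^4 - 3, with s^2 using the 1/(n-1) factor."""
    n = len(xs)
    if n < 2:
        raise ValueError("need at least two observations")
    mean = sum(xs) / n
    # Fourth central moment, divided by n
    m4 = sum((x - mean) ** 4 for x in xs) / n
    # Sample variance, divided by n - 1
    s2 = sum((x - mean) ** 2 for x in xs) / (n - 1)
    if s2 == 0:
        raise ValueError("kurtosis is undefined for constant data")
    # Subtracting 3 makes a normal distribution score roughly zero
    return m4 / s2 ** 2 - 3
```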
In order to simulate a sample, you need to take your empirical pdf and convert it to a cdf. Once you do that, you will want to create an inverse of your cdf.
If you use Python to draw a very large sample (the larger the better) of uniform random variables on $[0, 1]$ and feed them into your inverse CDF, you will have generated a random sample from your empirical PDF.
From there you can use the statistics above.
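The simulation steps above could look roughly like this with NumPy. This is only a sketch under some assumptions: the averaged PDF is available as values `pdf` on an increasing grid `x` (both names are hypothetical), the CDF is built with the trapezoidal rule, and the inverse CDF is approximated by linear interpolation, which itself adds some discretization error.

```python
import numpy as np

def sample_from_pdf(x, pdf, size, seed=None):
    """Draw `size` values from a PDF tabulated on an increasing grid `x`."""
    x = np.asarray(x, dtype=float)
    pdf = np.asarray(pdf, dtype=float)
    # PDF -> CDF by cumulative trapezoidal integration
    steps = 0.5 * (pdf[1:] + pdf[:-1]) * np.diff(x)
    cdf = np.concatenate(([0.0], np.cumsum(steps)))
    # Normalize so the CDF ends at exactly 1, even if the PDF was not normalized
    cdf /= cdf[-1]
    # Uniform(0, 1) draws pushed through the interpolated inverse CDF
    rng = np.random.default_rng(seed)
    u = rng.uniform(0.0, 1.0, size)
    return np.interp(u, cdf, x)

# Usage: a standard normal shape on a grid, then the moment statistics
grid = np.linspace(-5.0, 5.0, 2001)
density = np.exp(-0.5 * grid ** 2)
draws = sample_from_pdf(grid, density, size=200_000, seed=0)
dev = draws - draws.mean()
s2 = dev.var(ddof=1)  # sample variance with the 1/(n-1) factor
skewness = np.mean(dev ** 3) / s2 ** 1.5
excess_kurtosis = np.mean(dev ** 4) / s2 ** 2 - 3
```

A finer grid and a larger sample both reduce the error, but the result is still only as good as the estimated PDF itself.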
Hope that helps