In an experiment, I ask participants to rate qualities on a continuous scale. I expect the results to be normally distributed, and I am confident that assuming a normal distribution works fairly well for most values. Nevertheless, I wonder whether it is necessary to assume a skew normal for very high and very low ratings on the finite scales. Here are a few questions:
- Do you think it makes sense in practice to assume a skew normal? What are the benefits/drawbacks of this here? I only need the expected value.
- How do I calculate the parameters of a skew normal in this setting? What is meant by "there is no closed form expression" in the corresponding Wikipedia article (http://en.wikipedia.org/wiki/Skew_normal_distribution)?
- Which test is most suitable to check whether the data follow a skew normal distribution, with the ordinary normal as a special case?
Thank you very much for your kind assistance!
Christian
"No closed-form expression for the estimates": it just means that there does not exist an expression where you have the unknown parameter on the left-hand side, and a function of only the data on the right-hand side. Nothing unusual: we estimate skew normals using standard iterative maximum-likelihood, which is provided by all software packages that I know of.
Test for distributional assumption: Thankfully, the distinguishing characteristic of a skew normal is its skewness, which is not easily mistaken in the data. Just plotting an empirical frequency distribution, or a kernel density estimate, of your data will give you an idea of whether there is detectable skew in them. Also, if you implement iterative maximum likelihood and the algorithm does not converge, or you get a message like "Hessian close to singular", this indicates that the skew parameter may be very close to zero, which is where near-singularity emerges. But then this is also an indication that you may not need a skewed distribution after all.
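The visual check can be sketched roughly as follows, again assuming SciPy. The function `skew_summary` and both simulated samples are hypothetical and only serve as an illustration. It returns the sample skewness together with a kernel density estimate on a grid, which you can then plot with whatever plotting tool you prefer.

```python
import numpy as np
from scipy import stats


def skew_summary(data, grid_points=200):
    """Return the sample skewness and a kernel density estimate on a grid."""
    data = np.asarray(data, dtype=float)
    g1 = stats.skew(data)  # close to 0 for symmetric data
    kde = stats.gaussian_kde(data)
    grid = np.linspace(data.min(), data.max(), grid_points)
    density = kde(grid)  # estimated density at each grid point
    return g1, grid, density


# Two simulated samples for comparison (assumption: illustrative data only).
rng = np.random.default_rng(1)
symmetric = rng.normal(loc=5.0, scale=1.0, size=1000)
skewed = stats.skewnorm.rvs(a=5.0, loc=5.0, scale=1.0, size=1000, random_state=rng)

g_sym, _, _ = skew_summary(symmetric)
g_skw, grid, density = skew_summary(skewed)
print("sample skewness, symmetric sample:", g_sym)
print("sample skewness, skewed sample:", g_skw)
```

Plotting `density` against `grid` for your own ratings should make any asymmetry visible; a sample skewness near zero suggests the plain normal may be enough.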
Benefits and drawbacks: the skew normal is an impressively flexible distribution, given that it needs just three parameters. Over the last 25 years it has generated a flood of literature; it is very well studied, and many of its properties have been explored and derived, along with many variants and compositions. It inherits various convenient properties from the normal distribution while having a wider reach in terms of the data it can represent. The drawbacks I can think of are 1) the above-mentioned difficulty in reliably estimating the skew parameter when it is very close to zero, and 2) that the representable skewness is limited to the range $(-1,1)$ (more precisely, the skewness coefficient cannot exceed roughly $0.995$ in absolute value). So one could argue that the skew normal is not suitable for data really close to normal or, at the other extreme, for data that are too strongly skewed.
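To see the limited skewness range concretely, here is a small sketch that evaluates the standard formula for the skewness coefficient of a skew normal as a function of its shape parameter $\alpha$, using $\delta = \alpha/\sqrt{1+\alpha^2}$. The function name `skew_normal_skewness` is my own choice.

```python
import math


def skew_normal_skewness(alpha):
    """Skewness coefficient of a skew normal with shape parameter alpha."""
    delta = alpha / math.sqrt(1.0 + alpha**2)
    m = delta * math.sqrt(2.0 / math.pi)  # mean of the standardized variable
    return (4.0 - math.pi) / 2.0 * m**3 / (1.0 - m**2) ** 1.5


# The skewness grows with alpha but levels off below 1.
for alpha in (0.0, 1.0, 5.0, 50.0, 1e6):
    print(alpha, skew_normal_skewness(alpha))
```

However large you make $\alpha$, the coefficient stays below 1 in absolute value, so data with a sample skewness well above that cannot be matched by this family.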
To make up your mind, though, you should search the literature: the Wikipedia article is correct but very limited.