Skewness in a graph of data


It seems counter-intuitive to me that a graph that is skewed right actually has more data values on the left side (and vice versa), like in this image:

enter image description here

I don't get how the term "skewed right" appropriately fits such a graph, because clearly it should be considered biased towards the left due to the higher number of data points on that side, and hence skewed towards the left instead.

Why is such a graph instead considered to be skewed towards the right? In other words, what is being skewed towards the right to give it this name?


There are 2 best solutions below

0
On BEST ANSWER

Intuitively, 'skewness' is defined so that it points in the direction of the extreme tail, if there is one. (The technical reason is mentioned in the second comment and, as of just this minute, in the Answer by @muaddib.) If the distribution has a hump or 'mode', that hump usually lies in the opposite direction from the skewness.

Statistics often uses ordinary English words and gives them special technical meanings. Whenever you see a word in bold or colored type in a statistics book, you should take care to notice its exact technical meaning.

In plain English 'skewed' has meanings like 'turned or viewed from the side.' In geometry 'skewed' has a different technical definition than in plain English and totally different from the one in statistics.

In your question, you used the word 'bias' in its ordinary English sense. In statistics 'bias' also has a special technical meaning that you will probably learn later.

A couple of other English words, among many, with special technical meanings in statistics are 'correlated' and 'independent'.

1
On

It's a common mistake to assume that the median is always to the left of the mean in a right-skewed distribution. This page amstat.org/publications/jse/v13n2/vonhippel.html mentions counterexamples, including the Poisson distribution for small $\lambda$.
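As a rough numerical check of the Poisson counterexample, here is a minimal sketch in plain Python. The value $\lambda = 0.75$, the truncation point `kmax`, and the helper names are my own choices for illustration and are not taken from the linked page.

```python
import math

def poisson_pmf(k, lam):
    """Probability that a Poisson(lam) variable equals k."""
    return math.exp(-lam) * lam**k / math.factorial(k)

def poisson_summary(lam, kmax=60):
    """Return (mean, median, third central moment), truncating the support at kmax."""
    ks = range(kmax + 1)
    pmf = [poisson_pmf(k, lam) for k in ks]
    mean = sum(k * p for k, p in zip(ks, pmf))
    third = sum((k - mean) ** 3 * p for k, p in zip(ks, pmf))
    # Median: smallest k whose cumulative probability reaches 0.5.
    cumulative = 0.0
    median = None
    for k, p in zip(ks, pmf):
        cumulative += p
        if cumulative >= 0.5:
            median = k
            break
    return mean, median, third

mean, median, third = poisson_summary(0.75)  # lam = 0.75 is an assumed example value
print("mean:", mean)
print("median:", median)
print("third central moment:", third)
```

With this choice the third central moment is positive (skewed right), yet the median comes out as 1, which is to the right of the mean of about 0.75.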

The definition involves the quantity $E[(X-\mu)^3]$, the third central moment. If that value is positive, it is saying something like: the mass that lies far from the mean is on the RHS. Cubing the deviation $X-\mu$ emphasizes larger distances while keeping their sign. In your picture above, note the fat tail on the RHS. On the LHS most of the mass is concentrated at small values to the left of the mean.
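To see how the cube lets a few far-out points outweigh many nearby ones, here is a small sketch with a made-up five-point data set (the numbers and the function name are hypothetical, chosen only so the arithmetic is easy to follow).

```python
def third_central_moment(values):
    """Average of the cubed deviations from the mean."""
    n = len(values)
    mu = sum(values) / n
    return sum((x - mu) ** 3 for x in values) / n

# Made-up data: four points left of the mean, one point far to the right.
data = [1, 2, 2, 3, 12]
mu = sum(data) / len(data)
cubed = [(x - mu) ** 3 for x in data]

print(mu)                          # 4.0
print(cubed)                       # [-27.0, -8.0, -8.0, -1.0, 512.0]
print(third_central_moment(data))  # 93.6
```

Four of the five points sit to the left of the mean, but their cubed deviations are small. The single point in the right tail contributes 512, so the total is positive and the data count as skewed right.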

It might be helpful to ask: If $X$ is discrete, with several outcomes, and you hold the mean fixed, what can you do to the locations/sizes of the outcomes to increase the skew?
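One way to experiment with this question is a two-outcome toy distribution, sketched below. The construction is my own assumption, not part of the original answer: $X = -1$ with probability $t/(1+t)$ and $X = t$ with probability $1/(1+t)$, which keeps the mean at 0 for every $t > 0$.

```python
def two_point_moments(t):
    """Mean, variance, and third central moment of a toy two-outcome distribution.

    Hypothetical construction: X = -1 with probability t/(1+t),
    X = t with probability 1/(1+t). The mean is 0 for any t > 0.
    """
    outcomes = [(-1.0, t / (1 + t)), (float(t), 1 / (1 + t))]
    mean = sum(x * p for x, p in outcomes)
    variance = sum((x - mean) ** 2 * p for x, p in outcomes)
    third = sum((x - mean) ** 3 * p for x, p in outcomes)
    return mean, variance, third

for t in (1, 2, 5, 10):
    mean, variance, third = two_point_moments(t)
    skewness = third / variance ** 1.5  # standardized skewness
    print(t, round(mean, 12), round(third, 6), round(skewness, 6))
```

As $t$ grows, the right-hand outcome moves farther out and becomes rarer while the mean stays at 0. In this toy case both the third central moment and the standardized skewness increase with $t$.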