Given a picture of a bar chart with no other data (no means or standard deviations), if one bar is higher than the rest, can you prove that the difference is statistically significant?
What kind of test would you perform?
Bar charts that give only proportions or percentages for various categories are of almost no value in deciding whether categories are equally likely. One must know the counts in each category (or be able to reconstruct counts from the total sample size) in order to test whether categories are equally likely.
'Error bars' on the bars might sometimes be useful, but there seems to be no standard way to make them, much less consistently useful guidance on how to interpret them. Nothing is better than for a bar plot to show counts along with the total sample size. Then rely on a formal statistical test to assess equality of categories.
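To make the point about error bars concrete, here is a Python sketch of one possible convention, an approximate 95% Wald interval for a single category's proportion. This is only one of several ways such bars get drawn, and it is an assumption of this sketch rather than a recommendation; the counts are made up and `proportion_interval` is a hypothetical helper name.

```python
import math


def proportion_interval(count, n, z=1.96):
    """Approximate 95% Wald interval for one category's proportion."""
    p = count / n
    half_width = z * math.sqrt(p * (1 - p) / n)
    return max(0.0, p - half_width), min(1.0, p + half_width)


# The same observed proportion (28%) at two assumed sample sizes.
small = proportion_interval(14, 50)
large = proportion_interval(168, 600)
print("n = 50: ", tuple(round(v, 3) for v in small))
print("n = 600:", tuple(round(v, 3) for v in large))
```

The interval is far wider at the smaller sample size, which is why a bar's height alone says little. Even so, such per-bar intervals are not a substitute for a single formal test across all categories.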
Examples: Suppose I am rolling a die to see if it is fair. After some exploratory rolls I get percentages for each face as follows, and make a bar chart showing the counts.
The bar for 5's stands out as being higher than the rest. However, this is not necessarily 'proof' of an unfair die. Suppose the percentages are based on counts from only $n=50$ rolls of the die.
A test for equal proportions in R gives a P-value of $0.37 > 0.05 = 5\%,$ which is far from statistically significant. It would be wrong to declare the die unfair.
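For readers who want to try a test of this kind, here is a minimal Python sketch of a chi-squared goodness-of-fit test against equally likely faces. This is a related but different procedure from the R call used in this answer, so it is not meant to reproduce the P-value quoted above. The counts are hypothetical, both function names are made up for illustration, and the tail formula is a closed form that holds only for 5 degrees of freedom (six categories).

```python
import math


def chi_square_gof_equal(counts):
    """Chi-squared goodness-of-fit statistic against equally likely categories."""
    n = sum(counts)
    expected = n / len(counts)
    return sum((c - expected) ** 2 / expected for c in counts)


def chi2_sf_df5(x):
    """Upper-tail probability of a chi-squared variable with exactly 5 degrees of freedom."""
    return (math.erfc(math.sqrt(x / 2))
            + math.sqrt(2 * x / math.pi) * math.exp(-x / 2) * (1 + x / 3))


# Hypothetical counts for faces 1..6 from 50 rolls (not the data from the post).
counts_50 = [8, 7, 6, 9, 14, 6]
stat = chi_square_gof_equal(counts_50)
p_value = chi2_sf_df5(stat)
print(round(stat, 2), round(p_value, 3))
```

For these made-up counts the statistic is 5.44 on 5 degrees of freedom and the P-value is well above 0.05, even though face 5 came up more than twice as often as two of the other faces.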
By contrast, suppose I roll the die $n=600$ times and get the following counts.
Percentages rounded to one decimal place are as shown below,
A bar chart of frequencies may look somewhat similar to the one above in its 'irregularities'.
However, with 600 rolls we have a lot more information. For these counts, `prop.test` gives a P-value near $0,$ indicating that it would be almost impossible for a truly fair die to give such counts.

Note: Here is the R code for the 50 rolls of a fair die and the 600 rolls of an unfair die used as examples above:
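The original R listing is not reproduced here. As a rough stand-in, the Python sketch below simulates 50 rolls of a fair die and 600 rolls of a biased one, then compares a chi-squared statistic with the 5% critical value for 5 degrees of freedom (about 11.07). The seed, the bias weights (face 5 twice as likely as each other face), and the function names are all assumptions of this sketch, not values from the post.

```python
import random
from collections import Counter

CRITICAL_5PCT_DF5 = 11.07  # approximate 0.95 quantile of chi-squared with 5 df


def roll_die(n, weights, rng):
    """Simulate n rolls of a six-sided die and return the counts for faces 1..6."""
    rolls = rng.choices(range(1, 7), weights=weights, k=n)
    tally = Counter(rolls)
    return [tally[face] for face in range(1, 7)]


def chi_square_stat(counts):
    """Chi-squared goodness-of-fit statistic against equally likely faces."""
    expected = sum(counts) / len(counts)
    return sum((c - expected) ** 2 / expected for c in counts)


rng = random.Random(2021)  # arbitrary seed, only for repeatability
fair_counts = roll_die(50, [1, 1, 1, 1, 1, 1], rng)
biased_counts = roll_die(600, [1, 1, 1, 1, 2, 1], rng)  # assumed bias toward face 5

for label, counts in [("fair, n = 50", fair_counts), ("biased, n = 600", biased_counts)]:
    stat = chi_square_stat(counts)
    verdict = "reject fairness" if stat > CRITICAL_5PCT_DF5 else "no evidence against fairness"
    print(label, counts, round(stat, 2), verdict)
```

Rerunning with other seeds shows how irregular the 50-roll counts of a fair die can look, which is the reason a tall bar by itself proves nothing.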