How to read a histogram?

I posted the following question over on StackOverflow, and I think I am having a hard time understanding exactly how a histogram works.

I have the following set of numbers between 0 and 9:

   [1] 1 0 1 4 0 0 7 3 5 3 8 9 1 3 3 1 2 0 7 5 8 6 2 0 2 3 6 9 9 7 8 9 4 9 2 1 3
  [38] 1 1 4 9 1 4 4 2 6 3 7 7 4 7 5 1 9 0 2 2 3 9 1 1 1 5 0 6 3 4 8 1 0 3 9 6 2
  [75] 6 4 7 1 4 1 5 4 8 9 2 9 9 8 9 6 3 6 4 6 2 9 1 2 0 5 9 2 7 7 2 8 8 5 0 6 0
 [112] 0 2 9 0 4 7 7 1 5 7 9 4 6 1 5 7 6 5 0 4 8 7 6 1 8 7 3 7 3 1 0 3 4 5 4 0 5
 [149] 4 0 3 5 1 0 8 3 7 0 9 6 6 9 5 4 6 9 3 5 4 2 4 8 7 7 5 8 8 8 2 6 9 3 1 0 4
 [186] 1 5 9 0 6 2 1 3 0 6 0 0 8 3 2 0 0 6 0 0 4 7 2 7 1 9 9 3 9 8 4 6 6 5 3 8 1
 [223] 8 7 1 3 7 6 3 6 3 6 3 2 3 2 2 7 9 2 3 2 7 5 5 8 8 2 0 1 4 0 6 3 7 1 1 1 4
 [260] 7 0 2 9 2 0 5 6 0 8 9 6 2 0 0 7 2 0 4 2 0 9 1 6 9 3 0 0 2 0 6 8 4 0 7 2 1
 [297] 9 5 2 4 8 5 2 9 7 9 2 9 7 4 9 3 2 7 3 6 3 6 8 8 3 7 0 9 2 7 9 0 5 4 5 8 4
 [334] 3 3 1 7 8 9 7 6 2 1 7 0 5 6 5 2 9 5 4 6 2 2 2 9 0 7 7 2 2 6 3 4 2 0 5 9 6
 [371] 2 1 9 0 6 0 4 8 4 3 1 5 4 2 9 5 7 3 1 5 4 5 3 7 3 8 6 2 4 6 1 1 4 0 0 5 8
 [408] 6 7 4 2 8 0 2 5 4 8 3 0 6 4 8 6 4 1 8 1 5 4 9 4 3 2 0 5 0 7 9 2 9 8 9 6 5
 [445] 2 4 4 6 4 8 4 1 7 5 8 9 5 9 3 2 5 8 2 2 7 2 8 4 1 9 3 6 0 2 2 9 1 2 7 2 1
 [482] 3 4 9 1 8 0 2 2 3 4 1 3 7 4 1 4 1 5 9 6 9 0 5 7 6 8 2 0 7 3 5 8 2 8 2 4 8
 [519] 5 8 9 7 1 2 4 5 5 1 8 1 4 4 6 5 8 9 2 3 0 5 1 4 0 5 1 2 9 2 4 1 6 8 0 4 9
 [556] 0 0 5 9 2 3 5 9 4 4 3 9 2 3 5 6 5 2 7 2 4 2 4 7 2 5 3 7 6 1 0 7 5 4 5 1 6
 [593] 9 7 1 6 3 3 1 2 2 0 5 0 6 8 3 6 7 7 3 8 1 7 9 3 9 2 8 3 7 4 1 2 3 6 5 0 1
 [630] 8 6 9 2 1 6 0 2 8 0 8 8 9 1 2 2 1 4 8 1 4 4 5 1 8 7 7 9 7 0 6 9 4 5 6 2 5
 [667] 7 4 7 2 3 0 8 4 8 0 0 9 7 7 9 8 2 1 6 5 5 1 1 9 7 7 8 6 4 7 5 3 1 6 4 5 7
 [704] 4 1 8 3 5 1 7 1 1 8 6 4 3 8 3 1 2 8 9 0 9 1 2 3 3 0 3 0 2 0 3 3 8 3 5 7 0
 [741] 5 9 0 5 9 1 5 1 1 2 6 5 5 4 5 1 6 0 2 2 8 0 7 1 0 8 5 6 3 2 9 4 3 6 0 3 4
 [778] 1 5 9 3 0 5 0 6 2 7 6 6 6 9 6 7 8 2 0 6 0 8 9 5 3 6 7 4 3 9 7 2 0 4 7 2 2
 [815] 8 2 7 0 4 0 5 2 8 7 7 9 1 4 0 1 1 2 3 6 2 0 6 6 1 9 4 5 2 7 7 8 9 5 8 3 8
 [852] 5 6 2 0 9 7 1 8 2 6 9 8 4 9 4 1 3 8 4 0 7 7 3 7 6 6 8 8 2 7 0 4 3 7 7 0 8
 [889] 4 7 4 0 6 9 8 6 0 1 6 4 5 2 7 3 6 2 2 9 2 7 4 8 7 2 9 5 3 4 8 0 4 4 6 5 6
 [926] 1 2 2 8 4 5 7 8 0 6 8 9 1 7 7 2 6 3 9 9 1 0 4 2 5 4 4 9 2 6 7 2 8 3 3 2 7
 [963] 0 4 7 0 7 7 8 1 7 3 7 8 0 1 0 2 9 7 6 2 2 6 9 0 6 8 8 9 6 3 5 0 2 2 5 9 6
[1000] 4

Which produces the following histogram:

[image: histogram of the numbers above]

From what I have heard, this is expected. I had assumed that a histogram would display how many times each number occurs in my set. But that does not seem to be right, because 10 distinct values (0 through 9) occur in my set and there are only 9 columns in my histogram. Can someone explain what each of those 9 columns represents?

Best answer:

Here are the counts of your data: (0, 107), (1, 96), (2, 124), (3, 90), (4, 102), (5, 89), (6, 97), (7, 105), (8, 93), (9, 97), where (0, 107) means that your data has 107 0's in it. It appears that whatever program you used to generate the histogram combined the counts of 0 and 1; notice that your leftmost bar has a height of just over 200 and that 107 + 96 = 203. The second bar corresponds to 2, the third bar to 3, etc.
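The arithmetic above can be checked with a short sketch. It takes the per-value counts quoted in this answer and merges them into nine bars, assuming (as the plot suggests, but the plotting program is not confirmed here) that the values 0 and 1 share the first bar and every other value gets its own. The name `bar_heights` is a hypothetical helper, not part of any library.

```python
# Per-value counts quoted in the answer: value -> number of occurrences.
counts = {0: 107, 1: 96, 2: 124, 3: 90, 4: 102,
          5: 89, 6: 97, 7: 105, 8: 93, 9: 97}


def bar_heights(counts):
    """Merge per-value counts into 9 bars.

    Assumption: the bins are [0,1], (1,2], ..., (8,9], so value v > 0
    lands in bar v - 1 and value 0 joins the first bar together with 1.
    """
    heights = [0] * 9
    for value, n in counts.items():
        index = max(value - 1, 0)  # 0 and 1 both map to bar 0
        heights[index] += n
    return heights


heights = bar_heights(counts)
print(sum(counts.values()))  # 1000, matching the size of the data set
print(heights)               # [203, 124, 90, 102, 89, 97, 105, 93, 97]
```

The first entry, 203, is the bar that is "just over 200" in the plot.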

Second answer:

A histogram is meant for continuous data. You are applying it to discrete data and then expecting to see the discrete structure. Use a bar plot of the counts instead.

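Because a histogram is for continuous data, it divides the x-axis into equal-sized intervals and counts the number of observations in each interval. Here the intervals are open on the left and closed on the right, except that the first interval is also closed on the left so that the minimum value is included. With break points at the integers, the intervals are $[0,1]$, $(1,2]$, $(2,3]$, $\dots$, $(8,9]$: nine intervals, the first of which contains both the 0s and the 1s.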
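To make the binning rule concrete, here is a minimal sketch in plain Python. It assumes right-closed intervals with the lowest break point included in the first bin, and it is not the plotting program's actual implementation. The function `bin_counts` and the two break lists are hypothetical names introduced for illustration; `sample` is the first twelve values of the data in the question.

```python
from bisect import bisect_left


def bin_counts(values, breaks):
    """Count values in bins (breaks[i], breaks[i+1]].

    The first bin also includes breaks[0], so the minimum is not lost.
    """
    counts = [0] * (len(breaks) - 1)
    for v in values:
        if v < breaks[0] or v > breaks[-1]:
            raise ValueError(f"{v} lies outside the break points")
        # bisect_left gives the first i with breaks[i] >= v,
        # so v belongs to the bin (breaks[i-1], breaks[i]].
        i = bisect_left(breaks, v)
        counts[max(i - 1, 0)] += 1
    return counts


sample = [1, 0, 1, 4, 0, 0, 7, 3, 5, 3, 8, 9]

integer_breaks = list(range(10))                 # 0, 1, ..., 9 -> 9 bins
half_breaks = [k - 0.5 for k in range(11)]       # -0.5, 0.5, ..., 9.5 -> 10 bins

print(bin_counts(sample, integer_breaks))  # [5, 0, 2, 1, 1, 0, 1, 1, 1]
print(bin_counts(sample, half_breaks))     # [3, 2, 0, 2, 1, 1, 0, 1, 1, 1]
```

With integer break points the three 0s and two 1s share the first bin. Moving the break points to the half-integers gives each digit its own bin, which is the same set of counts a bar plot would show.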