How do they calculate the following columns?


I have these data:

enter image description here

I am sorry that the data is in Portuguese and that it is an image, so I can't convert it to a table. My translation, which is "probably" right (I am not a native Portuguese speaker), is:

  1. The first column is the minute in which the cars entered my garage.
  2. The second column is the distinct minutes.
  3. The third column is the distinct minutes multiplied by the number of cars.

My question

How do they calculate the fourth and fifth columns?


There are 3 best solutions below

0
On BEST ANSWER

Without further (one might say detailed) explanation, one is entitled to expect the first graph to be a relative frequency histogram (or bar chart) and the second to be an empirical cumulative distribution function (based on cumulative relative frequencies). I understand the rationales in some of the other answers, but they do not correspond to any standard statistical graphs. Notice that the number of entering cars per minute is given, but the number of minutes is not, so we have no information about the actual numbers of cars. We also do not know whether monitoring was continuous or occasional. In addition, the spacing along the horizontal axis should certainly reflect reality, as noted by @Henry.

Here are graphs I made using R, which are standard statistical graphs for discrete data and which will probably be less mysterious to the general public than those shown in the question. (The method of construction should be clear; the R code is provided for completeness.) The mean of the 11 observations is 15.45, as noted.

However, even with these graphs, the goal of this study of entry rates of cars into a garage remains elusive. Is the concern about a traffic jam at the entrance? Or to show that the entrance rate is sporadic? Maybe showing entry rates at different times of day would be more interesting. Are these data important in their own right, or have they been dragged without purpose into a 'demonstration' on making tables and graphs?

enter image description here

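 # f: how many minutes showed each value; rf, crf: relative and cumulative relative frequencies
 f = c(1,1,2,5,2);  rf = f/sum(f);  crf = cumsum(rf)
 # v: the distinct values (cars entering per minute)
 v = c(10,11, 12, 17, 20)

 par(mfrow=c(1,2)) # 2 panels per page
 # Left panel: relative frequencies drawn as vertical bars at the true values of v
 plot(v, rf, ylim=c(0, max(rf)), lwd=5, type="h",  xlim=c(8,22),
      ylab="Relative Frequency", xlab="Cars/min.", main="")
   abline(h = 0, col="darkgreen")
 # Right panel: empirical CDF drawn as a step function from 0 up to 1
 plot (c(8,v, 22), c(0,crf, 1), ylim=c(0, 1), lwd = 2, type="s",
      ylab="Cumulative RF", xlab = "Cars/min.", main="")
   abline(h=0:1, col="darkgreen")
   points(v, crf, pch=19)
 par(mfrow=c(1,1))  # return to default single panel plotting

 sum(v*rf)  # mean of the observations
 ## 15.45455
 sum(f)     # number of observations
 ## 11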
0
On

I would say:

fourth column = third column / 170 * 100 (e.g. 2*20 / 170 * 100 = 23.53)

fifth column:

5.88 + 0 = 5.88

5.88 + 6.47 = 12.35

12.35 + 14.12 = 26.47

etc.
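As a rough check of this reading, here is a minimal R sketch. It assumes the distinct values and frequencies used in the R code of the best answer (`v` and `f`); the names `contrib`, `pct` and `cum_pct` are made up for illustration and do not come from the original table.

```r
# Distinct cars-per-minute values and how many minutes showed each one
# (assumed to match the v and f vectors in the best answer's R code).
v <- c(10, 11, 12, 17, 20)
f <- c(1, 1, 2, 5, 2)

contrib <- v * f                          # third column: cars contributed by each value
pct     <- 100 * contrib / sum(contrib)   # fourth column: share of the 170 cars, in percent
cum_pct <- cumsum(pct)                    # fifth column: running total of the fourth column

print(round(data.frame(v, contrib, pct, cum_pct), 2))
```

Under these assumptions the first three entries of `cum_pct` round to 5.88, 12.35 and 26.47, which agrees with the sums above.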

0
On

This is what I make of the columns:

  1. The second column is just a listing (in ascending order) of all distinct values in the first column.

  2. The third tells you how often each distinct value occurs and how many cars this value contributes to the total. For example, by column 1 there are 2 minutes in which $12$ cars enter the garage/shop, i.e. the contribution of the value $12$ to the total number of cars is $2 \cdot 12 = 24$.

  3. The fourth column shows the relative contribution (i.e. percentage) of a certain value in the first column to the total number of cars. For the value $10$ it is $10/170 \approx 0.058824$, for $17$ it is $5 \cdot 17/170 = 0.5$ and so on.

  4. The fifth column is the cumulative percentage, i.e. adding up all the percentages up to a certain value. For the value $17$ you add the percentages of $10,11,12,17$.
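Read this way, the four items can be summarised in one formula. The symbols are introduced here only for illustration: $v_i$ are the distinct values in ascending order, $f_i$ is the number of minutes in which $v_i$ occurs, $p_i$ is the fourth column and $P_k$ the fifth.

$$p_i = \frac{f_i v_i}{\sum_j f_j v_j}, \qquad P_k = \sum_{i \le k} p_i .$$

With $\sum_j f_j v_j = 170$, the value $17$ gives $p = 5 \cdot 17/170 = 0.5$ and $P = (10 + 11 + 24 + 85)/170 = 130/170 \approx 0.7647$.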