Comparing fuel efficiency using R


I am given two data sets.

Small Cars

3 | 3,3,3,3,4,4,4,4,4,4,4,5,6,7,7,7 
4 | 5
5 | 0

Sporty Cars

2 | 9,9,9,9,9,9,9
3 | 0,0,0,0,0,0,0,1,1,1,1,1,2,2,2,2,3

I am then asked to find the mean, but I am not sure how to do that in R. I have gotten as far as putting the values into a table with the columns MPG, Small, and Sporty, but I am not sure what to do next. I am new to R, which makes this a bit harder.


There are 3 best solutions below

4
On

You can use the mean() function on a numeric vector to return the mean. If your data table is called "df", and the column you want to find the mean for is called "MPG", then you can use mean(df[["MPG"]]) to get the mean for the column MPG.
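
One caveat, in case your table holds counts rather than one row per car: mean(df[["MPG"]]) would then average the distinct MPG values and ignore how many cars have each value. Below is a minimal sketch of one way around that. It assumes a hypothetical data frame df in which Small is the number of small cars at each MPG value; the counts are read off the stem-and-leaf plot in the question, and the layout of your actual table may differ.

```r
# Hypothetical frequency table (an assumption about how the table is laid out):
# one row per distinct MPG value, Small = number of small cars with that MPG.
df <- data.frame(
  MPG   = c(33, 34, 35, 36, 37, 45, 50),
  Small = c( 4,  7,  1,  1,  3,  1,  1)
)

# weighted.mean() weights each MPG value by the number of cars that have it.
small_mean <- weighted.mean(df[["MPG"]], w = df[["Small"]])

# Equivalent route: expand the table back to one value per car,
# then take the ordinary mean of the expanded vector.
small_expanded   <- rep(df[["MPG"]], times = df[["Small"]])
small_mean_check <- mean(small_expanded)

print(small_mean)
print(length(small_expanded))
```

Both routes should give the same number; the rep() version also leaves you with a plain vector that you can pass to summary() or boxplot().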

0
On

An example to help you get started.

data(mtcars) # load a built-in data set; mtcars is the name of the data frame

mean(mtcars$mpg) # compute the mean of the mpg column of the data frame mtcars

0
On

Very elementary, very short lesson in R:

At the prompt (>) in the R session window, type what I have below. Do the same for 'Sporty' cars. Note the c at the beginning of the data, the commas after each observation, and the parentheses. You could make a data frame as shown in another answer. That is a good thing to learn, especially if you are given a data file with the observations. But for printed (rather than digital) data, you have to type the observations anyway, so what I'm showing you is easier.

> Small = c(50, 45, 37, 37, 37, 36, 35, 34, 34, 34, 34, 34, 34, 34, 33, 33, 33, 33)
> mean(Small)
[1] 35.94444
> summary(Small)
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
  33.00   34.00   34.00   35.94   36.75   50.00 
> sd(Small)
[1] 4.504537
> boxplot(Small, col="skyblue", pch=19)

Notice that the median and the lower quartile are the same, so the median 'cross bar' coincides with the lower end of the 'box' in the boxplot. (Try boxplot(Small), leaving out the parameters 'col' and 'pch'; you'll get a slightly less fancy boxplot.)
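
If you want to check the quartiles directly rather than read them off summary(), here is a small sketch using quantile() with its default settings. The name q is just mine for illustration.

```r
Small <- c(50, 45, 37, 37, 37, 36, 35, 34, 34, 34, 34, 34, 34, 34, 33, 33, 33, 33)

# quantile() with default arguments returns the 0%, 25%, 50%, 75% and 100% points.
q <- quantile(Small)
print(q)

# The lower quartile and the median are the same value for these data,
# which is why the median bar sits on the bottom edge of the box.
print(q[["25%"]] == q[["50%"]])
```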


R is a very useful (and free!) statistical software package. You will go crazy if you try to learn it all at once. Take one small step, like this one, at a time and you will be happy using R.


More (when ready):

> Sporty = c(33,32,32,32,32,31,31,31,31,31,31,30,30,30,30,30,30,30,29,29,29,29,29,29,29)
> mean(Sporty)
[1] 30.4
> All = c(Small, Sporty)
> mean(All)
[1] 32.72093
> length(Small)
[1] 18
> length(Sporty)
[1] 25
> Type = c(rep(1, 18), rep(2,25))
> boxplot(All ~ Type, col="wheat", pch=19)
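
The numeric codes 1 and 2 work, but the boxplot is then labelled "1" and "2". As a possible variant (a sketch only; the data frame name cars and the column names MPG and Type are my own choices), you can put the values in a data frame with a labelled factor, so the plot and the group means carry the names of the car types:

```r
Small  <- c(50, 45, 37, 37, 37, 36, 35, 34, 34, 34, 34, 34, 34, 34, 33, 33, 33, 33)
Sporty <- c(33, 32, 32, 32, 32, 31, 31, 31, 31, 31, 31, 30, 30, 30, 30, 30, 30, 30,
            29, 29, 29, 29, 29, 29, 29)

# One row per car: the MPG value plus a factor saying which type of car it is.
cars <- data.frame(
  MPG  = c(Small, Sporty),
  Type = factor(rep(c("Small", "Sporty"),
                    times = c(length(Small), length(Sporty))))
)

# Mean MPG within each type.
group_means <- tapply(cars$MPG, cars$Type, mean)
print(group_means)

# Same side-by-side boxplot as before, now labelled Small and Sporty.
boxplot(MPG ~ Type, data = cars, col = "wheat", pch = 19)
```

Using length(Small) and length(Sporty) instead of typing 18 and 25 means the labels stay correct if you later add or remove observations.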


> t.test(Small, Sporty)

        Welch Two Sample t-test

data:  Small and Sporty 
t = 5.0956, df = 18.719, p-value = 6.717e-05
alternative hypothesis: true difference in means is not equal to 0 
95 percent confidence interval:
 3.264730 7.824159 
sample estimates:
mean of x mean of y 
 35.94444  30.40000 

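Because the P-value is very small (about 0.00007), we conclude that the mean MPGs are significantly different for the two types of cars. The 95% confidence interval suggests that small cars average roughly 3.3 to 7.8 MPG more than sporty cars.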
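
In case you wonder where the t statistic and the fractional degrees of freedom come from, here is a sketch that recomputes them by hand with the usual Welch formulas and compares them with what t.test() returns. The names v_small, v_sporty, t_manual and df_manual are mine, chosen for illustration.

```r
Small  <- c(50, 45, 37, 37, 37, 36, 35, 34, 34, 34, 34, 34, 34, 34, 33, 33, 33, 33)
Sporty <- c(33, 32, 32, 32, 32, 31, 31, 31, 31, 31, 31, 30, 30, 30, 30, 30, 30, 30,
            29, 29, 29, 29, 29, 29, 29)

# Estimated variance of each sample mean: s^2 / n.
v_small  <- var(Small)  / length(Small)
v_sporty <- var(Sporty) / length(Sporty)

# Welch t statistic: difference in means divided by its standard error.
t_manual <- (mean(Small) - mean(Sporty)) / sqrt(v_small + v_sporty)

# Welch-Satterthwaite approximation to the degrees of freedom.
df_manual <- (v_small + v_sporty)^2 /
  (v_small^2 / (length(Small) - 1) + v_sporty^2 / (length(Sporty) - 1))

# t.test() returns a list; its pieces can be pulled out by name.
res <- t.test(Small, Sporty)
print(c(manual = t_manual, t.test = unname(res$statistic)))
print(c(manual = df_manual, t.test = unname(res$parameter)))
print(res$p.value)
```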

> stripchart(All ~ Type, method="stack", pch=20)
