How to calculate average rate from a dataset

91 Views Asked by At

I have a large dataset of "trips", each row is a different trip. I want to know the average number of miles per day for a group of trips.

Example data:

Trip 1: 500 miles in 30 days (16.7mi/day)

Trip 2: 200 miles in 10 days (20mi/day)

Trip 3: 30 miles in 3 days (10mi/day)

Trip 4: 50 miles in 6 days (8.3mi/day)

I calculated the average 2 different ways:

  1. Created a calculated column "miles per day" for each trip, shown in the example in parentheses. Then calculated AVG(miles per day) -> (16.67 + 20 + 10 + 8.3)/4 = 13.8mi/day
  2. (SUM(miles))/(SUM(days)) -> (500+200+30+50miles) / (30+10+3+6days) = 15.9mi/day

With the dataset I'm using, the 2 methods always give me different results. I am wondering if anyone can explain which method would be more reflective of the "true" average miles per day for a group of trips, and why that method is preferred. (Note: for the purpose of this dataset, it is possible that some trips had a negative "miles per day value").

1

There are 1 best solutions below

0
On

Imagine another scenario... you have two bags. In the first bag, you have a single \$100 bill as well as a single \$1 bill. In the second bag you have a hundred \$1 bills. You could ask the question... "if I were to give you a bag at random, and then you pull out a bill at random from that bag... what is the expected amount of money that you would pull out?" That would be answered like your first approach here.

You could also ask "if I were to combine the bags, and then you pull out a bill at random from the combined bags... what is the expected amount of money you would pull out?" That would be answered like your second approach here.

Both questions are fine questions to ask... and the questions have different answers and the answers are calculated differently.

The fundamental difference here is again whether you want the trips to have equal weight despite being different lengths, or if you want each individual day to have equal weight.