I am writing a paper for a Computer Science conference, and I have a big (way too big) table of results (times and some other measures) for different versions of an algorithm. I would like to summarize it using some kind of average, but the arithmetic average does not seem to represent the results well: because of the combinatorial nature of the problem I'm solving, times are tiny for some benchmarks (0.01s) and huge for others (14000s).
Is it correct to use the geometric mean in this case?
Without seeing your data, it seems to me they may be lognormal (that is, they would be normally distributed if you took their logs).
You might try taking logs of a few of your samples and making histograms to see whether you get something like a normal shape. A sample of several hundred is better than several dozen, because it takes a fairly large sample from a normal distribution to give a relatively smooth, 'normal-looking' histogram.
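A minimal sketch of that check, using simulated lognormal run times as a stand-in for your real measurements (the `times` array and its parameters are assumptions for illustration, not your data):

```python
import numpy as np
from scipy import stats

# Simulated stand-in for benchmark times: lognormal, spanning several
# orders of magnitude, as described in the Question.
rng = np.random.default_rng(0)
times = rng.lognormal(mean=2.0, sigma=1.5, size=500)

logs = np.log(times)

# Raw times are heavily right-skewed; their logs should be roughly symmetric.
print("skewness of raw times:   ", stats.skew(times))
print("skewness of logged times:", stats.skew(logs))

# D'Agostino-Pearson normality test on the logs: a large p-value means
# no evidence against normality of the logged data.
stat, p = stats.normaltest(logs)
print("normality-test p-value on logs:", p)
```

In practice you would also plot `np.histogram(logs)` (or a histogram in matplotlib) and eyeball the shape, as suggested above; the formal test is just a supplement to the picture.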
If you take logs of your data, then take the arithmetic mean, and then exponentiate that (raise $e$ to that power), you get the geometric mean of the original un-logged data. (If that is not obvious, take a look at the intro to the Wikipedia article on 'geometric mean'.)
If arithmetic means of logged data seem useful summaries, you might consider that as a method of presentation. I don't know the mathematical level of your audience, but it is usually an easier 'sell' to say you took logs of data and then do arithmetic means than to try to explain geometric means.
The Richter scale, and other earthquake magnitude scales, are already logarithmic. Economists often use logged data, etc. So taking logs of data does not seem inherently strange to people with even a little experience in a social or physical science. By contrast, I have gotten a lot of strange looks at the very mention of geometric means. Maybe your audience will take them in stride.
It would be nice to have a rationale other than whim or intuition for whatever kind of mean you use. The basic point is that if your data are roughly lognormal, then that is already a good argument for using geometric means of unlogged data--or for taking logs and using arithmetic means.
Based on what I IMAGINE your data look like (from your description), I would not urge you to summarize them using medians. Granted, that's a hunch, but not one without a bit of experience with real data behind it.
Addendum: Prompted by the OP's comment, here are 'typical' histograms one might get from samples of sizes 50, 100, 500, and 1000 from a normal population (perhaps obtained by logging data of the type described in the Question). The message is that histograms tend to fit the normal density curve of the population more closely as the sample size increases. (All samples are simulated from a normal distribution with mean 100 and standard deviation 15. So, as here, almost all observations should lie within 100 $\pm$ 3(15); that is, between 55 and 145.)
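The simulation behind those histograms can be sketched as follows (seed and formatting are my choices; the population parameters are the ones stated above):

```python
import numpy as np

rng = np.random.default_rng(1)

for n in (50, 100, 500, 1000):
    # Sample of size n from Normal(mean=100, sd=15), as in the addendum.
    x = rng.normal(loc=100, scale=15, size=n)

    # Fraction of observations inside 100 +/- 3(15) = (55, 145);
    # for a normal population this should be about 99.7%.
    inside = np.mean((x > 55) & (x < 145))

    print(f"n={n:4d}  mean={x.mean():6.1f}  sd={x.std(ddof=1):5.1f}  "
          f"within (55, 145): {inside:.3f}")
```

To reproduce the pictures themselves, pass each `x` to `matplotlib.pyplot.hist` with `density=True` and overlay the normal density curve; larger `n` gives a visibly smoother fit.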