I'm timing how long various parts of a computer program take to run, with the intent of including the results in a research paper.
There are fluctuations in the timings from the usual sources of noise inside a computer (other processes running, scheduling, the timer not being 100% accurate, etc.).
I'd like to report a value for how long each section takes, but since the time for each section varies from run to run, I'm not really sure what I should report.
I could, for instance, time it over 10 runs and report the average, but maybe that isn't enough samples, or a strong outlier might move the average too much.
I was thinking maybe I could use the median, which is robust against large outliers, and seems appropriate for my use case, where random interference will only ever slow things down, never speed them up.
Still, even using the median, I'm not sure how many samples I ought to take or how I ought to report the values. Should it be a single value, or a range with some sort of confidence interval?
Thanks!
One suggestion: For each section, take multiple samples (at least 15, even 30 or more if you can afford it) and plot a histogram of the distribution of run times. With a histogram the reader can get an overall sense of how everything is distributed, and can gauge the value of a typical run time as well as the likely size of the extreme values.
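As a sketch of what that could look like in Python (the `work` function is a stand-in; substitute the section you're actually measuring), collect at least 30 timings with a high-resolution monotonic clock and render a simple text histogram:

```python
import time

def work():
    # Stand-in workload; replace with the section of your program being timed.
    sum(i * i for i in range(10_000))

def time_section(fn, n_samples=30):
    """Run fn n_samples times and return a list of elapsed times in seconds."""
    samples = []
    for _ in range(n_samples):
        start = time.perf_counter()  # monotonic, high-resolution timer
        fn()
        samples.append(time.perf_counter() - start)
    return samples

def ascii_histogram(samples, n_bins=10, width=40):
    """Print a crude text histogram of the sample distribution."""
    lo, hi = min(samples), max(samples)
    span = (hi - lo) or 1e-12          # guard against zero-width range
    counts = [0] * n_bins
    for s in samples:
        idx = min(int((s - lo) / span * n_bins), n_bins - 1)
        counts[idx] += 1
    peak = max(counts)
    for i, c in enumerate(counts):
        left = lo + i * span / n_bins
        bar = "#" * round(c / peak * width)
        print(f"{left * 1e3:8.3f} ms | {bar} ({c})")

samples = time_section(work)
ascii_histogram(samples)
```

For a paper you'd likely plot a proper histogram with a plotting library instead, but the idea is the same: show the whole distribution, not just one number.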
In addition you can report the (arithmetic) mean and standard deviation of those run times as numerical summaries of the data. If you're concerned about outliers, you can report the median instead of the mean, and report the 25th and 75th percentiles instead of the standard deviation. (Note that the arithmetic mean is affected by outliers; the median is not, since it cares only about the middle values in a distribution.)
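Both summaries are available in Python's standard library; here's a sketch using hypothetical run times (one deliberate outlier at 15.1 ms) to show how the outlier pulls the mean above the median:

```python
import statistics

# Hypothetical run times in seconds for one section; replace with measured data.
samples = [0.0102, 0.0101, 0.0105, 0.0099, 0.0103,
           0.0100, 0.0104, 0.0098, 0.0151, 0.0102]  # one outlier at 15.1 ms

mean = statistics.mean(samples)
stdev = statistics.stdev(samples)                 # sample standard deviation
median = statistics.median(samples)
q1, _, q3 = statistics.quantiles(samples, n=4)    # 25th, 50th, 75th percentiles

print(f"mean   = {mean * 1e3:.2f} ms +/- {stdev * 1e3:.2f} ms")
print(f"median = {median * 1e3:.2f} ms (IQR {q1 * 1e3:.2f}-{q3 * 1e3:.2f} ms)")
```

With this data the mean comes out noticeably higher than the median, which is exactly the outlier sensitivity described above.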