Estimating how many jobs run concurrently, given an inter-arrival distribution?


In our system, jobs are queued with inter-arrival times that follow a distribution we determined by analyzing empirical data. A sample is below:

| Time between consecutive jobs (s) | How many times this happened |
|---|---|
| 0 | 257 |
| 1 | 374 |
| 2 | 127 |
| 3 | 85 |
| 4 | 73 |
| 5 | 66 |
| 6 | 65 |
| 7 | 63 |
| 8 | 73 |
| 9 | 52 |
| 10 | 60 |
| ... | ... |

This is just an excerpt: the time between consecutive jobs extends into the millions of seconds, i.e., weeks between jobs!

If we know how long a job takes — say, always 600 seconds within a narrow margin — is the entire data set, including all data points, enough to estimate how many jobs we should expect to be running concurrently? Ideally with a standard deviation?
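To make the question concrete, here is a rough Monte Carlo sketch of what I have in mind: sample inter-arrival gaps from the empirical distribution (only the excerpt above is used here; the real data set has far longer gaps, which would lower the result), assume a fixed 600 s job duration, and count how many jobs are still running at each arrival. The `simulate` function and its parameters are my own illustration, not anything from our production system.

```python
import random
from collections import deque
from statistics import mean, stdev

# Empirical distribution: inter-arrival time (s) -> observed count.
# This is only the excerpt from the question; the full data set
# includes gaps up to millions of seconds.
gaps = {0: 257, 1: 374, 2: 127, 3: 85, 4: 73, 5: 66,
        6: 65, 7: 63, 8: 73, 9: 52, 10: 60}

JOB_DURATION = 600  # seconds; assumed roughly constant

def simulate(n_jobs=100_000, seed=42):
    rng = random.Random(seed)
    times = list(gaps.keys())
    weights = list(gaps.values())

    # Build arrival times by accumulating sampled inter-arrival gaps.
    t = 0
    arrivals = []
    for _ in range(n_jobs):
        t += rng.choices(times, weights=weights)[0]
        arrivals.append(t)

    # Finish times are monotonically increasing (fixed duration),
    # so a deque of finish times stays sorted.
    running = deque()
    concurrency = []
    for a in arrivals:
        running.append(a + JOB_DURATION)
        while running[0] <= a:   # drop jobs that finished before this arrival
            running.popleft()
        concurrency.append(len(running))
    return mean(concurrency), stdev(concurrency)

m, s = simulate()
print(f"mean concurrency ~ {m:.1f}, sd ~ {s:.1f}")
```

For a sanity check: Little's law (L = λW) predicts roughly (600 s) / (mean gap of the excerpt, about 3.1 s) ≈ 195 concurrent jobs, and the simulation should land in that neighborhood. My worry is whether the long tail in the full data invalidates this kind of estimate, hence the question.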