I have a program which I originally thought was linear, meaning that if I had twice the data, the program would take twice as long to process.
I was wrong, and it seems to be somewhat exponential.
Here are my results:
300 words (1 folder) = 0.9 sec
600 words (2 folders) = 2.1 sec
900 words (3 folders) = 3.8 sec
1200 words (4 folders) = 6.6 sec
1500 words (5 folders) = 9.7 sec
3000 words (10 folders) = 37.5 sec
I have all my words in a folder, and I tested the program with multiple copies of the same folder. Keep in mind this is a real world test, so the numbers may not be exactly as they "should be."
I would like to find a formula that estimates how long the program will take based on how many words the user inputs.
My estimate for 30,000 words (100 folders) was 1407 sec, which I arrived at by dividing my time for 3,000 words (37.5 sec) by my time for 300 words (.9 sec), which gave 41.7. I then multiplied 41.7 by 37.5, and divided the result by .9, which gave me 1407.
Is this estimate reasonable?
Considering your data $$\left( \begin{array}{cc} 300 & 0.9 \\ 600 & 2.1 \\ 900 & 3.8 \\ 1200 & 6.6 \\ 1500 & 9.7 \\ 3000 & 37.5 \end{array} \right)$$
as you noticed from a plot, they are nonlinear and look either quadratic or exponential.
Assuming that the time is $0$ for $0$ words, we can fit the model $$t=a w + b w^2$$ (linear regression) or $$t=a(e^{bw}-1)$$ (nonlinear regression).
For the first case, we get $$t=3.900822 \times 10^{-6} w^2+0.000781384 w$$ to which corresponds an $R^2=0.999859$.
For the second case, we get $$t=4.49706 \left(e^{0.000745006 w}-1\right)$$ to which corresponds an $R^2=0.999553$.
Both fits are excellent for the covered range. But the problem now is to extrapolate for values of $w$ much larger than the upper value of $3000$. For $w=30000$, the first model would give $t=3534.18$ while the second would give $t=2.28817\times 10^{10}$ (!!) which does not seem to be very realistic.
So, stay with the first model.
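The quadratic fit can be reproduced with a few lines of plain Python. This is only a sketch: it assumes ordinary (unweighted) least squares with no constant term, solved through the $2\times 2$ normal equations, and the function name `fit_quadratic_through_origin` is made up for illustration.

```python
# Measurements from the question: (words, seconds).
data = [(300, 0.9), (600, 2.1), (900, 3.8),
        (1200, 6.6), (1500, 9.7), (3000, 37.5)]

def fit_quadratic_through_origin(points):
    """Least-squares fit of t = a*w + b*w**2 (no constant term).

    Solves the normal equations
        s2*a + s3*b = s1t
        s3*a + s4*b = s2t
    where sk = sum(w**k), s1t = sum(w*t) and s2t = sum(w**2 * t).
    """
    s2 = sum(w**2 for w, _ in points)
    s3 = sum(w**3 for w, _ in points)
    s4 = sum(w**4 for w, _ in points)
    s1t = sum(w * t for w, t in points)
    s2t = sum(w**2 * t for w, t in points)
    det = s2 * s4 - s3 * s3          # determinant of the 2x2 system
    a = (s1t * s4 - s3 * s2t) / det  # Cramer's rule
    b = (s2 * s2t - s3 * s1t) / det
    return a, b

a, b = fit_quadratic_through_origin(data)

def predict(w):
    """Estimated run time in seconds for w words."""
    return a * w + b * w**2

print(f"a = {a:.9f}, b = {b:.9e}")
print(f"estimate for 30000 words: {predict(30000):.2f} sec")
```

Under these assumptions the coefficients should agree with the ones quoted above to the digits shown, and `predict(30000)` should land near the $3534$ sec figure.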
Edit
Simpler than the quadratic model would be $$t=8.79388 \times 10^{-6} w^{1.90657}$$ to which corresponds an $R^2=0.999737$. For $w=30000$, this last model would give $t=3020.87$.
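To see how differently the three fitted models extrapolate, one can simply evaluate them side by side. The sketch below hard-codes the coefficients quoted in this answer (it does not refit anything), and the function names are hypothetical.

```python
import math

# Coefficients copied from the fits quoted in this answer.
def quadratic(w):
    return 3.900822e-6 * w**2 + 0.000781384 * w

def exponential(w):
    return 4.49706 * (math.exp(0.000745006 * w) - 1.0)

def power_law(w):
    return 8.79388e-6 * w**1.90657

models = {
    "quadratic": quadratic,
    "exponential": exponential,
    "power law": power_law,
}

# Inside the measured range (w = 3000) versus far outside it (w = 30000).
for w in (3000, 30000):
    for name, model in models.items():
        print(f"w = {w:6d}  {name:12s} t = {model(w):.6g} sec")
```

All three models stay close to the measured $37.5$ sec at $w=3000$, but at $w=30000$ the exponential one is larger by many orders of magnitude, which is why it is not used for extrapolation here.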