Searching an Approximation Formula for Two Parameters

110 Views Asked by At

I have an algorithm with two parameters ($p_1$ and $p_2$) and one result ($x$). Interesting (for me) parameters and results are:

p1   p2     x
2     2     1.998836002
2    16     4.610859226
2   128    13.500098994
2  1024    40.130006757
2  8192   106.270345052
2 65536   259.381852150
3     3     3.992318015
3    24    24.722478220
3   192   207.129843032
3  1536  1509.379356327
3 12288 10150.661621094
4     4     8.319227723
4    32   141.460142017
4   256  3425.946696232
5     5    18.072054640
5    40   856.749884642
6     6    40.715789568
6    48  5200.304026741
7     7    93.505690907
8     8   221.173052610
9     9   527.337735904

How do I find an approximation formula to calculate $x$ from $p_1$ and $p_2$? Please note it doesn't need to be (actually can't be) fully accurate, because $x$ is already an approximation due to the algorithm used:

The algorithm is searching (using brute force) a hash function that can split a set of entries into exactly $n$ subsets of equal size. To do that, it tries to hash each entry of the set using an index. The probability to find an index in one step, that is to split a set into $m$ subsets of equal size $n$, is

$$ p = \frac{(mn)!}{(n!)^m m^{mn}} $$

Or we use the Stirling approximation:

$$ p = \frac{\sqrt{m}}{{(2 \pi n)} ^ \frac{(m - 1)}{2}} $$

This process is repeated until an index is found. The result ($x$) is the number of calls to the universal hash function, which is equivalent to the time it took to find a solution. So I'm trying to estimate how long it will take to split a set.

For the case where $p_1 = p_2$, I do have an approximation formula:

$$ \Big( \frac{-1}{\log{(1-p)}} \Big) ^{1.0457} \frac{3.267}{p_1} $$

1

There are 1 best solutions below

2
On BEST ANSWER

Using the ideas from Jaume Oliver Lafont (see comment above):

Some crude observations: for $p_1=2$, $x$ is not far from $\sqrt{2}$, for $p_1=3$, $x$ and $p_2$ are close, for $p_1$ 7, 8 and 9, $x$ is about $2^{p_2}$. Would linear regression on $p_1$, $log(p_2)$ and $log(x)$ make any sense?

I analysed how $x$ behaves for $p_1=p_2$. Using the online tool MyCurveFit, I found the approximation $2.37^{p_1-1.4}$. Then I found that $log(p_2)$ and $log(x)$ (if $p_1$ is the same) are related linearly. Again using MyCurveFit, I found the $m$ for that relation is about $0.34+\frac{7}{{p_1}^{2.1}}$.

Finally, I got a formula that I could simplify using Wolfram|Alpha. My approximation is now:

$0.298781 (2.37^{p_1}) (\sqrt[0.34+{p_1}^{2.1}](\frac{p_2}{p_1}))$