What code from other programming languages can precisely approximate $D$?

451 Views Asked by At

Consider set $\left\{\left. T(c,d)\right|c,d\in\mathbb{Z}\right\}$, a subset of rational numbers that is defined in the form of a double variable function. For example, $T$ could equal $\left\{\left.\frac{2c+1}{2d}\right|c,d\in\mathbb{Z}\right\}$.

I want to find a the most and efficient code from a programming language that can numerically approximate $D$

$$D=\lim_{(a,b)\to(-\infty,\infty)}\left(\lim_{(r,t)\to(\infty,\infty)}D'(a,b,r,t)\right) \quad \quad \quad \quad \quad \quad \quad (1)$$

$$D'(a,b,r,t)=\frac{\sum\limits_{\left\{k\in\mathbb{Z}\right\}\cap[0,r]}\sum\limits_{\left\{n\in\text{odd}\right\}\cap[0,t]}\left|T\cap V(a,b,k,n)\right|}{\sum\limits_{\left\{k\in\mathbb{Z}\right\}\cap[0,r]}\sum\limits_{\left\{n\in\text{odd}\right\}\cap[0,t]}\left|V(a,b,k,n)\right|} \quad \quad \ \ \ (2)$$

Where $V(a,b,k,n)=\left\{\left.\frac{m}{2^k n}\in[a,b]\right|m\in\mathbb{Z} \land \gcd\left(m,2^kn\right)=1\right\}$.

Solving $D$ explicitly is impossible to do with programming. Approximating $D$ would be simpler; however, we must view $D$ in a slightly different way.

The method to approximate $D$ should go along these lines:

For fixed values of real numbers $a,b$, we take the limit of $D'(a,b,r,t)$ as $(r,t)\to\infty$. As $a$ is fixed to a large negative number and $b$ is fixed to a large positive number $D'(a,b,r,t)$ should approximate closer to $D$ if a limit exists. (Note that I am not focusing on how $a$ varies with $b$ or vice versa as $D$ is heavily dependent on $r$ and $t$.)

To approximate $D$ in this manner, I used Mathematica where I changed $D'(a,b,r,t)$ into a different form.

$$\frac{\left|T\cap\left(\bigcup\limits_{\left\{k\in\mathbb{Z}\right\}\cap[0,r]}\bigcup\limits_{\left\{n\in\text{odd}\right\}\cap[0,t]} V(a,b,k,n)\right)\right|}{\left|\bigcup\limits_{\left\{k\in\mathbb{Z}\right\}\cap[0,r]}\bigcup\limits_{\left\{n\in\text{odd}\right\}\cap[0,t]}V(a,b,k,n)\right|} \qquad \qquad \qquad (3)$$

and created a technique in this link to approximate $D$. However the technique has two problems.

  1. It is unable to set $(r,t)\to\infty$ for fixed $a,b$
  2. The programming won't compute for larger values of $r,t$ and $a,b$

Moreover, no one was able to reduce the computation time of my solution or offer a different method from Mathematica.

I know little about other kinds of programming languages so I will ask this:

What code from other programming languages can efficiently approximate $D$ for large fixed values of $a$, $b$ as $(r,t)\to\infty$?

1

There are 1 best solutions below

5
On

So, let's have a look at $W(a,b,r,t)=\bigcup_k\bigcup_n V(a,b,k,n)$. If we put it into words, $W$ is the collection of rationals in $[a,b]$ whose irreducible form $\frac pq$ satisfies $q\le 2^{\lfloor r\rfloor}\times\lfloor t\rfloor$. To simplify a bit, I'll just write it as $W(a,b,M)=\left\{\frac pq\in[a,b]\mid \gcd(p,q)=1,\ q\le M \right\}$. If you get the prime factorization of $M$, it's easy to go back to the $2^rt$ formulation. Speaking of prime factorization, it's probably the easiest way to go through all elements of $W$ without having to check for potential duplicates. In your mathematica code, you were checking every pairs $(p,q)$ that satisfied $q\le M$ and $p\in[aq,bq]$. Instead I suggest using prime factorization to only check pairs whose gcd is $1$.

Notice that $W(a,b,M)=W(0,b,M)\cup W(a,0,M)$, and if we can find $W(0,b,M)$ we can get $W(a,0,M)$ by computing $W(0,-a,M)$ and multiplying everyone by $-1$. So let's roll with $W(0,b,M)$. Let $P$ be every prime number smaller than $Mb$ and $Q$ all those smaller than $M$. Obviously $Q\subseteq P$, if you pick a subset of coefficients $q_1,\ldots,q_J\in Q$ and integer coefficients $\beta_1,\ldots,\beta_J\ge 1$, let $q=\prod_{j=1}^Jq_j^{\beta_j}$. Then an integer $p$ defines a valid rational $\frac pq$ if and only if there exist $p_1,\ldots,p_I\in P\setminus Q$ with $\alpha_1,\ldots,\alpha_I\ge 1$ such that $p=\prod_{i=1}^Ip_i^{\alpha_i}$. So if you use some while loop and maybe a recursive function, you can go over every valid $(p,q)$ pair only once. Hopefully, you can also test then and there whether the corresponding rational belongs to $T$ or not, which spares you the need to store them in memory.

One problem you could raise is about the cost of listing the prime numbers smaller than some threshold. Honestly, no clue. Depending on what you choose, you could just get a list of them from somewhere. Actually, it's probably better if you can just treat that step as a pre-processing, so you don't really care about its computational cost. Also, if you do this smartly you can just use mathematica and bypass the 5 minutes limitation. For a given $W(a,b,r,t)$, you could split it into smaller subsets and run the computation on each smaller subsets, and add the whole thing up.


Pseudo-code (detailed)

Below is a fairly detailed pseudo-code explaining how to loop over admissible pairs $(p,q)$. I couldn't think of a simpler structure so I wanted to make sure I wouldn't neglect something important in the complexity analysis later on.

I'll assume that given a pair $(p,q)$ you have some way to test if $p/q$ belongs to $T$. Given some $q=\prod_{j=1}^Jq_j^{\beta_j}$, $b$, and $P$, it is easy to test every possible $p$. Let $P'=P\setminus\{q_1,\ldots,q_J\}$. The recursive function below can perform this test.

Function Recursive_P(
    list of prime P',
    current numerator p,
    current denominator q,
    parameter b
    ) :

%%% Initialization
count_T := 0 ;
count_W := 0 ;
Choose any p_i in P'
P' := P'\{p_i} ;    % remove p_i from the list of prime
alpha_i_max := floor( ln( (b*q)/p ) / ln( p_i ) ) ;    % compute the maximum exponent for p_i

%%% Main loop
% test other numerator values when alpha_i == 0
tmp_T,tmp_W := Recursive_P( P', p, q, b) ;    % test the other possible numerators that use the primes in P'
count_T += tmp_T ;    % update the counters
count_W += tmp_W ;

% perform the test when alpha_i > 0
for 1 <= alpha_i <= alpha_i_max :    % loop over admissible exponents
    count_W += 1 ;    % each iteration corresponds to a valid p,q pair in W
    p := p*p_i ;    % the new numerator to test

    if (p,q) in T then :    % performing the test
        count_T += 1 ;
    end if

    tmp_T,tmp_W := Recursive_P( P', p, q, b) ;    % test the other possible numerators that use the primes in P'
    count_T += tmp_T ;    % update the counters
    count_W += tmp_W ;
end for

%%% Return values
return count_T,count_W

Once we have that, it suffices to loop over the possible values of $q$

function Recursive_Q(
    list of primes Q,
    list of primes P,
    current denominator q,
    maximum magnitude M
    ) :

%%% Initialization
count_T := 0 ;
count_W := 0 ;
Choose any q_j in Q
Q := Q\{q_j} ;    % remove q_j from the list of primes
P := P\{q_j} ;
beta_j_max := floor( ln( M/q ) / ln( q_j ) ) ;    % compute the maximum exponent for q_j

%%% Main loop
% perform the test for beta_j == 0
tmp_T,tmp_W := Recursive_P( P, 1, q, b);    % test the numerators for the current q
count_T += tmp_T ;    % update the counters
count_W += tmp_W ;

tmp_T,tmp_Q := Recursive_Q( Q, P, q, M) ;    % test other combinations with the remaining primes in Q
count_T += tmp_T ;    % update the counters
count_W += tmp_W ;

% perform the test for beta_j > 0
for 1 <= beta_j <= beta_j_max :    % loop over admissible exponents
    q := q*q_j ;    % the new denominator to test

    tmp_T,tmp_W := Recursive_P( P, 1, q, b);    % test the numerators for the current q
    count_T += tmp_T ;    % update the counters
    count_W += tmp_W ;

    tmp_T,tmp_Q := Recursive_Q( Q, P, q, M) ;    % test other combinations with the remaining primes in Q
    count_T += tmp_T ;    % update the counters
    count_W += tmp_W ;
end for

%%% Return values
return count_T,count_W

In order to compute your ratio, and assuming you already know $P$ and $Q$, it suffices to call the function Recursive_Q( P, Q, 1, M) once to obtain both the numerator and denominator.


Complexity analysis

The above detailed version shows that we can implement the below pseudo-code

count_W := 0 ;
count_T := 0 ;

for p/q in W(a,b,2^rt):
    count_W += 1 ;
    if p/q in T then :
        count_T += 1 ;
    end if
end for

return count_T / count_W

When it comes to "time complexity" only, this short version is equivalent to the detailed one above. Let's say that testing if a rational $p/q$ belongs to $T$ or not has time complexity $C$. (Note that in your example, we can consider $C=1$.) Then this version has an overall "runtime complexity" $\mathcal O(C\left\lvert W\right\rvert)$.

Now this analysis excluded the required pre-processing steps, which consists in computing the list of primes $P$ and $Q$. We need every prime whose magnitude is at most $2^rt\times\max\{\lvert a\rvert;\ b\}$, I'll denote this value by $M$. I don't know that much about the structure of prime numbers among integers, but at least this is possible in time linear to $M\times\lvert P\rvert$. If you must include pre-processing, you end up with a time complexity of $\mathcal O(C\left\lvert W\right\rvert+M\lvert P\rvert)$.

Now as for memory/space complexity, this only depends on the size of $P$, and the level of recursion we reach. Each call to a "recursive_P/Q" function needs a bit of memory to store their temporary values, and the deeper you will ever go is $\lvert Q\rvert+\lvert P\rvert=\mathcal O(\lvert P\rvert)$. Each instance needs to store its own list of "remaining prime", so in the end the space/memory complexity is $\mathcal O(\lvert P\rvert^2)$. Pre-processing space complexity is only $\mathcal O(\lvert P\rvert)$, so we can ignore it in this case.


OP's previous method

Pseudo-code

W := {}    % empty list (or preferably a set?)

% building W
for k (integer) in [0; r] :
    for n (odd integer) in [0; t] :
        for m integer in [a2^rt; b2^rt] :
            add m / (2^rt) in W
        end for    % m loop
    end for    % n loop
end for    % k loop

delete duplicate rationals in W    % already done for a set


% proceed to count as in my proposal
count_W := 0 ;
count_T := 0 ;
for p/q in W:
    count_W += 1 ;
    if p/q in T then :
        count_T += 1;
    end if
end for

return count_T / count_W

Let's treat the first part as pre-processing, and the later as runtime.

The runtime part has the same time complexity $\mathcal O(C\lvert W\rvert)$ since it is basically the same. (Although I am unclear how efficient the mathematica code OP used actually is.) However this time, the loop explicitly use the whole set $W$ which needs to be stored in memory, so space complexity is $\mathcal O(\lvert W\rvert)$ as opposed to the previous $\mathcal O(\lvert P\rvert^2)$. And for large values of $r$ and $t$, $\lvert P\rvert^2$ should be a lot smaller than $\lvert W\rvert$.

Now let's take a look at the pre-processing part. The three nested loops bring us to a total time-space complexity of $\mathcal O(r\times t\times (b-a)2^rt)=\mathcal O(rtM)$. For larger values of $r$ or $t$, I intuitively think that $\lvert P\rvert$ should be smaller since prime numbers get scarcer and scarcer, but maybe not since it is a weird parametrization... In practice I think that just finding primes should be faster. Even in the off chance it is not faster, the additional step needed to delete duplicates, whether it is handled by a set-like data structure, or done after the fact, it will cost you quite a bit. According to the c++ documentation on their "set" class, expect at least $\mathcal O(rtM\log\lvert P\rvert)$ if you do the deletion properly. In summary for the pre-processing, time complexity of $\mathcal O(rtM\log\lvert P\rvert)$, and I believe $rt>\lvert P\rvert> \log\lvert P\rvert$, so that would be worse than the previous $\mathcal O(M\lvert P\rvert)$. For space complexity, your current mathematica version is $\mathcal O(rtM)$, but you could probably make it easily into $\mathcal O(\lvert P\rvert)$.


Conclusion

If $rt\log\lvert P\rvert \ge \lvert P\rvert$, then my proposed method is both asymptotically faster and uses less space. Actually, the greater the $r,t$ (or $M$) values, the more you gain. If you can split the computation with a pre-processing, it is also feasible to split $W$ into more manageable slices. If you enumerate $W$ directly, it's very easy (your current approach). If you do not fully enumerate $W$, it's possible to split the loop in my method by remembering what prime numbers with what exponent you used, but it's a little more tricky to do that.

If pre-processing/splitting the loop is possible, the greatest gain with my method is less memory usage, but I don't think you're worried about that... Except maybe if your previous method actually used up all available memory. If pre-processing is off the table, my method is much faster in its "pre-processing" steps.