Non-integer low discrepancy sequence

464 Views Asked by At

I am looking into low-discrepancy random number generators, and many of them work by generating a random integer which is then normalised (e.g. usually by dividing by $2^{32}$). Examples include Sobol, Niederreiter.

I am trying to find a random number generator which produces low-discrepancy sequences which does not produce a uniform random number by a final division by a power of two. Example of these include MCG31m1 (L'Ecuyer 1999), MRG32k3a, Wichmann-Hill, but these are not low discrepancy.

I know other low discrepancy sequences include Halton, Hammersley, van der Corput, Faure, but I don't now enough about their algorithms to know if they normalise by a power of two or work by using some intermediate integers. Ideal I would like to find one where it's algorithm (or implementation) naturally produces a random uniform, rather than a random integer (should be low discrepancy).

1

There are 1 best solutions below

0
On BEST ANSWER

There are many ways to construct low discrepancy sequences. We can group them according to how their core method of construction, as follows:

  1. Irrational fractions: Kronecker, Richtmyer, Ramshaw
  2. Prime numbers: Van der Corput, Halton, Faure
  3. Irreducible Polynomials :Niederreiter
  4. Primitive polynomials: Sobol

As you have already investigated Sobol and Niederetter sequences, this post describes how to construct sequences in the first class (Kronecker) and second class (Van der Corput / Halton) and show that neither of these depend on special integer powers.

A basic description of all these low discrepancy sequences can be found here.

Kronecker Sequences.

The one dimensional Kronecker additive recurrence sequence is defined as: $$t_n = \{\alpha_0 + n \alpha\}, \quad n=1,2,3,...$$ where $\alpha$ is any irrational number. Note that the value of $\alpha_0$ does not affect the core characteristics of the sequence and is comparable to the 'seed' when used for computing random numbers, and is often set to zero.

Also note that the mathematical notation $\{x\} $ indicates the fractional part of $x$. In computing, this function is more commonly expressed in the following way

$$t_n = \alpha_0 + n \alpha \; (\textrm{mod} \; 1); \quad n=1,2,3,...$$

Weyl proved that in the limit of $n \rightarrow \infty$, any irrational value of $\alpha$ will produce an equidistributed sequence. That is, not only is the range of infinite sequence $\{t_n\}$ dense on the interval $[0,1]$, but all finite subintervals values in this range are equally likely.

A $d$-dimensional Kronecker sequences can be constructed merely by constructing separate and independent 1-dimensional sequences for each of the $d$ components. $$ \pmb{x}_n = \pmb{\alpha}_0 + n\pmb{\alpha} \; (\textrm{mod} \; 1), \quad n=1,2,3,... $$

In such cases, the square roots of the first $d$ prime numbers is nearly always chosen, as a sensible and convenient (but not necessarily optimal) way to ensure the irrationality criterion. That is, $$\pmb{\alpha} = (\alpha_1, \alpha_2, ...\alpha_d) = (\sqrt{2},\sqrt{3},\sqrt{5}, \ldots) .$$

Anyway, back to the just the 1-dimensional sequence, again...

Equidistribution is actually quite a weak criterion and does not necessarily imply that the sequence will exhibit low discrepancy. It turns out that the values of $\alpha$ that produce low discrepancy sequences are those real values that are badly approximable. For number theoretic reasons, these special values of $\alpha$ are also correspond to numbers that are badly approximable (by the rationals). Intuitively one can say that the more badly approximable a number is by the rationals, the 'more irrational' it is. This is why the golden ratio, which is the most badly approximable number, is often described as the most irrational number. The more irrational the value of $\alpha$ is, the better ('more even') the low discrepancy sequence will be. Thus, the golden ratio $\phi = \frac{1+\sqrt{5}}{2} $ produces the optimal low discrepancy sequence.

Two lesser known facts are worth mentioning.

Firstly, any value of $\alpha$ related via the transformation, $$ \alpha = \frac{a+b\phi}{c+d\phi} \quad \textrm{for integers} \;\;a,b,c,d \;\; \textrm{where} \; |ad-bc|=1.$$ will also be optimal.

And secondly, $\alpha = 1+\sqrt{2} $ turns out to be the next most badly approximable number -- and therefore the next most optimal value for the construction of a low discrepancy sequence.

Springborn as well as Spalding give number theoretic reasons why this is the second most badly approximable number, and also why $\alpha = \frac{1}{10} (9+\sqrt{221})$ is the third-most badly approximable number.

Interestingly the continued fraction for these three values are: $$ \varphi = 1+\cfrac{1}{1+\cfrac{1}{1+\cfrac{1}{1+...}}} = [1;1,1,1,...]$$ $$ 1+\sqrt{2} = 2+\cfrac{1}{2+\cfrac{1}{2+\cfrac{1}{2+...}}} = [2;2,2,2,2,2] $$ $$ \frac{1}{10}(9+\sqrt{221}) = 2+\cfrac{1}{2+\cfrac{1}{1+\cfrac{1}{1+...}}} = [2;2,1,1,2,2,1,1,2,2,1,1,..] $$

Van der Corput Sequences.

The construction of Van der Corput sequences is often described reverse radix. As per the description here, "to get the decimal van der Corput sequence, we start by dividing the numbers 1 to 9 in tenths (x/10), then we change the denominator to 100 to begin dividing in hundredths (x/100).

In terms of numerator, we begin with all two-digit numbers from 10 to 99, but in backwards order of digits. Consequently, we will get the numerators grouped by the end digit.

Firstly, all two-digit numerators that end with 1, so the next numerators are 01, 11, 21, 31, 41, 51, 61, 71, 81, 91. Then the numerators ending with 2, so they are 02, 12, 22, 32, 42, 52, 62, 72, 82, 92. An after the numerators ending in 3: 03, 13, 23 and so on...

Thus, the sequence begins

$$\frac {1}{10},\frac {2}{10},\frac {3}{10},\frac {4}{10},\frac {5}{10},\frac {6}{10},\frac {7}{10},\frac {8}{10},\frac {9}{10},\frac {1}{100},\frac {11}{100},\frac {21}{100},\frac {31}{100},\frac {41}{100},\frac {51}{100},\frac {61}{100},\frac {71}{100},\frac {81}{100},\frac {91}{100},\ldots $$

For computing purposes the most common base is not base 10, but rather base 2.

The Halton sequence is merely the van der Corput sequence generalized to higher dimensions. The $d$-dimensional Halton sequence is constructed by combining $d$ independent van der Corput sequences for each the $d$ components. To ensure independence, each of the $d$ bases must be pairwise co-prime. For convenience, in nearly all situations the first $d$ prime numbers are chosen as the bases.

Summary

There are many methods to calculate low discrepancy sequences and most of them generalize to higher dimensions, including Sobol, Niederreiter, Halton and Kronecker. However, Sobol and Niederreiter sequences are derived by calculating a low discrepancy integer sequence and diving by fixed power. The Van der Corput sequence is also merely a sequence of rational numbers with all the denominators being a power of the given base, $b$.

Therefore, given your interest in what you refer to as 'an algorithm that naturally produces a random uniform rather than a random integer', I would say that the Kronecker-based methods are most preferable, as all the terms are intrinsically irrational.

I have written a blog post, "The unreasonable effectiveness of quasirandom sequences", that not only compares many of them, but also introduces a new Kronecker method based on a a generalization of the golden ratio, that suffers from less technical issues that are associated with many of these existing contemporary methods. Below is a visualization of five of the main methods when applied to two dimensions.

Comparison of various low discrepancy methods in two dimensions.