Intuition on Martin-Löf-Test for finite strings

273 Views Asked by At

The followng example is from An Introduction to Kolmogorov Complexity and Its Applications, Example 2.4.1. and is concerned with Martin-Löf-Tests for finite strings:

A string $x_1 x_2 \ldots x_n$ with many initial zeros is not very random. We can test this aspect as follows. The special test $V$ has critical regions $V_1, V_2, \ldots$. Consider $x = 0.x_1 x_2\ldots x_n$ as a rational number, and each critical region as a half-open interval $V_m = [0, 2^{-m})$ in $[0,1)$, $m = 1,2,\ldots$. Then the subsequent critical regions test the hypothesis "$x$ is random" by considering the subsequent digits in the binary expansion of $x$. We reject the hypothesis on the significane level $\epsilon = 2^{-m}$ provided $x_1 = x_2 = \ldots = x_m = 0$.

Now I have a problem with intuition. Consider some very irregular, uncompressible string $w$ of length $n$ which starts with zero, then $w \notin V_1$ and hence would be classified as not random by this test, just because its starts with zero but otherwise is very irregular (hence very "random")?

1

There are 1 best solutions below

2
On BEST ANSWER

The crucial point is the level of significance of the test, if you consider $\epsilon = 2^{-1}$ then you are rejecting the null conjecture (It is a random array) in many cases. For some reason you are bound to think that many $0$ may occur. So you will reject randomness at the event the first $m$ digits are zero.

You could also think that it is not very likely that a random number will have many $1$ in its dyadic decomposition. So you would consider as evidence that it is not a random number if the first $m$ digits are all $1$.

This test is going to reject randomness at level $\epsilon = 2^{-1}$ just because its starts with $1$ but otherwise will consider it as very irregular.

As any particular region of the line is arbitrary. Your criterion of a random number will always be insufficient once random numbers may indeed belong to that region ($V_m$ for instance) and non-random numbers also belong to that region.

It is a limitation we must deal with.