Determine whether a random binary sequence was generated by human or natural process

Question

Determine whether a random binary sequence was generated by human or natural process

2.2k Views Asked by Bumbble Comm At 31 Mar 2026 - 6:19

Given a binary sequence, how can I calculate the quality of the randomness?

Following the discovery that Humans cannot consciously generate random numbers sequences, I came across an interesting reference:

"A professor of probability would have half of his class write down a list of zeros and ones from a bunch of coin tosses and the other half write down a list of zeros and ones that they would try to make look random. The professor could sort out the papers with only a glance at each paper."

What method would be suitable to replicate the professor's act, i.e. judging which sequence is likely to be generated by a random process?

A method I have in mind is: for a given sequence-length $ n $, establish the frequencies of each possible sub-sequence in the population of all possible sequences and than compare these with a particular sequence.

As this very quickly becomes impractical (the number of sub-sequences grows exponentially with $ n $) the method may, instead, only measure a subset of all sub-sequences, e.g. all sequences of same digit.

How good would such a method work? What are some better methods?

Original Q&A

There are 2 best solutions below

Bumbble Comm On 13 Jun 2016 - 12:28

In a random sequence of $n$ tosses of a fair coin,

the expected total number of runs is $\frac{n+1}{2}$
the expected number of runs of length $L$ or more is $\frac{n+2-L}{2^L}$ when $L \le n$
the expected number of runs of exactly length $L$ is $\frac{n+3-L}{2^{L+1}}$ when $L \lt n$
the expected number of runs of exactly length $n$ is $\frac{1}{2^{n-1}}$

An approximation to this is to say that:

the total number of runs will be about half the total number of tosses
about half the runs will be of length $1$
the number of runs of exactly length $L$ will be about half the number of length $L-1$
the number of runs of at least length $L$ will be about equal to the number of length $L-1$

So for example with $100$ tosses,

the expected total number of runs is $50.5$
the expected number of runs of length $1$ is $25.5$
the expected number of runs of length $2$ is $12.625$
the expected number of runs of length $3$ is $6.25$
the expected number of runs of length $4$ or more is $6.125$

and it would not be too difficult to judge this visually

**user347499** · Accepted Answer

Although, I do think it might be more fitting to ask somewhere else, I will try to answer this from a mathematical perspective.

As you noted, counting subsequences can get impractical, and doesn't, in a reasonable pratical environment, give ideal results, instead a varity of other techniques can be used. These are known as randomness tests.

Your example is one of many randomness tests. Other well-known tests are (I take the liberty to reframe your question in terms of sequence, $ f : \mathbb{N} \to \mathbb{Z}_n $ for some $ n $),

The Wald–Wolfowitz runs test. Extract uniformly distributed reals on the interval $ (0, 1) $. Count ascending and descending runs. We know that the mean ought to be $ \frac{2 N_{+} N_{-}}{N} + 1 $. It follows that the variance will ideally be $ \sigma^2 = \frac{(\mu - 1)(\mu - 2)}{N - 1} $ (with $ N_+ $ being number of ascending runs and $ N_- $ being the number of descending runs, and $ N = N_+ + N_- $).
The overlapping sum test. Generate $ n $ reals on $ (0, 1) $, add sequences of $ m < n $ consecutive entries. The sums should then be normally distributed with characteristic mean and variance.
Matrix rank test. Through the random sequence, form a matrix over $ \mathbb{Z}_n $, then determine the rank of the matrix, and examine the distribution.

To make it even more effective, one can add a number of classes of bijective combining functions, $ g_m: \mathbb{Z}_n \to \mathbb{Z}_n $. One can then construct a new sequence by $ f'(n) = g_n(f(n)) $ and perform the randomness tests on $ f' $. Examples of such function classes include,

Adding sequent elements, $ g_n(x) = x + g_{n - 1}(f(n - 1)) $, with $ g_0(x) = x $.
Constant addition, $ g_n(x) = x + c $ for some $ c $.
Multiplying by some $ p $ relatively prime with the modulo, $ g_n(x) = px $.

and so on.

This is used as a tool for extracting patterns (e.g. weakening the function by identifying and abusing its patterns). Since a truely random sequence is pattern-less, this can be used for "measuring" randomness.

Determine whether a random binary sequence was generated by human or natural process

There are 2 best solutions below

Related Questions in PROBABILITY-THEORY

Related Questions in PROBABILITY-DISTRIBUTIONS

Related Questions in RANDOM-VARIABLES

Related Questions in RANDOM

Trending Questions

Popular # Hahtags

Popular Questions