Expected number of 1s for a random integer

198 Views Asked by At

For an integer $K$ randomly chosen from $0,1,...,N$. What is the expected number of ones in $K$'s binary representation?

A special case of the problem is when $N=2^k - 1$, in which the expected number of ones is $\frac{k}{2}$.

2

There are 2 best solutions below

0
On

There is no very simple expression like you have for $N=2^k-1$ but that points you in the right direction. Let us define $q(N)$ as the total number of $1$'s in $[0,N]$, so your average is $\frac {q(N)}{N+1}$. As you say, if $N=2^k-1, q(N)=k2^{k-1}$. Otherwise, if $k$ is the largest number such that $2^k-1 \lt N$, we have $q(N)=k2^{k-1} + (N-2^k+1) + q(N-2^k+1)$. The first term accounts for all the numbers $[0,2^k-1]$, the second accounts for the leading bit of the numbers $[2^k,N]$ and the third has all the bits after the first from $[2^k,N]$

0
On

The math works out more tidily if we use a half-open interval and ask "what is the expected number of ones in the binary representation of a randomly selected integer from $[0, N)$?" As you say, if $N = 2^p$, the answer is $\frac{p}{2}$.

In general, let

$$N = \sum_{i=1}^{k}2^{p_i} \space \space \small{( p_1 > p_2 > ... > p_k)}$$

which is just the binary expansion of $N$. Then we can compute the expected number of ones as:

$$ \large\frac{\sum\limits_{i=1}^{k} {2^{p_i}(\frac{p_i}{2}+i-1)}}{\sum\limits_{i=1}^{k}2^{p_i}} $$

where the numerator is the total number of ones in all the integers in the range and the denominator is $N$. Each value in the sum in the numerator corresponds to a range of values $M, M+1, ..., M+2^p - 1$ where $M$ has exactly $i-1$ one bits, all of which are larger than $2^p$.

Note that if $k$ is $1$, which is to say that $N$ is a power of two, this reduces, as expected, to:

$$ \frac{{2^p(\frac{p}{2})}}{2^p} = \frac{p}{2}$$


Here's the computation in Python:

def ones(n):
  bits = bin(n)[2:]
  l = len(bits) - 1
  p=[l-i for i in range(l + 1) if bits[i] == '1']
  # Avoid floating point, for aesthetics sake.
  return sum(2**p[i] * (2*i + p[i]) for i in range(len(p))) / 2

def expected_ones(n):
  return ones(n) / float(n)

And here's a graph (but not a pretty one): gnuplot special