Given a positive integer, find its sparsest binary representation using coefficients in $\{0,\pm 1\}$

148 Views Asked by At

We know that any positive integer can be represented as

$$\displaystyle\sum_{i=0}^\infty a_i \cdot 2^i$$

where $a_i \in \{ 0, \pm 1 \}$. Given an arbitrarily large integer, how can we find its sparsest binary representation, i.e., the combination containing the least possible number of nonzero $a_i$’s?

4

There are 4 best solutions below

2
On BEST ANSWER

One algorithm is to write the number in standard binary. Starting from the $a_0$ term, if you have two terms $a_i,a_{i+1}$ that are both $1$, you can replace them with $a_i=-1,a_{i+2}=1$. If $a_{i+2}$ was already $1$, it is now $2$ and you can carry, setting $a_{i+2}=0, a_{i+3}=1$. Keep going. This will terminate when you get at most one bit higher than you started with, as you know that bit started out zero. As you make two passes through the binary expansion of the number, this is $O(\log n)$

1
On

Write the number in binary (if it is negative write its absolute value) and suppose that there are $k$ blocks of consecutive ones.

start with the answer $2k$.

For every two consecutive blocks with exactly one $0$ between them subtract $1$ to the answer.

For every block of size $1$ that does not have exactly one zero between the next block subtract $1$ to the answer.

The remaining value is your result.

Some c++ code:

#include <bits/stdc++.h>
using namespace std;

const int MAX=100;
vector <int> A(MAX);
vector <int> L; // left end of blocks
vector <int> R; // right end of blocks

void tobin(int x){
    for(int i=0;i<MAX;i++){
        A[i]=x%2;
        x/=2;
    }
}

int main(){
    int n;
    scanf("%d",&n);
    tobin(n);
    for(int i=0;i<MAX;i++){
        if( (i==0 || A[i-1]%2 == 0 ) && A[i]%2 ) L.push_back(i);
        if( (i==MAX-1 || A[i+1]%2 == 0 ) && A[i]%2 ) R.push_back(i);
    }
    int res=2*L.size();
    for(int i=0;i+1<L.size();i++){
        if(R[i]+2== L[i]) res--;
    }
    for(int i=0;i<L.size();i++){
        if(L[i]== R[i] && (i==L.size()-1 || R[i]+2 != L[i]) ) res--;
    }
    printf("%d\n",res);
}
3
On

I think the most efficient representation would be to come up with a representation for sequences of ones in the binary representation.

Then I don't really see how you could beat following heuristics:

  1. a single occurrence of 1 is just that term
  2. double occurrence, for convenience just the terms
  3. three or longer (bits set from $k$ up to $l$ (not inclusive) -> $2^l-2^k$)
  4. a single $0$ in a series of $1$'s can just be subtracted
0
On

Okay, to elaborate on my comment (It works even better than I would have thought):

Let us say we want to represent the number $K$. Using the first n "digits" $a_0,...,a_{n-1}$ you can only create the numbers between $-2^n+1$ (taking all to be $-1$) and $2^n-1$ (taking all as 1). On the other hand, the higher digits $a_n,a_{n+1},...$ and so on do not change the number mod $2^n$. Together this means that either $\sum_{i=0}^{n} a_i 2^i = (K \mod 2^n)$ or $\sum_{i=0}^{n} a_i 2^i = (K \mod 2^n)-2^n$. (Or we are in the edge case that $(K \mod 2^n) =0$.)

But then we can calculate the optimal representation of $(K \mod 2^n)$ and $(K \mod 2^n) -2^n$ recursively: Assume we know the optimal representation of $(K \mod 2^{n-1})$ and $(K \mod 2^{n-1}) -2^{n-1}$.

Now we find the optimal one for $(K \mod 2^n)$. We can either have $(K \mod 2^n) = (K \mod 2^{n-1})+2^{n-1}$, in this case the only representation is the one of $(K \mod 2^{n-1})$ with $a_{n-1} =1$. Or we have $(K \mod 2^n) = (K\mod 2^{n-1})$, in this case we can take either the representation of $(K \mod 2^{n-1})$ and $a_{n-1} = 0$ or the representation of $(K \mod 2^{n-1}) -2^{n-1}$ and $a_{n-1} = -1$, whichever is shorter.

The optimal representation for $(K \mod 2^{n})- 2^n$ can be found in a similar way. Repeat this until you reach $(K \mod 2^n) = K$