Inversion of the Burrows Wheelers Transform

187 Views Asked by At

The "Burrows-Wheeler Transform" in signal processing is a transformation which is used in for instance data compression and pattern recognition.

It can be described in mathematical terms as:

Start with a column vector $\bf v$.

  1. Build the circulant matrix $\bf C_v$ which has ${\bf v}^t$ as it's first row and all subsequent rows are all possible circular shifts.
  2. Perform a lexical sort on the rows ( series of row operations ).
  3. Store the last column together with the index for the first (or last) element of the initial sequence. Let us call it $\bf b$.

Is it reasonable that there would exist a permutation matrix $\bf P$ which we can find with the information above, such that we can determine the initial $\bf v$ as a polynomial ${\bf v} = \sum_{k=0}^{n} \bf P^k {\bf b}$?

2

There are 2 best solutions below

2
On BEST ANSWER

I would say basically no. $|{\bf v}| = |{\bf b}|$. If we assume all elements of $\mathbf{v}$ are positive, then $\left|\sum_{k=0}^{n} {\bf P}^k {\bf b}\right| > |{\bf v}|$ for $n > 0$.

1
On

As the question stands, I do not think such a sum equality to be possible, see for instance NovaDenizens answer above.

It is possible to create a permutation matrix $\bf P$ such that the last element of $\bf P^kb$ is the $k$:th position of the inverse transform. I.e. we build a vector $\bf v$ such that ${\bf v}_k = ({\bf P^kb})_{end}$. So in some sense this $\bf P$ could be viewed as a generating element for the inverse-transformed string. Since ${\bf P^{k+1}b} = \bf{P (P^k b)}$, we can store the vector ${\bf P^k b}$ and so we just have to multiply to the left with $\bf P$ once for every new letter.


  1. First sort the vector b and note the order of the sorting.
  2. The rows position will just be the order of this sorting.

Assuming aIn is the encoded string, the following Matlab / Octave code explains:

[~,i_] = sort(aIn);

P = sparse(i_,1:numel(i_),ones(numel(i_,1)));


But we can of course also produce and store powers of $\bf P$. This could be beneficial if we have multi-core processing. Imagine having ${\bf v}_1$ through ${\bf v}_8$ and ${\bf P}^8$ stored. Since permutation matrices remain permutation matrices with multiplication, then 8 independent processes could generate separate parts of our string, and they would also be guaranteed to run equally fast.


So if we reformulate the question to

"Is it possible to find a permutation matrix $\bf P$ such that if

  1. $\bf C$ is forward rotation 1 step ( generating element of the cyclic group: ${\bf C} = \left[\begin{array}{cc} 0&1\\ {\bf I}&0 \end{array}\right]$),
  2. $\bf D$ is the diagonal matrix with 1 lower right corner

then we can recover the original message: $$\sum {\bf C}^k{\bf DP}^k{\bf b} = {\bf v}$$