I am studying Linear Feedback Shift Registers (LFSRs), which produce sequences of binary numbers.
One way of generating the sequence is to update the $n$-bit internal state "register" $r$ by left-multiplying it by an $(n \times n)$ binary matrix $M$ (in GF(2)):
$$ r_{k+1} = Mr_{k} $$
The binary output sequence ($b_0, b_1, \dots$) can then be read from any fixed bit position of $r_{k+1}$, provided the same position is used for every $k=0,1,2,\dots$ (and provided the initial state $r_0$ is not all zeros).
The particular structure of $M$ I am interested in is the Galois (right-shift) form:
$$ M=\left[ \begin{array}{ccccc} c_{0} & 1 & 0 & \cdots & 0 \\ c_{1} & 0 & 1 & \ddots & \vdots \\ \vdots & \vdots & \ddots & \ddots & 0 \\ c_{n-2} & 0 & \cdots & 0 & 1 \\ c_{n-1} & 0 & \cdots & \cdots & 0% \end{array}% \right] $$
The value of $c_{n-1}$ is guaranteed to be 1. All other $c_0, c_1, \dots, c_{n-2}$ may be either 0 or 1.
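To make this concrete, here is a minimal sketch in Python/NumPy of the setup above; the function names `galois_matrix` and `step` are my own, and all arithmetic is done mod 2 to stay in GF(2):

```python
import numpy as np

def galois_matrix(c):
    """Build the n x n Galois (right-shift) update matrix M over GF(2):
    the first column holds the taps c_0..c_{n-1}, the superdiagonal is all 1s."""
    n = len(c)
    M = np.zeros((n, n), dtype=np.uint8)
    M[:, 0] = c
    for i in range(n - 1):
        M[i, i + 1] = 1
    return M

def step(M, r):
    """One LFSR update r_{k+1} = M r_k, reduced mod 2."""
    return (M @ r) % 2

# Taps from the first example below: c = [1, 1, 0, 1]
M = galois_matrix([1, 1, 0, 1])
r = np.array([1, 0, 0, 0], dtype=np.uint8)  # an arbitrary nonzero initial state
r = step(M, r)
```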
I became interested in trying to calculate the initial state $r_0$, based on the observation of the first $n$ output bits $b_0, b_1, \dots, b_{n-1}$. Specifically, I am interested in the case where the output bit $b_k$ is taken from the last element of $r_{k+1}$.
This means that: $$ \begin{aligned} b_0 &= \text{last\_row}\{M\}\,r_0 \\ b_1 &= \text{last\_row}\{M^2\}\,r_0 \\ &\;\;\vdots \\ b_{n-1} &= \text{last\_row}\{M^n\}\,r_0 \end{aligned} $$
So I wrote this as a matrix equation (in GF(2)): $$ b = Br_0 $$
Where $b$ is the column vector $[b_0, b_1, \dots, b_{n-1}]^T$ and $B$ is the $(n \times n)$ binary matrix formed by stacking the last row of $M^k$ for $k=1,2,\dots,n$. Then I simply solve: $$ r_0 = B^{-1}b $$
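The recovery procedure can be sketched as follows; `build_B` and `gf2_inv` are names of my own invention, and the GF(2) inverse is computed by ordinary Gauss-Jordan elimination with XOR row operations:

```python
import numpy as np

def build_B(M):
    """Stack last_row(M^k) for k = 1..n, with all arithmetic mod 2."""
    n = M.shape[0]
    P = np.eye(n, dtype=np.uint8)
    rows = []
    for _ in range(n):
        P = (P @ M) % 2
        rows.append(P[-1, :].copy())
    return np.array(rows, dtype=np.uint8)

def gf2_inv(A):
    """Invert a binary matrix over GF(2) via Gauss-Jordan elimination."""
    n = A.shape[0]
    aug = np.concatenate([A % 2, np.eye(n, dtype=np.uint8)], axis=1)
    for col in range(n):
        pivot = next(r for r in range(col, n) if aug[r, col])  # assumes A invertible
        aug[[col, pivot]] = aug[[pivot, col]]                  # swap pivot row up
        for r in range(n):
            if r != col and aug[r, col]:
                aug[r] ^= aug[col]                             # XOR = add in GF(2)
    return aug[:, n:]

# M from the first example below; pretend r0 is unknown and recover it.
M = np.array([[1, 1, 0, 0],
              [1, 0, 1, 0],
              [0, 0, 0, 1],
              [1, 0, 0, 0]], dtype=np.uint8)
B = build_B(M)
r0 = np.array([1, 0, 1, 1], dtype=np.uint8)  # the "unknown" initial state
b = (B @ r0) % 2                             # the n observed output bits
recovered = (gf2_inv(B) @ b) % 2
```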
When I looked at the structure of $B^{-1}$ I saw a remarkably simple pattern. It is a unit lower-triangular binary Toeplitz matrix: the main diagonal is always 1, and the $k$-th subdiagonal (for $k = 1, \dots, n-1$) equals $c_{k-1}$, taken directly from $M$.
For example, if $[c_0, c_1, c_2, c_3] = [1,1,0,1]$ then the 1st and 2nd subdiagonals of $B^{-1}$ are 1 (because $c_0 = c_1 = 1$): $$ M=\left[ \begin{array}{cccc} 1 & 1 & 0 & 0 \\ 1 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \\ 1 & 0 & 0 & 0% \end{array}% \right] $$
$$ B^{-1}=\left[ \begin{array}{cccc} 1 & 0 & 0 & 0 \\ 1 & 1 & 0 & 0 \\ 1 & 1 & 1 & 0 \\ 0 & 1 & 1 & 1% \end{array}% \right] $$
Similarly, if $[c_0, c_1, c_2, c_3, c_4] = [0,1,1,0,1]$ then the 2nd and 3rd subdiagonals of $B^{-1}$ are 1 (because $c_1 = c_2 = 1$):
$$ M=\left[ \begin{array}{ccccc} 0 & 1 & 0 & 0 & 0 \\ 1 & 0 & 1 & 0 & 0 \\ 1 & 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 0 & 1 \\ 1 & 0 & 0 & 0 & 0% \end{array}% \right] $$
$$ B^{-1}=\left[ \begin{array}{ccccc} 1 & 0 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 & 0 \\ 1 & 0 & 1 & 0 & 0 \\ 1 & 1 & 0 & 1 & 0 \\ 0 & 1 & 1 & 0 & 1% \end{array}% \right] $$
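The conjectured pattern, and an exhaustive check of it for one small $n$, can be sketched as below; `predicted_inverse` and `build_B` are hypothetical names of mine, not standard functions:

```python
import numpy as np
from itertools import product

def predicted_inverse(c):
    """The conjectured B^{-1}: unit lower-triangular Toeplitz,
    with the k-th subdiagonal equal to c_{k-1} (c_{n-1} never appears)."""
    n = len(c)
    T = np.eye(n, dtype=np.uint8)
    for k in range(1, n):
        if c[k - 1]:
            for i in range(k, n):
                T[i, i - k] = 1
    return T

def build_B(c):
    """B from the question: last rows of M^1, ..., M^n over GF(2)."""
    n = len(c)
    M = np.zeros((n, n), dtype=np.uint8)
    M[:, 0] = c
    for i in range(n - 1):
        M[i, i + 1] = 1
    P = np.eye(n, dtype=np.uint8)
    rows = []
    for _ in range(n):
        P = (P @ M) % 2
        rows.append(P[-1, :].copy())
    return np.array(rows, dtype=np.uint8)

# Exhaustively confirm B * T = I (mod 2) for every tap pattern with c_{n-1} = 1.
n = 6
for bits in product([0, 1], repeat=n - 1):
    c = list(bits) + [1]  # c_{n-1} is guaranteed to be 1
    B, T = build_B(c), predicted_inverse(c)
    assert ((B @ T) % 2 == np.eye(n, dtype=np.uint8)).all()
```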
I checked this for millions of cases and it never failed. I feel like I am missing a simple explanation for why this must always be true. Have I perhaps over-complicated matters by involving a matrix inverse? Can this result be proved to hold in general?
EDIT: I have written a (matrix) proof based on "State Recovery with LFSR Shifting, Method B" described at this link. However, it runs to several pages of work, so I would really like to see if there is a more direct way of proving this result.