How to calculate a power series of a matrix with a large norm.


Take this matrix representing the Markov map for seeing a Heads followed by a Tails in a sequence of fair coin flips: $$M = \begin{pmatrix} \frac{1}{2} & \frac{1}{2} & 0 \\ \frac{1}{2} & 0 & \frac{1}{2} \\ 0 & 0 & 1 \end{pmatrix}$$ Since the final state is absorbing we can reason that $$\lim_{n\rightarrow \infty} M^n = L = \begin{pmatrix} 0 & 0 & 1 \\ 0 & 0 & 1 \\ 0 & 0 & 1 \end{pmatrix}$$
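A quick numerical sanity check (a sketch using numpy, with the matrices above) confirms that the powers of $M$ do approach $L$:

```python
import numpy as np

# The Markov map from the question (absorbing state = "seen HT").
M = np.array([[0.5, 0.5, 0.0],
              [0.5, 0.0, 0.5],
              [0.0, 0.0, 1.0]])

# Claimed limit of M^n.
L = np.array([[0.0, 0.0, 1.0],
              [0.0, 0.0, 1.0],
              [0.0, 0.0, 1.0]])

# A high power of M is already numerically indistinguishable from L.
print(np.round(np.linalg.matrix_power(M, 100), 6))
```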

I am interested in the following limit for matrices like this: $$\lim_{n \rightarrow \infty } \left( n M^n - \sum_{i=0}^{n-1}M^i \right)$$ It would seem that I can easily replace $M^n$ with $L$ in the first term, but what should I do with the sum?

Ordinarily I'd look at that sum and think of the geometric series $\sum_{i=0}^{\infty} M^i = (I-M)^{-1}$, except that the spectral radius of $M$ is $1$ (it is a stochastic matrix), so that series does not converge.

It seems (from numerical evidence) that in this case, for large $n$, $$\sum_{i=0}^{n-1}M^i \approx \begin{pmatrix} 4 & 2 & n-6 \\ 2 & 2 & n-4 \\ 0 & 0 & n \end{pmatrix}$$ where the entries not involving $n$ converge to the values shown.

This means that the limit above exists; indeed I get $$ \lim_{n \rightarrow \infty } \left( n M^n - \sum_{i=0}^{n-1}M^i \right) = \begin{pmatrix} -4 & -2 & 6 \\ -2 & -2 & 4 \\ 0 & 0 & 0 \end{pmatrix}$$
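This is easy to check numerically; here is a small numpy sketch that accumulates the partial sums directly:

```python
import numpy as np

M = np.array([[0.5, 0.5, 0.0],
              [0.5, 0.0, 0.5],
              [0.0, 0.0, 1.0]])

n = 200
Mi = np.eye(3)              # holds M^i
partial = np.zeros((3, 3))
for _ in range(n):
    partial += Mi           # accumulate sum_{i=0}^{n-1} M^i
    Mi = Mi @ M             # after the loop, Mi == M^n

# n*M^n - sum_{i=0}^{n-1} M^i, which should be near the claimed limit.
print(np.round(n * Mi - partial, 6))
```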

But how can I do this in general (without a lot of numerical computation each time)? I've tried a few different such Markov maps and I always seem to get convergence, even though my matrices always have spectral radius equal to one, so the usual power-series argument never applies. I'd appreciate some help!

Thanks in advance.

UPDATE: See greg's post below. Basically, if we assume the limit exists (call it $S$), we can do the following:

$$ n M^n - \sum_{i=0}^{n-1}M^i = \sum_{i=0}^{n-1} (M^n - M^i) $$ Then $$ S = \lim_{n\rightarrow \infty} \sum_{i=0}^{n-1} (M^n - M^i) $$ Multiplying the sum by $M$ shifts the index by one: in the limit this trades the $i=0$ term, $M^n - I \rightarrow L - I$, for a term that vanishes, so

$$ S - L = M S - I \Rightarrow (M - I)S = I - L$$

which is the same as what greg had. The problem is that you can't take the inverse of $(M-I)$: since $M$ is stochastic, $1$ is always an eigenvalue, so $(M-I)$ is always singular.

At this point I noticed that I can always expect the last row of $S$ to be the zero vector as long as $M$ is indeed a right stochastic matrix (as is the case for what I'm doing). In fact, the last row of $M$ in an absorbing Markov map is always $(0,\ldots,0,1)$. So in some sense I'm really mostly interested in the upper rows of the matrix $M$.

Introduce $T$ where $T$ is the identity matrix with the last row zeroed out.

$$ T = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 0 \end{pmatrix} $$

Then $ S = T S $ (since the last row of $S$ vanishes), and we can replace $M \rightarrow T M$ and $ L \rightarrow T L$ and follow the logic above to get $$ (T M - I) S = T - T L = I - L $$ Now we can invert $(T M - I)$ and solve for the limit $S$:

$$ S = (T M - I)^{-1} (I - L) $$ Why should $(T M - I)$ be non-singular? Writing $M$ in block form with transient block $Q$, we have $TM = \begin{pmatrix} Q & r \\ 0 & 0 \end{pmatrix}$, so the eigenvalues of $TM$ are those of $Q$ together with $0$; for an absorbing chain the eigenvalues of $Q$ all have modulus less than one, so $1$ is never an eigenvalue of $TM$ and $(TM - I)$ is invertible. This matches what I found for many different choices of $M$ as right-stochastic matrices with the last row ending in one.
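As a sanity check, here is a short numpy sketch of this closed form, using the $M$, $L$, and $T$ from above:

```python
import numpy as np

M = np.array([[0.5, 0.5, 0.0],
              [0.5, 0.0, 0.5],
              [0.0, 0.0, 1.0]])
L = np.array([[0.0, 0.0, 1.0]] * 3)
I = np.eye(3)
T = np.diag([1.0, 1.0, 0.0])  # identity with the last row zeroed out

# Solve (T M - I) S = I - L for the limit S.
S = np.linalg.solve(T @ M - I, I - L)
print(np.round(S, 6))
```

The result matches the limit found numerically earlier.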

I feel like I've answered my own question, although greg's post contains most of this information (before I had a chance to write this all down).



On BEST ANSWER

Go ahead and substitute $L$, and your sum becomes $$ \sum_{i=0}^{n-1} (L-M^i) $$ There is some finite index $k$ for which $M^k$ is indistinguishable from $L$. So terms beyond that index will contribute nothing to the sum. Once you determine what the cut-off value is, you are left with a finite summation $$ \sum_{i=0}^{k} (L-M^i) $$
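This truncation idea can be sketched in numpy: iterate until $M^i$ is numerically indistinguishable from $L$ (the tolerance $10^{-12}$ here is an arbitrary choice):

```python
import numpy as np

M = np.array([[0.5, 0.5, 0.0],
              [0.5, 0.0, 0.5],
              [0.0, 0.0, 1.0]])
L = np.array([[0.0, 0.0, 1.0]] * 3)

# Accumulate L - M^i until M^i is numerically indistinguishable from L.
S = np.zeros((3, 3))
Mi = np.eye(3)
k = 0
while not np.allclose(Mi, L, rtol=0.0, atol=1e-12):
    S += L - Mi
    Mi = Mi @ M
    k += 1

print(k)                  # the cut-off index for this tolerance
print(np.round(S, 6))
```

The finite sum reproduces the limit matrix from the question.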

Update

Denote the $k^{th}$ partial sum as $$ S_k = \sum_{i=0}^{k} (L-M^i) $$ Using the fact that $ML=LM=L$ we easily find that $LS_k=S_kL=0$.

With a little more effort we can find $$ \eqalign{ S_kM = MS_k &= M\sum_{i=0}^{k} (L-M^i) \cr &= \sum_{i=0}^{k} (LM-M^{i+1}) \cr &= \sum_{i=1}^{k+1} (L-M^{i}) \cr &= (L-M^{k+1}) - (L-I) + \sum_{i=0}^{k} (L-M^i) \cr &= (I-L) + S_k \cr }$$ where $L=M^{k+1}$ for a large enough $k$.

Finally we can solve for $S_k$ $$ \eqalign{ (M-I)S_k &= (I-L) \cr S_k &= (M-I)^{+}(I-L) \cr }$$ This should be the solution, but the solution found in the problem statement turns out to be this $$ \eqalign{ S_k &= (I-L)(M-I)^{+}(I-L) \cr }$$ But I don't see why I'm required to multiply by $(I-L)$.
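Both formulas can be checked numerically with numpy's `pinv`. As for why the extra $(I-L)$ factor is needed: the equation $(M-I)S_k = I-L$ only determines $S_k$ up to adding a multiple of the all-ones vector (the kernel of $M-I$) to each column. The bare $(M-I)^{+}(I-L)$ picks the minimum-norm solution, while left-multiplying by $(I-L)$ subtracts the last row from every row, selecting the solution whose last row vanishes, which is the one the limit actually equals.

```python
import numpy as np

M = np.array([[0.5, 0.5, 0.0],
              [0.5, 0.0, 0.5],
              [0.0, 0.0, 1.0]])
L = np.array([[0.0, 0.0, 1.0]] * 3)
I = np.eye(3)
target = np.array([[-4.0, -2.0, 6.0],
                   [-2.0, -2.0, 4.0],
                   [ 0.0,  0.0, 0.0]])

P = np.linalg.pinv(M - I)

# The bare pseudo-inverse picks the minimum-norm solution, not the limit...
print(np.allclose(P @ (I - L), target))            # False
# ...while sandwiching with (I - L) zeroes out the kernel component.
print(np.allclose((I - L) @ P @ (I - L), target))  # True
```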

ANSWER

Let $\chi(x)$ be the characteristic polynomial for $M$ and $\lambda_1, \lambda_2, \lambda_3$ be its eigenvalues.
WLOG, we will assume $|\lambda_1| \ge |\lambda_2| \ge |\lambda_3|$. We have

$$\chi(x) \stackrel{def}{=} \det( xI_3 - M ) = x^3 - \frac32 x^2 + \frac14 x + \frac14 = \prod_{i=1}^3 (x-\lambda_i) \;\text{ with }\; \begin{cases} \lambda_1 &= 1\\ \lambda_2 &= \frac{1+\sqrt{5}}{4}\\ \lambda_3 &= \frac{1-\sqrt{5}}{4} \end{cases}$$

The key is $1$ is an eigenvalue and the three eigenvalues $\lambda_i$ of $M$ are all simple.

Applying the Lagrange interpolation formula to the constant polynomial $1$ at the 3 points $\lambda_i$, we have

$$1 = \sum_{i=1}^3 \frac{1}{\chi'(\lambda_i)}\prod_{j=1,\ne i}^3( x - \lambda_j) = \sum_{i=1}^3 \prod_{j=1,\ne i}^3 \frac{x - \lambda_j}{\lambda_i - \lambda_j} $$

Lifting this polynomial identity to the polynomial ring generated by $M$, we obtain a decomposition of the identity matrix $I_3$.

$$I_3 = \sum_{i=1}^3 \Delta_i \quad\text{ where }\quad \Delta_i \stackrel{def}{=} \frac{1}{\chi'(\lambda_i)}\prod_{j=1,\ne i}^3 ( M - \lambda_j I ) = \prod_{j=1,\ne i}^3\frac{ M - \lambda_j I }{\lambda_i-\lambda_j}$$

Recall the Cayley–Hamilton theorem and notice $$\chi(x) = (x-\lambda_i) \prod_{j=1,\ne i}^3(x - \lambda_j) \quad\text{ for } i = 1,2,3 $$ We find $$\chi(M) = 0 \quad\implies\quad ( M - \lambda_i I )\Delta_i = 0 \quad\iff\quad M \Delta_i = \lambda_i \Delta_i$$ This in turn implies that for any polynomial $p(x)$, we have $$p(M) \Delta_i = p(\lambda_i) \Delta_i, \quad\text{ for } i = 1,2,3$$ One consequence of this is

$$\Delta_i \Delta_j = \left(\prod_{k = 1,\ne i}^3\frac{M - \lambda_k I_3}{\lambda_i - \lambda_k} \right)\Delta_j = \left(\prod_{k = 1,\ne i}^3\frac{\lambda_j - \lambda_k}{\lambda_i - \lambda_k}\right)\Delta_j = \begin{cases}\Delta_j, & j = i\\ 0 & \text{otherwise}\end{cases}\tag{*1} $$ Another consequence is we can reduce any polynomial $p(M)$ in $M$ to a quadratic one:

$$p(M) = p(M) I_3 = p(M)\sum_{i=1}^3 \Delta_i = \sum_{i=1}^3 p(\lambda_i)\Delta_i\tag{*2} $$
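These projectors are easy to build and test numerically; here is a numpy sketch, with the eigenvalues hard-coded from the characteristic polynomial above:

```python
import numpy as np

M = np.array([[0.5, 0.5, 0.0],
              [0.5, 0.0, 0.5],
              [0.0, 0.0, 1.0]])
I = np.eye(3)

# Eigenvalues read off the characteristic polynomial above.
lam = [1.0, (1 + np.sqrt(5)) / 4, (1 - np.sqrt(5)) / 4]

# Build each projector Delta_i as the product of its Lagrange factors.
Delta = []
for i in range(3):
    D = I.copy()
    for j in range(3):
        if j != i:
            D = D @ (M - lam[j] * I) / (lam[i] - lam[j])
    Delta.append(D)

print(np.allclose(sum(Delta), I))                    # decomposition of I_3
print(np.allclose(M @ Delta[1], lam[1] * Delta[1]))  # M Delta_i = lambda_i Delta_i
print(np.allclose(Delta[1] @ Delta[2], 0))           # orthogonality (*1)
```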

Applying $(*2)$ to the polynomial $p_n(x) = nx^n - \sum_{k=0}^{n-1} x^k$ and noticing $p_n(1) = 0$, we find

$$nM^n - \sum_{k=0}^{n-1} M^k = p_n(M) = p_n(\lambda_2)\Delta_2 + p_n(\lambda_3)\Delta_3$$

Notice $|\lambda_2|, |\lambda_3| < 1$ and for any $|\lambda| < 1$,

$$\lim_{n\to\infty} p_n(\lambda) = \lim_{n\to\infty} \left( n \lambda^n - \frac{\lambda^n -1 }{\lambda - 1} \right) = \frac{1}{\lambda - 1}$$ We find

$$ \lim_{n\to\infty} p_n(M) = \frac{\Delta_2}{\lambda_2-1} + \frac{\Delta_3}{\lambda_3-1} = \left[\frac{M-\lambda_3}{(\lambda_2-1)^2(\lambda_2 - \lambda_3)} + \frac{M-\lambda_2}{(\lambda_3-1)^2(\lambda_3 - \lambda_2)}\right](M-I_3) $$ Using the explicit values of $\lambda_2, \lambda_3 = \frac{1\pm\sqrt{5}}{4}$, it is not hard to check $$\frac{x-\lambda_3}{(\lambda_2-1)^2(\lambda_2 - \lambda_3)} + \frac{x-\lambda_2}{(\lambda_3-1)^2(\lambda_3 - \lambda_2)} = 24x + 8$$ This leads to $$\begin{align}\lim_{n\to\infty} p_n(M) = (24M + 8I_3)(M - I_3) &= \begin{bmatrix}20 & 12 & 0\\ 12 & 8 & 12\\ 0 & 0 & 32\end{bmatrix} \times \frac12 \begin{bmatrix}-1 & 1 & 0\\ 1 & -2 & 1\\ 0 & 0 & 0\end{bmatrix}\\ &= \begin{bmatrix}-4 & -2 & 6\\ -2 & -2 & 4\\ 0 & 0 & 0\end{bmatrix} \end{align} $$ Reproducing what you get numerically.
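The closed form $(24M + 8I_3)(M - I_3)$ is a one-line numpy check:

```python
import numpy as np

M = np.array([[0.5, 0.5, 0.0],
              [0.5, 0.0, 0.5],
              [0.0, 0.0, 1.0]])
I = np.eye(3)

limit = (24 * M + 8 * I) @ (M - I)
print(limit)  # [[-4, -2, 6], [-2, -2, 4], [0, 0, 0]]
```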

Alternatively, using $(*2)$, we have $$M = \sum_{i=1}^3 \lambda_i \Delta_i \quad\implies\quad M - I + \Delta_1 = \Delta_1 + (\lambda_2 - 1)\Delta_2 + (\lambda_3 - 1)\Delta_3$$ Combining with $(*1)$, we have

$$(M - I + \Delta_1)\left( \Delta_1 + \frac{\Delta_2}{\lambda_2 - 1} + \frac{\Delta_3}{\lambda_3 - 1}\right) = \Delta_1 + \Delta_2 + \Delta_3 = I_3$$

From this, we get an alternate way to evaluate the limit $$\begin{align} \lim_{n\to\infty} p_n(M) &= \frac{\Delta_2}{\lambda_2 - 1} + \frac{\Delta_3}{\lambda_3 - 1} = (M - I + \Delta_1)^{-1} - \Delta_1\\ &= \begin{bmatrix}-\frac12 & \frac12 & 1\\ \frac12 & -1 & \frac32\\ 0 & 0 & 1\end{bmatrix}^{-1} - \begin{bmatrix}0 & 0 & 1\\ 0 & 0 & 1\\ 0 & 0 & 1\end{bmatrix} = \begin{bmatrix}-4 & -2 & 6\\ -2 & -2 & 4\\ 0 & 0 & 0\end{bmatrix} \end{align} $$
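And a numpy check of this alternate route, using the fact that for this chain $\Delta_1$ is exactly the limit matrix $L$ (the projector onto the eigenvalue-$1$ eigenspace):

```python
import numpy as np

M = np.array([[0.5, 0.5, 0.0],
              [0.5, 0.0, 0.5],
              [0.0, 0.0, 1.0]])
I = np.eye(3)
L = np.array([[0.0, 0.0, 1.0]] * 3)  # Delta_1 for this chain

# (M - I + Delta_1)^{-1} - Delta_1 recovers the limit directly.
limit = np.linalg.inv(M - I + L) - L
print(np.round(limit, 6))
```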