Markov Chain Transition Matrix eigenvalues - Textbook Exercise

75 Views Asked by At

I am trying to solve the following problem taken the book Markov Chains and Mixing Times 2nd edition (Exercise 12.2):

Let $P$ be irreducible transition matrix, and suppose that A is a matrix with $0 \le A(i,j) \le P(i,j)$ and $A \ne P$. Show that any eigenvalue $\lambda$ of $A$ satisfies $|\lambda| < 1$.

Now I know that $P$'s biggest eigenvalue is 1. I have tried several approaches but I am not sure how to use irreducibility in this context. Maybe if the chain has some period it is possible to look on a chain with transition matrix raised to the power of the period to get aperiodic chain?
Help will be appreciated.

Link to the book: https://pages.uoregon.edu/dlevin/MARKOV/markovmixing.pdf

2

There are 2 best solutions below

0
On BEST ANSWER

use the average instead
$S_A =\frac{1}{n}\sum_{k=0}^{n-1} A^k \leq \frac{1}{n}\sum_{k=0}^{n-1} P^k = S_P$
where the inequality is strict in at least one component. Note that $S_P\mathbf 1 = \mathbf 1$ and $S_A$ has a (Perron) eigenvalue $\geq 1$ iff $A$ does. Since $P$ is $n\times n$ and irreducible, $S_P$ is a positive matrix.

proof 0 (w/ analysis)
Construct the monotone sequence of matrices $S_k := \frac{1}{k}\cdot S_p + \frac{k-1}{k}\cdot S_A$ for $k \in \mathbb N$
$\implies S_p = S_1 \geq S_2 \geq S_3\geq \dots$
where the inequality holds point-wise and is strict in at least one component for each $k$ and all $S_k$ are positive matrices. Perron Theory then tells us that the Perron roots form a strictly monotone decreasing sequence
$1=\lambda_1^{(S_1)} \gt \lambda_1^{(S_2)}\gt \lambda_1^{(S_3)} \gt \dots$
$\implies 1 \gt \lambda_1^{(S_A)}$
e.g. by continuity of coefficients of the characteristic polynomial and Hurwitz from complex analysis. Alternatively there is a real analysis argument for taking the limit of the sequence in Meyer's "Continuity of the Perron Root" that gives the result as well (ref http://carlmeyer.com/pdfFiles/ContinuityOfThePerronRoot.pdf or http://arxiv.org/abs/1407.7564 )

proof 1 (no analysis)
$A$ decomposes into $r$ irreducibles $\implies S_A$ decomposes into a collection of $r$ irreducibles (i.e. it is permutation similar to a block lower triangular matrix).

Each irreducible of $S_A$ is either nilpotent [hence eigenvalues $=0$ class] or a positive matrix [each element in the irreducible of $A$ communicates with all others]. In the latter case, we are dealing with positive matrices whose row sums are $\leq 1$ and this is strict in at least one row [i.e. if $r\geq 2$ the irreducible is at most $n-1\times n-1$ and its row sum is bounded above by the sum across rows the same submatrix of $S_P$ which is $\lt 1$ since $S_P$ is a positive stochastic matrix; if $r=1$ then $S_A \leq S_P$ where the inequality is strict in at least one component]. Each irreducible [submatrix] $M$ of $S_A$ has a single Perron root $\lambda$ and $M \mathbf x = \lambda \mathbf x$ for some positive vector $\mathbf x$ but each element of $M\mathbf x$ is a 'complete' (sub)convex combination (i.e. all weights positive summing to at most $1$) of elements of $\mathbf x$ hence $\lambda \gt 1$ is impossible and if $\lambda =1$ then $M\mathbf x = \mathbf x \implies \mathbf x \propto \mathbf 1$ since $\mathbf x$ cannot have a maximum [i.e. if $\max_i x_i \gt \min_k x_k \gt 0$ then that max cannot be written as a complete (sub)convex combination of elements in $\mathbf x$]. But since at least one row of $M$ sums to $\lt 1$ then we know $M\mathbf 1 \neq \mathbf 1\implies \lambda \lt 1$. This is a maximum principle type finish; another way to finish would be to stick $M$ onto the leading principle submatrix of a matrix of zeros that has one extra column and row -- then turn that matrix into an absorbing state chain by incrementing the zeros in the relevant column so it is a stochastic matrix with a single absorbing state and a transient chain given by $M$.

proof 2 (no analysis and no consideration of reducibility of $A$)
First symmetrize the Perron vector for $S_P$ using diagonal matrices, i.e. $S_P^T$ has a (unique up to re-scaling) Perron vector $\mathbf w$ and define $\mathbf v_1:= \mathbf w^\frac{1}{2}\cdot \frac{1}{\big \Vert \mathbf w^\frac{1}{2}\big \Vert_2}$
(where the square root is understood to be taken component-wise)

$\Gamma:= \text{diag}\big(\mathbf v_1\big)$
$C_A:= \Gamma S_A \Gamma^{-1}$
$C_P:= \Gamma S_p \Gamma^{-1}$
checking that $C_p\mathbf v_1 = \Gamma S_p\mathbf 1 = \Gamma\mathbf 1=\mathbf v_1$ and $\mathbf v_1^T C_p =\mathbf w^TS_p\Gamma^{-1} = \mathbf w^T\Gamma^{-1}=\mathbf v_1^T$

left (and right) multiplication by a positive diagonal matrix preserves point-wise inequalities between non-negative matrices $\implies C_A \leq C_P$
with the inequality strict in at least one component, and $C_P$ is a positive matrix with a single dominant singular value $\sigma_1 = 1=\lambda_1$ the Perron root, with Perron vector $\mathbf v_1$ --to check: apply Perron theory to $\big(C_P^T C_P\big)\mathbf v_1=C_P^T\mathbf v_1=\mathbf v_1$. Finally

$\lambda_{\max \text{modulus}}\big(C_A\big) \leq \Big \Vert C_A\Big \Vert_2 = \max_{\Vert\mathbf x\Vert_2=1} \mathbf x^TC_A^TC_A\mathbf x\leq \max_{\Vert\mathbf x\Vert_2=1} \mathbf x^TC_P^TC_P\mathbf x\leq \Big \Vert C_P\Big \Vert_2 =1 $
observing that by Perron Frobenius Theory the maximal eigenvalue for $\big(C_A^TC_A\big)$ is real non-negative and has a real non-negative associated eigenvector $\mathbf x$, but non-negativity of $\mathbf x$ implies the last inequality is met with equality iff $\mathbf x=\mathbf v_1$ i.e. iff $\mathbf x$ is a positive vector but that implies the prior inequality is strict, i.e. it would imply $\lambda_{\max \text{modulus}} \leq \max_{\Vert\mathbf x\Vert_2=1} \mathbf x^TC_A^TC_A\mathbf x=\mathbf v_1^TC_A^TC_A\mathbf v_1\lt \mathbf v_1^TC_P^TC_P\mathbf v_1 =1$. Thus the spectral radius of $S_A$ is $\lt 1$.

2
On

I think I figured the solution:
Assume $P(i,k) > A(i,k)$ for some $i,k \in \mathcal{X}$ where $\mathcal{X}$ is the states space so $P$ and $A$ are of size $|\mathcal{X}| \times |\mathcal{X}|$. Also assume by contradiction $Av = \lambda v$ for some $|\lambda| \ge 1$.
Then we have:
$\begin{align*}\\ |Av(i)| = |\sum_{j \in \mathcal{X}} A(i,j)v(j)| \le \sum_{j \in \mathcal{X}} A(i,j)|v(j)| \le \sum_{j \in \mathcal{X}} A(i,j)\max\limits_{l \in \mathcal{X}} |v(l)|\\ < \max\limits_{l \in \mathcal{X}} |v(l)| \end{align*}$
In addition:
 
$|Av(i)| = |\lambda| |v(i)|$

Hence: $|v(i)| < \max\limits_{l \in \mathcal{X}} |v(l)|$
Meaning not all the vector elements are the same (in absolute value). Now assume P is aperiodic (else use @user8675309 proposition) then there exist time t such that all elements in $P^t$ are strictly positive and the original element - wise inequality still holds.
Denote $A' = A^t$, $P' = P^t$ and let $s \in \mathcal{X}$

$\begin{align*}\\ |A'v(s)| = |\sum_{j \in \mathcal{X}} A'(s,j)v(j)| \le \sum_{j \in \mathcal{X}} A'(s,j)|v(j)| \le \sum_{j \in \mathcal{X}} P'(s,j) |v(j)| < \max\limits_{l \in \mathcal{X}} |v(l)| \end{align*}$

The last inequality is due to the facts that $P^t$'s elements are strictly positive and as we shown earlier there is an element which is smaller in absolute value than the max. The previous is true for all states and in particular for state $s'$ achieving the max in absolute value:
$|A'v(s')| = |\lambda ^t| |v(s')| < |v(s')|$

The previous is obviously a contradiction and we conclude our proof.