Fastest mixing Markov chain on a graph — derivation of optimal conditions

105 Views Asked by At

I am currently reading a classic paper, Fastest Mixing Markov Chain on a Graph, by Boyd & Diaconis & Xiao. I have problem understanding how the optimal conditions can be derived in Section 4.2.

The paper first formulates the original problem into a semidefinite progamming problem. Note $A \preceq B$ means $B-A$ is positive semidefinite.

\begin{equation} \begin{aligned} \min \quad & s\\ \textrm{s.t.} \quad & -SI \preceq P-\frac{1}{n}\textbf{1}\textbf{1}^{T} \preceq SI \\ & P\geq 0, \hspace{0.1cm} P\textbf{1}=\textbf{1},\hspace{0.1cm} P=P^{T} \\ & P_{i,j}=0 , \hspace{0.1cm} (i,j) \not \in E \end{aligned} \end{equation}

Then it derives its dual problem: \begin{equation} \begin{aligned} \max \quad & 1^{T}z\\ \textrm{s.t.} \quad & Y\textbf{1}=0,\sum_{i=1}^{n} |\lambda_i(Y)| \leq 1\\ & \frac{z_i+z_j}{2} \leq Y_{i,j} \end{aligned} \end{equation} After proving strong duality, in section 4.2, the paper gives the optimal conditions (KKT) but omits the derivation. I understand the part of primal feasibility, dual feasibility, and the first equation in complementary slackness must be true, but I don't understand how the other equations in the complementary slackness section can be derived: $$Y^{*}=Y_{+}^{*}-Y_{-}^{*}, \hspace{0.1cm} 0 \preceq Y_{+}^{*} , \hspace{0.1cm} 0 \preceq Y_{-}^{*}$$ $$P^{*}Y_{+}^{*}=\mu(P^{*})Y_{+}^{*}, \hspace{0.1cm} P^{*}Y_{-}^{*}=-\mu(P^{*})Y_{-}^{*}, \hspace{0.1cm} Tr(Y_{+}^{*})+Tr(Y_{-}^{*})=1$$

Here $\mu(P^{*})$ refers to the second largest eigenvalue modulus of $P^{*}$, the optimal transition matrix. $Y_{+}^{*}$ and $Y_{-}^{*}$ are the positive semidefinite and negative semidefinite parts of $Y^{*}$, which I don't really understand. The paper says the derivation is similar to the ideas contained in this paper: https://cs.nyu.edu/media/publications/TR1991-566.pdf, but I'm not sure which part of the paper to focus on, and there doesn't seem to be a discussion regarding the positive/negative semidefinite part of a matrix. Please let me know if you know how the optimal conditions might be derived. Thank you!