Expected Matrix Product

166 Views Asked by At

This question has arisen from a question about a Markov chain where the transition matrix $T$ is random but structured. The idea being that the transition is composed of two parts, transition $A$ and $B$ which occur in known quantities $n$ and $m$ respectively but unknown order. This means that $T$ is the product of $n$ matrices $A$ and $m$ matrices $B$ in some ordering.

For example when $m=1$ and $n=2$

$$T \in \{ABB, BAB, BBA\}$$

The question is, when each permutation $\begin{pmatrix} m+n \\ m\end{pmatrix}$ of these $n+m$ matrices is equally likely what is the expected transition matrix $T$?

Interestingly this is like considering necklaces with fixed content where the colors are matrices. I say this because in the sense that I'm interested in the behavior of $T$ as it is applied to itself, if this problem could be solved invariant to rotations of the ordering of $A,B$ that would also be helpful.

My thinking:

My initial hope was that $$E[T] = \left(\frac{n}{n+m}A +\frac{m}{n+m} B\right)^{n+m}$$

But that does not seem to be the case by numerical experiments for small $n,m$ where the expected value can be calculated directed by summing over every permutation.

My second hope was to be able to "factor out" the matrices $A,B$ in the expected value expression. By letting $D = diag(rep(A,n),rep(B,m))$ the (n+m)*size(A) block-diagonal matrix we can consider constructing one order from some permutation $\Pi_1$.

\begin{align*} T(\Pi_1) &= \prod_i^{n+m} (e_i^T \otimes I) (\Pi_1\otimes I) D (\Pi_1\otimes I)^T (e_i \otimes I)\\ &= \prod_i^{n+m} (e_i^T \Pi_1\otimes I)D (\Pi_1^T e_i \otimes I) \end{align*}

each term of this product permutes the diagonal blocks of $D$ is the same way, then $(e_i^T \otimes I) \cdots (e_i \otimes I) $ selects off the $i$th block. This way each of the $n+m$ blocks of $A,B$ occur in some order determined by $\Pi_1$.

My hope from this formulation was that in this form each $\Pi$ would factor out in the expected value of $T$: $$E[T] = \frac{1}{(\begin{pmatrix} m+n \\ m\end{pmatrix}} \sum_{\Pi} T(\Pi)$$ by using vectorizations but it doesn't seem to work. Notably for each term in the product of $T(\Pi)$ we have

$$vec\left((e_i^T \Pi_1\otimes I)D (\Pi_1^T e_i \otimes I)\right) = (\Pi_1 e_i^T \otimes I)\otimes(e_i^T \Pi_1\otimes I) vec(D) $$

The problem here is that despite the data matrix $D$ factoring out nicely from one term it does not help us simplify the product as far as I can tell.

Thank you for any of your thoughts!

1

There are 1 best solutions below

0
On BEST ANSWER

Okay, I'm compiling information from @user125932.

WLOG let $m>n$, then let $P(x) = (A+xB)^{n+m}$. Note that the sum of the $\begin{pmatrix} m+n \\ m\end{pmatrix}$ terms of $P$ which are degree $m$ give $$ \sum_{\Pi} T(\Pi)$$.

Then for this polynomial $P(x) = \sum_k c_k x^k$ we are interested in $c_m$ as $E[T] = c_m/ \begin{pmatrix} m+n \\ m\end{pmatrix}$.

Theorem 1 here (or see here) describes one method of obtaining this coefficient through the Root of Unity Filter. For $\omega = e^{2\pi i /m}$ we have

$$c_0 + c_m =\frac{1}{m} \sum_{k=0}^{\infty} c_{k\cdot m} = \frac{1}{m} \sum_{k=0}^{m-1} P(\omega^k)$$

where further terms do not exist because $2m > m+n = deg(P)$, (this is by assumption $m>n$). Thus

$$c_m = \frac{1}{m} \sum_{k=0}^{m-1} P(\omega^k) - c_0 = \frac{1}{m} \sum_{k=0}^{m-1} P(\omega^k) - A^{m+n} $$

Then finally

$E[T] = \frac{1}{\begin{pmatrix} m+n \\ m\end{pmatrix}} \left(\left(\frac{1}{m}\sum_{k=0}^{m-1} P(\omega^k) \right)- A^{m+n} \right)$