Solving a matrix equation


My problem is maybe extremely simple. I have the following equation, which I have to solve with respect to $B$:

$\sum_{t=2}^T (X_t - B X_{t-1} B^\top) = 0$,

where $X_t$ and $X_{t-1}$ are square and symmetric $n \times n$ matrices, while $B$ is an $n \times n$ square matrix, $B^\top$ is its transpose, and the $T$ in the summation sign is the number of temporal observations.

I get that I can write it as:

$\sum_{t=2}^T X_t = \sum_{t=2}^T B X_{t-1} B^\top$

but how can I solve it for $B$? I thought that using the trace operator would help, but then I would need to invert the trace back to a matrix, which is more complicated.

Accepted answer:

We have the equation
$$ \sum_{t=2}^T X_t = B \left(\sum_{t=1}^{T-1} X_t\right) B^\top. $$
Since the $X_t$ are all known, let's replace the sums with matrices $Y$ and $Z$:
$$ Y = B Z B^\top. $$
Both $Y$ and $Z$ are symmetric and, assuming the $X_t$ are positive definite (as for covariance matrices), positive definite as well, so we can represent them by Cholesky decompositions
$$ Y = LL^\top, \qquad Z = MM^\top. $$
The equation becomes
$$ LL^\top = BM M^\top B^\top = BM (BM)^\top. $$
We may insert $QQ^\top$, where $Q$ is an arbitrary orthogonal matrix, between $L$ and $L^\top$:
$$ LQQ^\top L^\top = BM (BM)^\top. $$
Every $B$ that satisfies
$$ LQ = BM \implies B = LQM^{-1} $$
is a solution. Let's prove that there are no other solutions. Multiplying both sides of $LL^\top = BMM^\top B^\top$ by $L^{-1}$ on the left and $L^{-\top}$ on the right gives
$$ I = L^{-1} B M M^\top B^\top L^{-\top} = (L^{-1} B M) (L^{-1} B M)^\top, $$
which means that $L^{-1} B M$ is an orthogonal matrix:
$$ Q = L^{-1} B M \implies B = LQM^{-1}. $$
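A quick numerical sketch of this construction, using NumPy (the SPD matrices `Y` and `Z` below are random stand-ins for the sums $\sum_{t=2}^T X_t$ and $\sum_{t=1}^{T-1} X_t$, not data from the question):

```python
import numpy as np

rng = np.random.default_rng(0)
n = 4

# Random symmetric positive definite Y and Z (illustrative stand-ins
# for the two sums of the X_t matrices).
A = rng.standard_normal((n, n))
Y = A @ A.T + n * np.eye(n)
A = rng.standard_normal((n, n))
Z = A @ A.T + n * np.eye(n)

# Cholesky factors: Y = L L^T, Z = M M^T
L = np.linalg.cholesky(Y)
M = np.linalg.cholesky(Z)

# Any orthogonal Q yields a solution B = L Q M^{-1}; take Q = I here.
Q = np.eye(n)
B = L @ Q @ np.linalg.inv(M)

print(np.allclose(B @ Z @ B.T, Y))  # True: B Z B^T = Y
```

Replacing `Q` with any other orthogonal matrix (e.g. a random rotation) gives a different `B` that passes the same check, which is exactly the non-uniqueness the answer describes.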

Updated. Since the solution is not unique, we have to select one based on some additional assumption. Let's find the one that makes $LQ$ as close to $M$ as possible:
$$ \text{minimize } \|LQ - M\|_F^2 \quad \text{subject to } Q^\top Q = I. $$
This is a variant of the orthogonal Procrustes problem:
$$ \begin{aligned} Q &= \arg \min \|LQ - M\|_F^2 \\ &= \arg \min \|LQ\|_F^2 + \|M\|_F^2 - 2 \langle LQ, M \rangle_F \\ &= \arg \min \|L\|_F^2 + \|M\|_F^2 - 2 \langle LQ, M \rangle_F \\ &= \arg \max \langle LQ, M \rangle_F \\ &= \arg \max \langle Q, L^\top M \rangle_F \\ &= \arg \max \langle Q, U\Sigma V^\top \rangle_F \\ &= \arg \max \langle U^\top Q V, \Sigma \rangle_F, \end{aligned} $$
where $U \Sigma V^\top$ is the SVD of the product $L^\top M$, and $\|LQ\|_F = \|L\|_F$ because right-multiplication by an orthogonal matrix preserves the Frobenius norm. The matrix $U^\top Q V$ is orthogonal and the maximum is attained when $U^\top Q V = I$, so $Q = UV^\top$ is the solution.
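The Procrustes step can be sketched numerically as well; as before, `Y` and `Z` are illustrative random SPD matrices, not the question's data:

```python
import numpy as np

rng = np.random.default_rng(1)
n = 4

# Illustrative SPD matrices Y and Z, as in the first sketch.
A = rng.standard_normal((n, n))
Y = A @ A.T + n * np.eye(n)
A = rng.standard_normal((n, n))
Z = A @ A.T + n * np.eye(n)

L = np.linalg.cholesky(Y)  # Y = L L^T
M = np.linalg.cholesky(Z)  # Z = M M^T

# Orthogonal Procrustes pick: Q = U V^T from the SVD L^T M = U S V^T
U, S, Vt = np.linalg.svd(L.T @ M)
Q = U @ Vt
B = L @ Q @ np.linalg.inv(M)

print(np.allclose(Q @ Q.T, np.eye(n)))  # Q is orthogonal
print(np.allclose(B @ Z @ B.T, Y))      # B still solves B Z B^T = Y
```

Among all orthogonal `Q`, this choice minimizes `np.linalg.norm(L @ Q - M)`, which is the Frobenius-norm criterion stated above.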