Using a pseudo-inverse for the solution/least-squares of $X$ in $AXA^T=B$


For a matrix $A \in \mathbb{R}^{m\times n}$ and its pseudo-inverse $A^+$, the vector that minimizes the squared error of

$$Ax = b$$

is $z = A^+b$, and it's the unique least-squares solution if $A$ has full column-rank.
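As a quick numerical sanity check of this standard fact (a sketch using NumPy with random data; the dimensions are arbitrary), the pseudo-inverse solution should agree with a direct least-squares solver:

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((6, 3))  # m > n; full column rank almost surely
b = rng.standard_normal(6)

# z = A^+ b, the least-squares solution via the pseudo-inverse.
z = np.linalg.pinv(A) @ b

# Compare against NumPy's direct least-squares solver.
z_lstsq, *_ = np.linalg.lstsq(A, b, rcond=None)
assert np.allclose(z, z_lstsq)
```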

Can we make an equivalent statement for finding the least squares solution of

$$A X A^T = B$$

where $X \in \mathbb{R}^{n\times n}$ and $B\in \mathbb{R}^{m\times m}$? So can we say that $Z = A^+ B A^{+T}$ is the least squares solution to this equation? Or if not least-squares is it the solution by some other metric?

One way to find out is simply to differentiate the sum of squared errors and check whether the minimizer equals $A^+ B A^{+T}$, but that is tedious. I'm trying it, but is there a simpler way to think about this problem and show the above (if it's true) using properties of pseudo-inverses? I'm having a hard time relating the linear-system case to this one (it looks like a quadratic system of equations, but you're solving for the coefficients rather than the roots).

Also, we can assume that $A$ is full column rank and $m > n$.


Part two: if the above holds, is there also a similar solution in terms of pseudo-inverses for $X$ in $AXB = C$, given $A \in \mathbb{R}^{m\times n}$, $X \in \mathbb{R}^{n \times n}$, and $B \in \mathbb{R}^{n \times p}$?


Accepted answer:

Theorem: Least Squares in arbitrary Hilbert Spaces

Consider Hilbert spaces $\mathcal U, \mathcal V$, a bounded linear operator $L\colon \mathcal U \to \mathcal V$ and an element $b\in \mathcal R(L)+\mathcal R(L)^\perp$. Then the least squares problem

$$\min_x \|L(x)-b\|^2$$

has a unique minimal-norm solution given by the pseudoinverse, $x^* = L^+(b)$.

For details, caveats, and infinite dimensional weirdness, check out these slides.


Lemma: $(A\otimes B)^+ = (A^+\otimes B^+)$

Proof: We check the four Moore-Penrose conditions. Each follows directly from the mixed-product property $(A\otimes B)(C\otimes D) = AC\otimes BD$; for example:

$$\begin{aligned} (A\otimes B)(A^+\otimes B^+)(A\otimes B) = (AA^+A\otimes BB^+B) = (A\otimes B) \end{aligned}$$
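The lemma can also be verified numerically (a sketch using NumPy with random matrices of arbitrary shapes):

```python
import numpy as np

rng = np.random.default_rng(1)
A = rng.standard_normal((4, 3))
B = rng.standard_normal((5, 2))

# Pseudo-inverse of the Kronecker product...
lhs = np.linalg.pinv(np.kron(A, B))
# ...versus the Kronecker product of the pseudo-inverses.
rhs = np.kron(np.linalg.pinv(A), np.linalg.pinv(B))
assert np.allclose(lhs, rhs)
```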


With these two tools the conclusion follows immediately, once we realize that $X \mapsto AXA^T$ is a linear operator. Vectorizing (column-major), $\operatorname{vec}(AXA^T) = (A\otimes A)\operatorname{vec}(X)$, so

$$\min_X \|AXA^T - B\|_F^2 = \min_{\operatorname{vec}(X)} \|(A\otimes A)\operatorname{vec}(X) - \operatorname{vec}(B)\|_2^2,$$

whose minimal-norm solution is

$$\operatorname{vec}(X^*) = (A\otimes A)^+\operatorname{vec}(B) = (A^+\otimes A^+)\operatorname{vec}(B), \qquad \text{i.e.}\quad X^* = A^+ B\,(A^+)^T.$$
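The Kronecker-product route can be checked numerically against the closed form $A^+ B (A^+)^T$ (a sketch using NumPy; note that $\operatorname{vec}$ is column-major, i.e. Fortran order):

```python
import numpy as np

rng = np.random.default_rng(2)
m, n = 5, 3
A = rng.standard_normal((m, n))
B = rng.standard_normal((m, m))

# Solve min ||(A ⊗ A) vec(X) - vec(B)|| via the Kronecker pseudo-inverse,
# using vec(A X A^T) = (A ⊗ A) vec(X) with column-major vec.
vecX = np.linalg.pinv(np.kron(A, A)) @ B.flatten(order="F")
X_kron = vecX.reshape((n, n), order="F")

# Closed-form candidate: X = A^+ B (A^+)^T.
Ap = np.linalg.pinv(A)
X_direct = Ap @ B @ Ap.T
assert np.allclose(X_kron, X_direct)
```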

Second answer:

Well, I am answering my own question. The answer is that $Z = A^{+} B A^{+T}$ is indeed the least-squares solution for $X$ in $A X A^T = B$, at least where least-squares means minimizing the Frobenius norm of $A X A^T - B$ (i.e., element-wise least squares). The steps are outlined below.

$$\left\| A X A^T - B \right\|_F^2 = \sum_{i=1}^m \sum_{j=1}^m \left( A_{ik} X_{kl} A_{jl} - B_{ij} \right)^2$$

This is what we want to minimize, using the summation convention on the repeated indices $k$ and $l$. Its gradient is:

$$\frac{\partial}{\partial X_{ab}} \left\| A X A^T - B \right\|_F^2 = \sum_{i=1}^m \sum_{j=1}^m (2 A_{ia} A_{jb} A_{ik} A_{jl} X_{kl} - 2 B_{ij} A_{ia} A_{jb})$$

The term $\sum_{i=1}^m \sum_{j=1}^m B_{ij} A_{ia} A_{jb}$ is the $(a, b)$ element of $A^TBA$. The other term can be written as:

$$\sum_{i=1}^m \sum_{j=1}^m A_{ia} A_{jb} A_{ik} A_{jl} X_{kl} = \left[A^TAXA^TA\right]_{ab}$$

At the minimum point, all the partial derivatives are $0$, so

$$A^TA X A^T A - A^T BA = \mathbf{0}$$

When $A$ is full column rank, $A^TA$ is invertible, and we can solve for $X$:

$$\boxed{X = (A^TA)^{-1} A^T BA (A^T A)^{-1}}$$

which is equal to $A^+ B A^{+T}$ for full column rank $A$, since in that case $A^+ = (A^TA)^{-1}A^T$. It remains to determine what happens when $A$ isn't full column rank.
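The boxed formula and its pseudo-inverse form can be compared numerically (a sketch using NumPy with a random full-column-rank $A$):

```python
import numpy as np

rng = np.random.default_rng(3)
m, n = 6, 3
A = rng.standard_normal((m, n))  # full column rank almost surely
B = rng.standard_normal((m, m))

# Normal-equations form: X = (A^T A)^{-1} A^T B A (A^T A)^{-1}.
AtA_inv = np.linalg.inv(A.T @ A)
X_normal = AtA_inv @ A.T @ B @ A @ AtA_inv

# Pseudo-inverse form: X = A^+ B A^{+T}.
Ap = np.linalg.pinv(A)
X_pinv = Ap @ B @ Ap.T
assert np.allclose(X_normal, X_pinv)
```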