Solve a linear equation $XA=B$ using a matrix with zero diagonal


Consider the following problem: given $A,B\in\mathbb{R}^{N\times P}$ for $N>P$, we wish to find $X\in\mathbb{R}^{N\times N}$ such that:
$$XA=B$$

Without further constraints, a solution is given by the Moore-Penrose pseudoinverse (it is the minimum-Frobenius-norm solution): $$X=BA^+=B(A'A)^{-1}A'$$

but I'm interested in a solution $X$ with zero diagonal, i.e., $X_{ii}=0\ \forall i\in[N]$. This is still a linear equation so it can be solved numerically, but is there a way to write the solution $X$ "in closed form"?
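For concreteness, here is a quick NumPy check of the unconstrained pseudoinverse solution (a sketch with arbitrary dimensions; the variable names are mine):

```python
import numpy as np

rng = np.random.default_rng(0)
N, P = 5, 2
A = rng.standard_normal((N, P))   # full column rank, almost surely
B = rng.standard_normal((N, P))

# Unconstrained solution X = B A^+ = B (A'A)^{-1} A'
X = B @ np.linalg.pinv(A)

assert np.allclose(X @ A, B)      # holds since A^+ A = I for full-column-rank A
print(np.diag(X))                 # the diagonal is generally nonzero, hence the question
```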

EDITED: fixed the wrong ("transposed") definition mentioned in some of the answers below


3 Answers


OK, there's a nice answer, which stems from a Lagrangian perspective on the Moore-Penrose pseudoinverse solution of the unconstrained case.

The unconstrained case can be written as the following optimization problem: $${\cal L}=\frac{1}{2}\|X\|^2_F-\langle Y,XA-B\rangle$$ where $Y\in\mathbb{R}^{N\times P}$ is a matrix of Lagrange multipliers and $\langle\cdot,\cdot\rangle$ denotes the element-wise (Frobenius) inner product, $\langle U,V\rangle=\sum_{ij}U_{ij}V_{ij}$. Taking the derivative $\partial{\cal L}/\partial X=X-YA'$ and equating it to $0$ gives $X=YA'$, so that from $XA=B$ we have $Y=B(A'A)^{-1}$ and $X=BA^+$.

Moving over to the diagonal-constrained case, the problem can be written as: $${\cal L}=\frac{1}{2}\|X\|^2_F-\langle Y,XA-B\rangle-y\cdot\operatorname{diag}(X)$$ where $Y\in\mathbb{R}^{N\times P}$ and $y\in\mathbb{R}^{N}$ are Lagrange multipliers. Taking the derivative $\partial{\cal L}/\partial X=X-YA'-\Delta(y)$ and equating it to $0$ gives $X=YA'+\Delta(y)$, where $\Delta(y)$ is the diagonal matrix with $y$ on its diagonal. From $XA=B$ we then have $Y=(B-\Delta(y)A)(A'A)^{-1}$ and finally: $$X=BA^++\Delta(y)(I-AA^+)$$

Now, imposing the zero-diagonal constraint $\operatorname{diag}(X)=0$ yields the self-consistent solution for $y$: $$y=-\operatorname{diag}(BA^+)\,/\,\operatorname{diag}(I-AA^+)$$ where the division is element-wise (well defined whenever no diagonal entry of $I-AA^+$ vanishes, i.e., no $e_i$ lies in the column space of $A$).
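As a sanity check, this closed form is easy to verify numerically (a NumPy sketch; names are mine, and it assumes no diagonal entry of $I-AA^+$ vanishes, which holds for generic data):

```python
import numpy as np

rng = np.random.default_rng(0)
N, P = 6, 3
A = rng.standard_normal((N, P))
B = rng.standard_normal((N, P))

Ap = np.linalg.pinv(A)             # A^+ = (A'A)^{-1} A'
BAp = B @ Ap                       # unconstrained solution B A^+
M = np.eye(N) - A @ Ap             # I - A A^+
y = -np.diag(BAp) / np.diag(M)     # element-wise division; assumes diag(M) != 0
X = BAp + np.diag(y) @ M           # X = B A^+ + Delta(y)(I - A A^+)

assert np.allclose(X @ A, B)       # equation X A = B still holds (M A = 0)
assert np.allclose(np.diag(X), 0)  # zero diagonal, by the choice of y
```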

---

$ \def\LR#1{\left(#1\right)} \def\op#1{\operatorname{#1}} \def\diag#1{\op{diag}\LR{#1}} \def\Diag#1{\op{Diag}\LR{#1}} \def\trace#1{\op{Tr}\LR{#1}} \def\frob#1{\left\| #1 \right\|_F} \def\qiq{\quad\implies\quad} \def\p{\partial} \def\grad#1#2{\frac{\p #1}{\p #2}} \def\c#1{\color{red}{#1}} \def\CLR#1{\c{\LR{#1}}} \def\fracLR#1#2{\LR{\frac{#1}{#2}}} \def\gradLR#1#2{\LR{\grad{#1}{#2}}} $Let $\{e_k\}$ denote the standard basis vectors, replace $\LR{A,B,X}$ by their transposes, and define a $zero$-$one$ matrix $(F)$ with zeros on the diagonal and ones elsewhere.

It will also be convenient to introduce an unconstrained matrix $(U)$ and define the following variables $$\eqalign{ X &= F\odot U &\qiq dX = F\odot dU \\ H &= A^TB,&\quad\quad a_k = Ae_k \\ b_k &= Be_k,&\quad\quad x_k = Xe_k \\ f_k &= Fe_k,&\quad\quad h_k = He_k = A^Tb_k \\ F_k &= \Diag{f_k} &\qiq F_kv = f_k\odot v \\ }$$ and the Frobenius product $(\,:\,)$ with the following properties $$\eqalign{ A:B &= \sum_i\sum_j A_{ij}B_{ij} \;=\; \trace{A^TB} \\ A:A &= \frob{A}^2 \qquad \{ {\rm Frobenius\;norm} \}\\ A:B &= B:A \;=\; B^T:A^T \\ A:\LR{BC} &= \LR{AC^T}:B \;=\; \LR{B^TA}:C \\ A:\LR{B\odot C} &= \LR{A\odot B}:C \;=\; \sum_i\sum_j A_{ij}B_{ij}C_{ij} \\ }$$ Write the equation as an unconstrained least-squares problem $$\eqalign{ \phi &= \tfrac12\frob{AX-B}^2 \\ &= \tfrac12\LR{AX-B}:\LR{AX-B} \\ d\phi &= \LR{AX-B}:\LR{A\;dX} \\ &= \LR{A^TAX-A^TB}:\LR{F\odot dU} \\ &= F\odot\LR{A^TAX-H}:dU \\ \grad{\phi}{U} &= F\odot\LR{A^TAX-H} \\ }$$ Solve the zero-gradient condition by utilizing the unique property of the $e_k$ vectors, which distribute across Hadamard products. Note also that $x_k = f_k\odot u_k = F_kx_k$, since the $k$-th column of $X=F\odot U$ has a zero in position $k$. $$\eqalign{ F\odot\LR{A^TAX} &= F\odot H \\ \LR{Fe_k}\odot\LR{A^TAXe_k} &= \LR{Fe_k}\odot\LR{He_k} \\ F_k A^TA F_k x_k &= F_k h_k \\ x_k &= \LR{F_kA^TAF_k}^{\boldsymbol+}F_k h_k \\ &= \LR{F_kA^TAF_k}^{\boldsymbol+}F_k A^Tb_k \\ }$$ Thus you can solve for the optimal $X$ matrix one column at a time. The pseudoinverse is used to solve the columnar equations, since the matrix in parentheses is singular; its minimum-norm solution lies in the range of $F_k$, which is exactly what forces the $k$-th entry of $x_k$ to be zero.

You can rework the whole problem in terms of the original variables (i.e., the transposes of the ones used here), but the result has transpose operations everywhere and looks much uglier.
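A numerical sketch of this column-by-column solve (NumPy; variable names are mine). The mask $F_k$ is applied on both sides of $A^TA$, which keeps the pseudoinverse's minimum-norm solution in the range of $F_k$ and hence puts the required zero in position $k$:

```python
import numpy as np

rng = np.random.default_rng(1)
N, P = 6, 3
# Transposed variables, as in this answer: least-squares solve of A X = B
# with A, B of shape (P, N) and X of shape (N, N) having a zero diagonal.
A = rng.standard_normal((P, N))
B = rng.standard_normal((P, N))

F = np.ones((N, N)) - np.eye(N)   # zero-one mask: zero diagonal, ones elsewhere
G = A.T @ A                       # singular, since rank(G) <= P < N
X = np.zeros((N, N))
for k in range(N):
    Fk = np.diag(F[:, k])         # F_k = Diag(f_k)
    # masking G on both sides keeps the min-norm solution in range(F_k)
    X[:, k] = np.linalg.pinv(Fk @ G @ Fk) @ (Fk @ A.T @ B[:, k])

assert np.allclose(np.diag(X), 0)  # constraint satisfied exactly
assert np.allclose(A @ X, B)       # residual is zero for generic data (N-1 >= P)
```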

---

Denote by $x_{i\ast}$ and $b_{i\ast}$ the $i$-th rows of $X$ and $B$ respectively. Let $\widehat{x}_{i\ast}\in\mathbb R^{1\times(N-1)}$ be the subvector obtained by removing the $i$-th element from $x_{i\ast}$, and let $\widehat{A}_i\in\mathbb R^{(N-1)\times P}$ be the submatrix obtained by removing the $i$-th row from $A$. The problem $$ XA=B,\quad x_{11}=x_{22}=\cdots=x_{NN}=0\tag{1} $$ is then equivalent to the system of linear equations $x_{i\ast}A=b_{i\ast}$ and $x_{ii}=0$ for $i=1,2,\ldots,N$. In turn, this is equivalent to the system of linear equations $$ \begin{cases} \widehat{x}_{i\ast}\widehat{A}_i=b_{i\ast}\quad\text{for $i=1,2,\ldots, N$,}\\ x_{11}=x_{22}=\cdots=x_{NN}=0. \end{cases}\tag{2} $$ Therefore, when it is solvable, one solution (among possibly infinitely many) is given by $$ \begin{cases} \widehat{x}_{i\ast}=b_{i\ast} \widehat{A}_i^+\quad\text{for $i=1,2,\ldots, N$,}\\ x_{11}=x_{22}=\cdots=x_{NN}=0, \end{cases} $$ and the systems $(1)$ and $(2)$ are solvable if and only if $$ b_{i\ast} \widehat{A}_i^+\widehat{A}_i=b_{i\ast}\quad\text{for each $i$.} $$
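This row-by-row recipe is straightforward to implement (a NumPy sketch; variable names are mine). When $N-1\ge P$ and the data are generic, each reduced matrix $\widehat{A}_i$ has full column rank, so the solvability condition holds automatically:

```python
import numpy as np

rng = np.random.default_rng(0)
N, P = 6, 3
A = rng.standard_normal((N, P))
B = rng.standard_normal((N, P))

X = np.zeros((N, N))
for i in range(N):
    Ahat = np.delete(A, i, axis=0)          # remove row i of A -> (N-1, P)
    xhat = B[i] @ np.linalg.pinv(Ahat)      # row solution b_i* Ahat^+
    X[i] = np.insert(xhat, i, 0.0)          # reinsert the zero at position i

assert np.allclose(np.diag(X), 0)
assert np.allclose(X @ A, B)   # solvability: b_i* Ahat^+ Ahat = b_i* holds here
```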