Derivative of the Frobenius norm of a pseudoinverse matrix


Given a wide (full row-rank) complex matrix $\mathbf{A} \in \mathbb{C}^{m \times n}$, where $m<n$, and its pseudoinverse $\mathbf{A}^+ \in \mathbb{C}^{n \times m}$, how can I calculate the following derivative:

$\frac{\partial}{\partial \mathbf{A}} \left( \left\Vert \mathbf{B} \mathbf{A}^+ \right\Vert_F^2 \right) $,

where $\mathbf{B} \in \mathbb{C}^{p \times n}$ is known, and $\Vert \cdot \Vert_F$ is the Frobenius norm?

Best answer

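For ease of typing, let $X = A^+$ denote the pseudoinverse of $A$. The derivation uses plain transposes, so as written it covers the real case; for complex $A$ the transposes would become conjugate transposes and the derivative would have to be read in the Wirtinger sense.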

The tricky part is knowing that the differential of the pseudoinverse is

$$\eqalign{
dX &= (I-XA)\,dA^T\,X^TX + XX^T\,dA^T\,(I-AX) - X\,dA\,X \cr
}$$

Now write the function in terms of the Frobenius (:) product, take its differential, and substitute the above differential

$$\eqalign{
f &= \|BX\|^2_F = BX:BX \cr\cr
df &= 2BX:B\,dX \cr
&= 2B^TBX:dX \cr
&= 2B^TBX:(I-XA)\,dA^T\,X^TX + 2B^TBX:XX^T\,dA^T\,(I-AX) - 2B^TBX:X\,dA\,X \cr
&= 2(I-XA)^TB^TBXX^TX:dA^T + 2XX^TB^TBX(I-AX)^T:dA^T - 2X^TB^TBXX^T:dA \cr
&= 2\Big(X^TXX^TB^TB(I-XA) + (I-AX)X^TB^TBXX^T - X^TB^TBXX^T\Big):dA \cr
}$$

Since $\,df=\Big(\frac{\partial f}{\partial A}:dA\Big),\,$ the gradient is

$$\eqalign{
\frac{\partial f}{\partial A} &= 2\Big(X^TXX^TB^TB(I-XA) + (I-AX)X^TB^TBXX^T - X^TB^TBXX^T\Big) \cr
}$$
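A result like this is easy to sanity-check numerically. The following is a minimal sketch, assuming real-valued matrices (matching the transposes used in the derivation) and NumPy; the helper names `f`, `grad_f`, and `numerical_grad`, the dimensions, and the random test data are illustrative choices, not part of the original answer. It compares the closed-form gradient against central finite differences.

```python
import numpy as np

def f(A, B):
    # f(A) = ||B A^+||_F^2
    return np.linalg.norm(B @ np.linalg.pinv(A), "fro") ** 2

def grad_f(A, B):
    # Closed-form gradient from the derivation, with X = A^+ (real case).
    m, n = A.shape
    X = np.linalg.pinv(A)
    M = B.T @ B
    t1 = X.T @ X @ X.T @ M @ (np.eye(n) - X @ A)
    t2 = (np.eye(m) - A @ X) @ X.T @ M @ X @ X.T
    t3 = X.T @ M @ X @ X.T
    return 2.0 * (t1 + t2 - t3)

def numerical_grad(A, B, h=1e-6):
    # Entry-wise central finite differences of f with respect to A.
    G = np.zeros_like(A)
    for i in range(A.shape[0]):
        for j in range(A.shape[1]):
            E = np.zeros_like(A)
            E[i, j] = h
            G[i, j] = (f(A + E, B) - f(A - E, B)) / (2.0 * h)
    return G

# Illustrative sizes: A is wide (m < n), B has n columns.
rng = np.random.default_rng(0)
m, n, p = 3, 5, 4
A = rng.standard_normal((m, n))
B = rng.standard_normal((p, n))

G_analytic = grad_f(A, B)
G_numeric = numerical_grad(A, B)
rel_err = np.max(np.abs(G_analytic - G_numeric)) / np.max(np.abs(G_analytic))
print("relative error:", rel_err)
```

For a full row-rank $A$, as in the question, $AX = I$, so the middle term `t2` is zero up to rounding and the gradient reduces to the first and third terms. The check is only meaningful where the rank of $A$ stays constant under small perturbations, since the pseudoinverse is not differentiable where the rank changes.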