I have been working on a problem that comes up in my research, and I would appreciate it if the following proof could be verified.
Define a zero-mean complex-valued matrix $\mathbf{X}\in\mathbb{C}^{D\times K}$ and another matrix $\mathbf{\Phi}(\epsilon)\in\mathbb{C}^{D\times K}$, the latter depending on a vector $\epsilon \in \mathbb{R}^{D\times1}$, and the covariance matrix $$\mathbf{C}=\frac{1}{K}(\mathbf{X}\circ\mathbf{\Phi})(\mathbf{X}\circ\mathbf{\Phi})^H$$ which is Hermitian. Here $\circ$ denotes the Hadamard (element-wise) product, $^H$ the Hermitian (conjugate) transpose, $^\ast$ the complex conjugate, and $^T$ the ordinary transpose.
Let $$\mathbf{U}\mathbf{\Sigma}\mathbf{V}^H=\mathbf{C}$$ be the SVD of $\mathbf{C}$. Further, since $\mathbf{C}$ is Hermitian and positive semidefinite, $\mathbf{U}=\mathbf{V}$ (the SVD coincides with the eigendecomposition), thus $$\mathbf{U}\mathbf{\Sigma}\mathbf{U}^H=\mathbf{C}$$
We now seek to find the derivative of $$\mathbf{W}=\mathbf{U}\mathbf{\Sigma}^{-1/2}\mathbf{U}^H$$ with respect to $\epsilon$.
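Before differentiating, it may help to sanity-check the setup numerically. The sketch below (NumPy; the dimensions and random matrices are arbitrary choices, with $D \le K$ so that $\mathbf{C}$ is generically positive definite) builds $\mathbf{C}$ from random $\mathbf{X}$ and $\mathbf{\Phi}$, confirms that the SVD of the Hermitian PSD $\mathbf{C}$ reconstructs it with $\mathbf{U}$ alone, and that $\mathbf{W}=\mathbf{U}\mathbf{\Sigma}^{-1/2}\mathbf{U}^H$ is indeed the inverse matrix square root:

```python
import numpy as np

rng = np.random.default_rng(0)
D, K = 4, 8  # D <= K so C is (generically) positive definite

# random complex X and Phi, and the Hadamard-product covariance C
X = rng.standard_normal((D, K)) + 1j * rng.standard_normal((D, K))
Phi = rng.standard_normal((D, K)) + 1j * rng.standard_normal((D, K))
B = X * Phi                      # Hadamard product X o Phi
C = B @ B.conj().T / K           # Hermitian PSD by construction

# SVD vs eigendecomposition: for Hermitian PSD C, U = V
U, s, Vh = np.linalg.svd(C)
print(np.allclose(U @ np.diag(s) @ U.conj().T, C))   # True

# W = U Sigma^{-1/2} U^H is the inverse matrix square root of C
W = U @ np.diag(s ** -0.5) @ U.conj().T
print(np.allclose(W @ W @ C, np.eye(D)))             # True
```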
We begin with \begin{align} \renewcommand{\d}{\partial} \d\mathbf{C} &= \d\mathbf{U}\mathbf{\Sigma}\mathbf{U}^H + \mathbf{U}\d\mathbf{\Sigma}\mathbf{U}^H + \mathbf{U}\mathbf{\Sigma}\d\mathbf{U}^H\\ &= \mathbf{U}\d\mathbf{\Sigma}\mathbf{U}^H + \d\mathbf{U}\mathbf{\Sigma}\mathbf{U}^H + \left(\d\mathbf{U}\mathbf{\Sigma}\mathbf{U}^H\right)^H \end{align}
Letting $\d\mathbf{P} = \mathbf{U}^H\d\mathbf{U}$, so that $\d\mathbf{P}^H = \d\mathbf{U}^H\mathbf{U}$, \begin{align} \d\mathbf{C} &= \mathbf{U}\d\mathbf{\Sigma}\mathbf{U}^H + \mathbf{U}\d\mathbf{P}\mathbf{\Sigma}\mathbf{U}^H + \mathbf{U}\mathbf{\Sigma}\d\mathbf{P}^H\mathbf{U}^H\\ \mathbf{U}^H\d\mathbf{C}\mathbf{U} &= \d\mathbf{P}\mathbf{\Sigma} + \d\mathbf{\Sigma} + \mathbf{\Sigma}\d\mathbf{P}^H \end{align}
Moreover, we have \begin{align} \mathbf{U}\mathbf{U}^H =\mathbf{U}^H\mathbf{U} &= \mathbf{I} \end{align} thus \begin{align} \mathbf{U}^H\d\mathbf{U} + \d\mathbf{U}^H\mathbf{U} &= \mathbf{0}_D\\ \d\mathbf{P} + \d\mathbf{P}^H &= \mathbf{0}_D \end{align} that is, $\d\mathbf{P}$ is anti-Hermitian and has a purely imaginary diagonal.
From earlier, we define auxiliary variables \begin{align} \d\mathbf{G} &=\mathbf{U}^H\d\mathbf{C}\mathbf{U}\\ \d\mathbf{G}^H &=\mathbf{U}^H\d\mathbf{C}^H\mathbf{U} = \mathbf{U}^H\d\mathbf{C}\mathbf{U} = \d\mathbf{G}\\ \d\mathbf{H} &= \d\mathbf{G}-\d\mathbf{\Sigma}\\ &= \d\mathbf{P}\mathbf{\Sigma} + \mathbf{\Sigma}\d\mathbf{P}^H\\ &= \d\mathbf{P}\mathbf{\Sigma} - \mathbf{\Sigma}\d\mathbf{P}\\ &= \d\mathbf{H}^H \end{align} Hence $\d\mathbf{G}$ and $\d\mathbf{H}$ are also Hermitian, so the diagonal entries of $\d\mathbf{G}$ are real; and since $[\d\mathbf{P}\mathbf{\Sigma} - \mathbf{\Sigma}\d\mathbf{P}]_{ii} = \d p_{ii}\sigma_i - \sigma_i\d p_{ii} = 0$, the diagonal entries of $\d\mathbf{H}$ are zero, i.e. $\d\mathbf{\Sigma} = \mathbf{I}\circ\d\mathbf{G}$.
Now \begin{align} \d\mathbf{H}\mathbf{\Sigma} &= \d\mathbf{P}\mathbf{\Sigma}^2 -\mathbf{\Sigma}\d\mathbf{P}\mathbf{\Sigma}\\ \mathbf{\Sigma}\d\mathbf{H} &= \mathbf{\Sigma}\d\mathbf{P}\mathbf{\Sigma} -\mathbf{\Sigma}^2\d\mathbf{P} \end{align}
Adding these and considering the $(i,j)$ entry, \begin{align} \d h_{ij}\sigma_j + \sigma_i\d h_{ij} &= \d{p}_{ij}\sigma_j^2 - \sigma_i^2\d{p}_{ij}\\ (\sigma_i + \sigma_j)\d h_{ij} &= (\sigma_j^2 - \sigma_i^2)\d{p}_{ij}\\ \d{p}_{ij} &= \dfrac{\sigma_i + \sigma_j}{\sigma_j^2 - \sigma_i^2}\d h_{ij}\\ &= \dfrac{\d h_{ij}}{\sigma_j - \sigma_i} \end{align} for $i\ne j$ (assuming the singular values are distinct), leaving the diagonal of $\d\mathbf{P}$ unknown.
Then I have \begin{align} \mathbf{W} &= \mathbf{U}\mathbf{\Sigma}^{-1/2}\mathbf{U}^H\\ \d\mathbf{W} &= \d\mathbf{U}\mathbf{\Sigma}^{-1/2}\mathbf{U}^H + \mathbf{U}\d\mathbf{\Sigma}^{-1/2}\mathbf{U}^H + \mathbf{U}\mathbf{\Sigma}^{-1/2}\d\mathbf{U}^H\\ &= \mathbf{U}\d\mathbf{P}\mathbf{\Sigma}^{-1/2}\mathbf{U}^H + \mathbf{U}\d\mathbf{\Sigma}^{-1/2}\mathbf{U}^H + \mathbf{U}\mathbf{\Sigma}^{-1/2}\d\mathbf{P}^H\mathbf{U}^H\\ \mathbf{U}^H\d\mathbf{W}\mathbf{U} &= \d\mathbf{P}\mathbf{\Sigma}^{-1/2} + \d\mathbf{\Sigma}^{-1/2} + \mathbf{\Sigma}^{-1/2}\d\mathbf{P}^H\\ \d\mathbf{W} &= \mathbf{U}\left[\d\mathbf{P}\mathbf{\Sigma}^{-1/2} + (\d\mathbf{P}\mathbf{\Sigma}^{-1/2})^H + \d\mathbf{\Sigma}^{-1/2}\right]\mathbf{U}^H\\ &= \mathbf{U}\left[\d\mathbf{P}\mathbf{\Sigma}^{-1/2} - \mathbf{\Sigma}^{-1/2}\d\mathbf{P} - \dfrac{1}{2}\mathbf{\Sigma}^{-3/2}(\d\mathbf{\Sigma})\right]\mathbf{U}^H \end{align}
Note also that \begin{align} \left[\d\mathbf{P}\mathbf{\Sigma}^{-1/2} -\mathbf{\Sigma}^{-1/2}\d\mathbf{P}\right]_{ij} &= \d{p}_{ij}\sigma_j^{-1/2} - \sigma_i^{-1/2}\d{p}_{ij} \end{align} On the diagonal \begin{align} \left[\d\mathbf{P}\mathbf{\Sigma}^{-1/2} -\mathbf{\Sigma}^{-1/2}\d\mathbf{P}\right]_{ii} &= \d{p}_{ii}\sigma_i^{-1/2} - \sigma_i^{-1/2}\d{p}_{ii}\\ &= 0 \end{align} and elsewhere \begin{align} \left[\d\mathbf{P}\mathbf{\Sigma}^{-1/2} -\mathbf{\Sigma}^{-1/2}\d\mathbf{P}\right]_{ij} &= \d{p}_{ij}\sigma_j^{-1/2} - \sigma_i^{-1/2}\d{p}_{ij}\\ &= \dfrac{\sigma_j^{-1/2}- \sigma_i^{-1/2}}{\sigma_j - \sigma_i}\d h_{ij} \end{align}
Hence, \begin{align} \d\mathbf{P}\mathbf{\Sigma}^{-1/2} -\mathbf{\Sigma}^{-1/2}\d\mathbf{P} &= \mathbf{M}\circ\d\mathbf{H}\\ &= \mathbf{M}\circ(\d\mathbf{G} - \d\mathbf{\Sigma})\\ &= \mathbf{M}\circ\d\mathbf{G}\qquad\text{(since $\d\mathbf{\Sigma}$ is diagonal and $\mathbf{M}$ has zero diagonal)}\\ &= \mathbf{M}\circ(\mathbf{U}^H\d\mathbf{C}\mathbf{U}) \end{align} where \begin{align} m_{ij} &= \begin{cases} \dfrac{\sigma_j^{-1/2}- \sigma_i^{-1/2}}{\sigma_j - \sigma_i} & i\ne j\\ 0 & i = j \end{cases} \end{align}
In all, \begin{align} \d\mathbf{W} &= \mathbf{U}\left[\mathbf{M}\circ\d\mathbf{G} -\frac{1}{2}\mathbf{\Sigma}^{-3/2}\d\mathbf{\Sigma}\right]\mathbf{U}^H \end{align}
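This closed form can be checked against finite differences. In the sketch below (NumPy; the perturbation direction `dPhi` is an arbitrary choice, and the eigenvalues are assumed distinct), $\d\mathbf{C}$ is formed by a central difference along that direction and the formula $\d\mathbf{W} = \mathbf{U}[\mathbf{M}\circ\d\mathbf{G} - \frac{1}{2}\mathbf{\Sigma}^{-3/2}\d\mathbf{\Sigma}]\mathbf{U}^H$ is compared with a direct finite difference of $\mathbf{W}$:

```python
import numpy as np

rng = np.random.default_rng(1)
D, K = 4, 10
X = rng.standard_normal((D, K)) + 1j * rng.standard_normal((D, K))
Phi = rng.standard_normal((D, K)) + 1j * rng.standard_normal((D, K))
dPhi = rng.standard_normal((D, K)) + 1j * rng.standard_normal((D, K))  # arbitrary direction

def C_of(P):
    B = X * P
    return B @ B.conj().T / K

def W_of(P):
    s, U = np.linalg.eigh(C_of(P))           # eigendecomposition = SVD for Hermitian PSD C
    return U @ np.diag(s ** -0.5) @ U.conj().T

# central finite differences for dC and dW along the direction dPhi
t = 1e-6
dC = (C_of(Phi + t * dPhi) - C_of(Phi - t * dPhi)) / (2 * t)
dW_fd = (W_of(Phi + t * dPhi) - W_of(Phi - t * dPhi)) / (2 * t)

# analytic formula: dW = U [ M o dG - (1/2) Sigma^{-3/2} dSigma ] U^H
s, U = np.linalg.eigh(C_of(Phi))
dG = U.conj().T @ dC @ U
dSigma = np.diag(np.real(np.diag(dG)))       # dSigma = I o dG (real, since dG is Hermitian)
si, sj = s[:, None], s[None, :]
M = (sj ** -0.5 - si ** -0.5) / (sj - si + np.eye(D))  # eye avoids 0/0 on the diagonal
np.fill_diagonal(M, 0.0)
dW = U @ (M * dG - 0.5 * np.diag(s ** -1.5) @ dSigma) @ U.conj().T

print(np.max(np.abs(dW - dW_fd)))            # small: formula agrees with finite differences
```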
Define some new variables for later convenience (note that $G$, and the $M$, $N$ used at the end, are new symbols, unrelated to $\d\mathbf{G}$ and the Hadamard mask $\mathbf{M}$ above). $$\eqalign{ &\lambda^2 = K,\quad B = X\circ\Phi \\ &x = {\rm vec}(X),\quad \phi={\rm vec}(\Phi),\quad G={\rm Diag}(x) \\ &b={\rm vec}(B)=x\circ\phi = G\phi,\quad J=\frac{\partial \phi}{\partial \varepsilon} \\ &C = \lambda^{-2}BB^H = U\Sigma U^H \\ }$$ Any analytic function of $C$ can be written in terms of the SVD and vice versa $$\eqalign{ f(C) &= U\,f(\Sigma)\,U^H \\ W &= U\bigg(\frac{1}{\sqrt{\Sigma}}\bigg)U^H = C^{-1/2}\\ W^2 &= C^{-1} = \lambda^2(BB^H)^{-1} }$$ Calculate the differential and vectorize it (note that $C$ and $W$ are both Hermitian). $$\eqalign{ W\,dW + dW\,W &= -\lambda^2(BB^H)^{-1}\Big(B\,dB^H+dB\,B^H\Big)(BB^H)^{-1} \\ -\lambda^{-2}BB^H\big(W\,dW + dW\,W\big)BB^H &= B\,dB^H+dB\,B^H \\ -\lambda^{2}\big(CW\,dW\,C + C\,dW\,WC\big) &= B\,dB^H+dB\,B^H \\ -K\big(C^T\otimes CW + C^TW^T\otimes C\big)\,dw &= (I\otimes B)L\,db^* + (B^*\otimes I)\,db \\ -K\big(C^*\otimes CW + C^*W^*\otimes C\big)\,dw &= (I\otimes B)LG^*\,d\phi^* + (B^*\otimes I)G\,d\phi \\ }$$ where $L$ is the commutation matrix associated with vectorization, i.e. $$ {\rm vec}(X^T)=L\;{\rm vec}(X) \\ {\rm vec}(X^H)=L\;{\rm vec}(X^*)$$ Compact the differential by introducing names for the Kronecker matrices.
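The commutation-matrix step is easy to get wrong, so a direct numerical check may be worthwhile. The sketch below (NumPy, column-major `vec`; dimensions are arbitrary) builds $L$ explicitly, verifies its defining property ${\rm vec}(X^T)=L\,{\rm vec}(X)$, and confirms the vectorized right-hand side $(I\otimes B)LG^*\,d\phi^* + (B^*\otimes I)G\,d\phi = {\rm vec}(B\,dB^H + dB\,B^H)$:

```python
import numpy as np

rng = np.random.default_rng(2)
D, K = 3, 5

def vec(A):
    return A.reshape(-1, order='F')   # column-major vectorization

# commutation matrix L: vec(A^T) = L vec(A) for A of shape (D, K)
L = np.zeros((D * K, D * K))
for i in range(D):
    for j in range(K):
        L[j + i * K, i + j * D] = 1.0

X = rng.standard_normal((D, K)) + 1j * rng.standard_normal((D, K))
Phi = rng.standard_normal((D, K)) + 1j * rng.standard_normal((D, K))
dPhi = rng.standard_normal((D, K)) + 1j * rng.standard_normal((D, K))

print(np.allclose(L @ vec(X), vec(X.T)))          # True: defining property of L

B, dB = X * Phi, X * dPhi                         # dB = X o dPhi, since only Phi varies
G = np.diag(vec(X))                               # G = Diag(x), so db = G dphi

lhs = vec(B @ dB.conj().T + dB @ B.conj().T)
rhs = (np.kron(np.eye(D), B) @ L @ G.conj() @ vec(dPhi).conj()
       + np.kron(B.conj(), np.eye(D)) @ G @ vec(dPhi))
print(np.allclose(lhs, rhs))                      # True
```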
Then find the gradient with respect to the $\varepsilon$-vector. $$\eqalign{ P\,dw &= Q\,d\phi + R\,d\phi^* \\ dw &= M\,d\phi + N\,d\phi^* \\ \frac{\partial w}{\partial \varepsilon} &= MJ + NJ^* \\ }$$ where $M = P^{-1}Q$ and $N = P^{-1}R$, and $J^*$ appears because $\varepsilon$ is real, so $d\phi^* = J^*\,d\varepsilon$.
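The whole chain can be tested end to end for a concrete choice of $\Phi(\varepsilon)$. The choice below is purely hypothetical for illustration: per-row phase ramps $\Phi(\varepsilon)_{ij} = e^{\,i\,\varepsilon_i F_{ij}}$ for a fixed real matrix $F$, for which $J=\partial\phi/\partial\varepsilon$ has a simple closed form. The sketch solves the linear system $P\,dw = Q\,d\phi + R\,d\phi^*$ and compares $MJ + NJ^*$ with a central finite difference of $w$ over $\varepsilon$:

```python
import numpy as np

rng = np.random.default_rng(3)
D, K = 3, 6
X = rng.standard_normal((D, K)) + 1j * rng.standard_normal((D, K))
F = rng.standard_normal((D, K))   # hypothetical: Phi(eps)_{ij} = exp(1j * eps_i * F_{ij})

def vec(A):
    return A.reshape(-1, order='F')

def Phi_of(eps):
    return np.exp(1j * eps[:, None] * F)

def w_of(eps):
    B = X * Phi_of(eps)
    s, U = np.linalg.eigh(B @ B.conj().T / K)
    return vec(U @ np.diag(s ** -0.5) @ U.conj().T)   # w = vec(W)

eps = rng.standard_normal(D)
Phi = Phi_of(eps)

# J = d phi / d eps: column d is the vec of a matrix supported on row d
J = np.zeros((D * K, D), dtype=complex)
for d in range(D):
    dPhi_d = np.zeros((D, K), dtype=complex)
    dPhi_d[d, :] = 1j * F[d, :] * Phi[d, :]
    J[:, d] = vec(dPhi_d)

# P, Q, R from the vectorized differential
B = X * Phi
C = B @ B.conj().T / K
s, U = np.linalg.eigh(C)
W = U @ np.diag(s ** -0.5) @ U.conj().T
G = np.diag(vec(X))
L = np.zeros((D * K, D * K))
for i in range(D):
    for j in range(K):
        L[j + i * K, i + j * D] = 1.0
P = -K * (np.kron(C.conj(), C @ W) + np.kron((C @ W).conj(), C))
Q = np.kron(B.conj(), np.eye(D)) @ G
R = np.kron(np.eye(D), B) @ L @ G.conj()

grad = np.linalg.solve(P, Q @ J + R @ J.conj())   # = M J + N J^*

# compare with central finite differences of w over eps
t = 1e-6
fd = np.column_stack([
    (w_of(eps + t * np.eye(D)[d]) - w_of(eps - t * np.eye(D)[d])) / (2 * t)
    for d in range(D)
])
print(np.max(np.abs(grad - fd)))   # small: analytic gradient matches finite differences
```

Since $P$ corresponds to the Sylvester-type operator $dW \mapsto -K(C^{1/2}\,dW\,C + C\,dW\,C^{1/2})$ with positive $C$, it is invertible, so the solve is well posed.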