Gradient Of Complex Least Squares


I want to find the gradient of $f : \mathbb{C}^{N} \rightarrow \mathbb{R}$, where

$$ f(\mathbf{x}) = \frac 1 2 \Vert \mathbf{Ax} - \mathbf{b} \Vert_2^2, $$

and $\mathbf{A}\in\mathbb{C}^{M \times N}$, $\mathbf{x}\in\mathbb{C}^{N}$, $\mathbf{b}\in\mathbb{C}^{M}$.

For real matrices, it's straightforward, but I am getting stuck with the complex case:

$$ f(\mathbf{x}) = \frac 1 2 ( \mathbf{Ax} - \mathbf{b} )^\mathrm{H} ( \mathbf{Ax} - \mathbf{b} ) = \frac 1 2 ( \mathbf{x}^\mathrm{H} \mathbf{A}^\mathrm{H} \mathbf{Ax} - \mathbf{x}^\mathrm{H} \mathbf{A}^\mathrm{H} \mathbf{b} - \mathbf{b}^\mathrm{H} \mathbf{Ax} + \mathbf{b}^\mathrm{H} \mathbf{b} ). $$

For the first and third terms, the gradient is $2 \mathbf{A}^\mathrm{H} \mathbf{Ax}$ and $\mathbf{b}^\mathrm{H} \mathbf{A}$, respectively. For the fourth term it is $\mathbf{0}$. However, it's the gradient of the second term that has me puzzled, and I am wondering if there is something fundamental that I have forgotten. How do I account for the conjugation of $\mathbf{x}$ in the gradient? Is that even possible?

Any help is greatly appreciated.


There are 2 best solutions below

0
On BEST ANSWER

For a matrix $M$, let me denote its complex conjugate by $M^*$, its transpose by $M^T$ and its hermitian conjugate by $M^H=M^{*T}$.

Write the function in terms of the Frobenius (:) Inner Product and take its differential $$\eqalign{ f &= \frac{1}{2}(Ax-b)^*:(Ax-b) \cr\cr df &= \frac{1}{2}\Big(A^*\,dx^*:(Ax-b) + (Ax-b)^*:A\,dx\Big) \cr &= \frac{1}{2}\Big(dx^*:A^H(Ax-b) + A^T(Ax-b)^*:dx\Big) \cr\cr }$$ Now the tricky part is to treat $dx^*$ and $dx$ as independent variables.

To find the gradient with respect to $x$, you hold $x^*$ constant, i.e. set $\,dx^*=0$, to obtain $$\eqalign{ \frac{\partial f}{\partial x} &= \frac{1}{2}A^T(Ax-b)^* \cr\cr }$$ while holding $x$ constant yields $$\eqalign{ \frac{\partial f}{\partial x^*} &= \frac{1}{2}A^H(Ax-b) \cr\cr }$$ Notice that the two expressions are complex conjugates of one another, so you only need to calculate one of them.
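As a sanity check, the two formulas above can be compared against finite differences. The sketch below is only an illustration, not part of the original answer: it assumes NumPy, random test data, and the usual Wirtinger definitions $\partial/\partial x_k = \tfrac12(\partial/\partial u_k - i\,\partial/\partial v_k)$ and $\partial/\partial x_k^* = \tfrac12(\partial/\partial u_k + i\,\partial/\partial v_k)$ with $x = u + iv$. The helper names (`f`, `grad`, `grad_conj`, `numeric_wirtinger`) are made up for this example.

```python
import numpy as np

# Random test problem (sizes and seed are arbitrary choices for illustration).
rng = np.random.default_rng(0)
M, N = 5, 3
A = rng.standard_normal((M, N)) + 1j * rng.standard_normal((M, N))
b = rng.standard_normal(M) + 1j * rng.standard_normal(M)
x = rng.standard_normal(N) + 1j * rng.standard_normal(N)

def f(x):
    """f(x) = 1/2 * ||Ax - b||_2^2 (real-valued)."""
    r = A @ x - b
    return 0.5 * np.vdot(r, r).real

def grad(x):
    """Claimed df/dx = 1/2 * A^T (Ax - b)^*."""
    return 0.5 * A.T @ (A @ x - b).conj()

def grad_conj(x):
    """Claimed df/dx^* = 1/2 * A^H (Ax - b)."""
    return 0.5 * A.conj().T @ (A @ x - b)

def numeric_wirtinger(f, x, h=1e-6):
    """Central-difference estimates of df/dx and df/dx^*."""
    d_x = np.zeros_like(x)
    d_xc = np.zeros_like(x)
    for k in range(x.size):
        e = np.zeros_like(x)
        e[k] = 1.0
        du = (f(x + h * e) - f(x - h * e)) / (2 * h)            # d f / d Re(x_k)
        dv = (f(x + 1j * h * e) - f(x - 1j * h * e)) / (2 * h)  # d f / d Im(x_k)
        d_x[k] = 0.5 * (du - 1j * dv)
        d_xc[k] = 0.5 * (du + 1j * dv)
    return d_x, d_xc

num_dx, num_dxc = numeric_wirtinger(f, x)
print(np.allclose(num_dx, grad(x), atol=1e-6))
print(np.allclose(num_dxc, grad_conj(x), atol=1e-6))
print(np.allclose(grad(x), grad_conj(x).conj()))
```

The last line checks the conjugate relationship noted above: since $f$ is real-valued, $\partial f/\partial x$ is the complex conjugate of $\partial f/\partial x^*$.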

0
On

Let $g(x,\bar{x})=(Ax-b)^*(Ax-b)=(\bar{x}^TA^*-b^*)(Ax-b)$, where $^*$ denotes the conjugate transpose (so $g = 2f$).

$\dfrac{\partial g}{\partial \bar{x}}:k\rightarrow k^TA^*(Ax-b)$; then, for the symmetric bilinear form $\langle X,Y\rangle=X^TY$, $\nabla_{\bar{x}}g=A^*(Ax-b)$.

Since $g$ is a real function, $\nabla_{x}g=\overline{\nabla_{\bar{x}}g}=A^T\overline{(Ax-b)}$.
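One practical use of this result: for a real-valued function of a complex vector, the derivative with respect to $\bar{x}$ gives the direction of steepest ascent, so $x \leftarrow x - \mu\, A^*(Ax-b)$ is a gradient-descent step on $g$. The following is a minimal sketch under assumptions that are not in the answer above: NumPy, random test data, a fixed step size $\mu = 1/\sigma_{\max}(A)^2$, and a full-column-rank $A$. It compares the iterate with NumPy's own least-squares solver.

```python
import numpy as np

# Random, well-conditioned test problem (sizes and seed are arbitrary).
rng = np.random.default_rng(1)
M, N = 20, 3
A = rng.standard_normal((M, N)) + 1j * rng.standard_normal((M, N))
b = rng.standard_normal(M) + 1j * rng.standard_normal(M)

def grad_xbar(x):
    """Gradient of g with respect to conj(x): A^H (Ax - b)."""
    return A.conj().T @ (A @ x - b)

# Assumed step size: 1 / sigma_max(A)^2, where norm(A, 2) is the spectral norm.
mu = 1.0 / np.linalg.norm(A, 2) ** 2

x = np.zeros(N, dtype=complex)
for _ in range(500):
    x = x - mu * grad_xbar(x)

# Reference solution from NumPy's least-squares routine.
x_ls = np.linalg.lstsq(A, b, rcond=None)[0]
print(np.allclose(x, x_ls, atol=1e-8))
```

At the minimizer the gradient vanishes, which recovers the normal equations $A^*Ax = A^*b$.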