Derivative of a quadratic form


There is a Hermitian matrix $X$ and a complex vector $a$. I know that $a^HXa$ is a real scalar, but the derivative of $a^HXa$ with respect to $a$ is complex: $$\frac{\partial a^HXa}{\partial a}=Xa^*$$ Why is the derivative complex? Is it possible for the derivative of a real-valued function to be complex? (The matrix $X$ is complex.)

4

There are 4 best solutions below

0
On BEST ANSWER

I'm not sure the question is correct. Testing with xavierm02's suggested example, let $X = \begin{pmatrix} 0 & i \\ -i & 0 \end{pmatrix}$. Then $$ f(a_1, a_2) = \begin{pmatrix} \bar a_1 & \bar a_2 \end{pmatrix}\begin{pmatrix} 0 & i \\ -i & 0 \end{pmatrix} \begin{pmatrix} a_1 \\ a_2 \end{pmatrix} = \bar a_1 i a_2 - \bar a_2 i a_1 $$ So $$ \frac{\partial f}{\partial a_1} = -\bar a_2 i \qquad \frac{\partial f}{\partial a_2} = i\bar a_1 $$ In other words $$ \frac{\partial f}{\partial a} = \begin{pmatrix} 0 & -i \\ i & 0 \end{pmatrix}\begin{pmatrix} \bar a_1 \\ \bar a_2 \end{pmatrix} = X^*a^* $$ where $^*$ denotes entrywise complex conjugation.

In general, let $X = (x_{ij})$ be a Hermitian matrix, so $\bar x_{ij} = x_{ji}$. If $a = (a_i)$ is a complex vector, then $$ f(a) = \sum_{i=1}^n (a^H)_i (Xa)_i = \sum_{i,j=1}^n \bar a_i x_{ij} a_j $$ Then for each $k$, $$ \frac{\partial f}{\partial a_k}=\sum_{i,j=1}^n \bar a_i x_{ij} \delta_{jk} = \sum_{i=1}^n \bar a_i x_{ik} = \sum_{i=1}^n \bar{x}_{ki}\bar a_i $$ which again is the coordinate form of $$\frac{\partial f}{\partial a} = X^*a^*$$
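The coordinate formula above can be checked numerically. The following is a minimal sketch, assuming NumPy and assuming that $\partial/\partial a_k$ is read as the Wirtinger derivative $\tfrac12(\partial/\partial x_k - i\,\partial/\partial y_k)$ with $a_k = x_k + i y_k$; the helper names `quadratic_form` and `wirtinger_derivative` are made up for this illustration.

```python
import numpy as np

def quadratic_form(X, a):
    """Return f(a) = a^H X a (real up to rounding when X is Hermitian)."""
    # np.vdot conjugates its first argument, so this is a^H (X a).
    return np.vdot(a, X @ a)

def wirtinger_derivative(f, a, eps=1e-6):
    """Central-difference estimate of df/da_k = (df/dx_k - i df/dy_k) / 2."""
    grad = np.zeros(a.shape, dtype=complex)
    for k in range(a.size):
        step = np.zeros(a.shape, dtype=complex)
        step[k] = eps
        df_dx = (f(a + step) - f(a - step)) / (2 * eps)            # real direction
        df_dy = (f(a + 1j * step) - f(a - 1j * step)) / (2 * eps)  # imaginary direction
        grad[k] = 0.5 * (df_dx - 1j * df_dy)
    return grad

# Random Hermitian X and complex a (arbitrary test data).
rng = np.random.default_rng(0)
n = 4
M = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
X = M + M.conj().T
a = rng.standard_normal(n) + 1j * rng.standard_normal(n)

numeric = wirtinger_derivative(lambda v: quadratic_form(X, v), a)
closed_form = X.conj() @ a.conj()   # entrywise conjugates: X^* a^*
max_error = np.max(np.abs(numeric - closed_form))
print(max_error)
```

Because $f$ is quadratic, the central differences are exact up to rounding, so `max_error` should be tiny if the formula holds.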

3
On

Yes, applying some differential operators to a real-valued function, one can get a complex-valued result. This is because the differential operator may happen to have complex coefficients. Let's stick to the 1-dimensional case for simplicity. Writing $a=x+iy$, we have $$ \frac{\partial f}{\partial a} = \frac12 \frac{\partial f}{\partial x} -\frac{i}{2}\frac{\partial f}{\partial y} $$ If $f$ is real-valued, both partials are real-valued, but the result may well be complex because of that $i$.

For example, when $f(a)=\overline a a = x^2+y^2$, the result of differentiation is $$ \frac{\partial f}{\partial a} = \frac12 (2x) -\frac{i}{2}(2y) = x-iy = \bar a $$ This example fits your situation with $X=(1)$, the one-dimensional identity matrix.
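A small sketch of the one-dimensional example, in plain Python. It also evaluates the companion operator $\partial/\partial\bar a = \tfrac12(\partial/\partial x + i\,\partial/\partial y)$, which is not used in the text above and is included here only as an assumption-labelled extra; `wirtinger_pair` and `squared_modulus` are hypothetical helper names.

```python
def wirtinger_pair(f, a, eps=1e-6):
    """Finite-difference estimates of (df/da, df/d(conj a)) for a scalar a = x + iy."""
    df_dx = (f(a + eps) - f(a - eps)) / (2 * eps)
    df_dy = (f(a + 1j * eps) - f(a - 1j * eps)) / (2 * eps)
    return 0.5 * (df_dx - 1j * df_dy), 0.5 * (df_dx + 1j * df_dy)

def squared_modulus(a):
    """f(a) = conj(a) * a = x^2 + y^2, a real-valued function of a complex argument."""
    return (a.conjugate() * a).real

a = 1.5 - 0.5j
d_da, d_dabar = wirtinger_pair(squared_modulus, a)
print(d_da, a.conjugate())   # d_da should be close to conj(a)
print(d_dabar, a)            # d_dabar should be close to a
```

Both partials `df_dx` and `df_dy` are real numbers here; the complex value appears only when they are combined with the coefficient $i$.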

5
On

@EKH, your derivative is wrong. There are $3$ mistakes: 1. $Xa^*$ is not defined. 2. When you differentiate $a^2$, there is a factor $2$... 3. Here the result is real.

Let $\phi:a\mapsto a^*Xa$. Then $D\phi_a:h\mapsto h^*Xa+a^*Xh=2\operatorname{Re}(a^*Xh)$. Thus, with respect to the real inner product $\langle u,h\rangle=\operatorname{Re}(u^*h)$, $\nabla_{\phi}(a)$ is associated to $2Xa$. Let $X=Y+iZ$ where $Y,Z$ are real and $Y=Y^T,Z=-Z^T$, and $a=b+ic$ where $b,c$ are real. Finally $2Xa=2((Yb-Zc)+i(Zb+Yc))$. $\phi$ is not a polynomial in the $(a_i)_i$; yet it is a polynomial in the $(b_i)_i,(c_i)_i$. Thus if we identify $a$ with the real vector $\begin{pmatrix}b\\c\end{pmatrix}$, then $\nabla_{\phi}(a)=2\begin{pmatrix}Yb-Zc\\Zb+Yc\end{pmatrix}$.
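The real-coordinate gradient can be compared against finite differences in $b$ and $c$. This is only a sketch, assuming NumPy; the test matrix is random and `phi` and `real_gradient_fd` are names chosen for the illustration.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 3
M = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
X = M + M.conj().T          # Hermitian test matrix
Y, Z = X.real, X.imag       # Y is symmetric, Z is antisymmetric

def phi(b, c):
    """phi as a real polynomial in the real coordinates (b, c), with a = b + i c."""
    a = b + 1j * c
    return np.vdot(a, X @ a).real

def real_gradient_fd(b, c, eps=1e-6):
    """Central-difference gradient of phi with respect to the stacked vector (b, c)."""
    grad = np.zeros(2 * n)
    for k in range(n):
        e = np.zeros(n)
        e[k] = eps
        grad[k] = (phi(b + e, c) - phi(b - e, c)) / (2 * eps)
        grad[n + k] = (phi(b, c + e) - phi(b, c - e)) / (2 * eps)
    return grad

b = rng.standard_normal(n)
c = rng.standard_normal(n)

numeric = real_gradient_fd(b, c)
closed_form = 2 * np.concatenate([Y @ b - Z @ c, Z @ b + Y @ c])
print(np.max(np.abs(numeric - closed_form)))
```

The stacked vector `closed_form` is exactly the real and imaginary parts of $2Xa$ placed one above the other.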

2
On

One way to easily see the first two derivatives of a vector or matrix functional, particularly of a quadratic form, is to use a variational approach. In this case we have $$f(a+\delta a)=(a+\delta a)^HX(a+\delta a)=a^HXa+(\delta a)^HXa+a^HX(\delta a)+(\delta a)^HX\delta a$$ The linear term gives us the gradient: $$\langle \nabla f(a), \delta a \rangle = (\delta a)^H X a + a^HX (\delta a) = 2 \operatorname{Re}(a^HX\delta a) \quad\Longrightarrow\quad \nabla f(a)=2Xa$$ The quadratic term is half the Hessian form, as in a second-order Taylor expansion: $$\tfrac12\langle \nabla^2 f(a)[\delta a],\delta a\rangle = (\delta a)^H X (\delta a) \quad\Longrightarrow\quad \nabla^2 f(a) = 2X.$$ To be clear, this is just a shortcut for a more rigorous derivation, but it works for me.

It is important to note the use of a real inner product $\langle v,w\rangle=\operatorname{Re}(v^Hw)$. The directional derivatives of a real functional must themselves be real, and that is how you get them when the input is complex. So $$\left.\frac{d}{dt}f(a+t\,\delta a)\right|_{t=0} = \langle \nabla f(a),\delta a\rangle = \langle 2Xa,\delta a \rangle = 2\operatorname{Re}(a^HX\delta a).$$
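As a rough numerical check of the variational argument, the sketch below (assuming NumPy, with random test data and made-up helper names `f` and `real_inner`) compares the directional derivative of $f$ along $\delta a$ with the real inner product $\langle 2Xa,\delta a\rangle$, and confirms that what is left of $f(a+\delta a)-f(a)$ after removing the linear term is the quadratic term $(\delta a)^HX\delta a$.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 5
M = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
X = M + M.conj().T                                   # Hermitian test matrix
a = rng.standard_normal(n) + 1j * rng.standard_normal(n)
da = rng.standard_normal(n) + 1j * rng.standard_normal(n)

def f(v):
    """f(v) = v^H X v, taken as a real number (the imaginary part is rounding noise)."""
    return np.vdot(v, X @ v).real

def real_inner(v, w):
    """Real inner product <v, w> = Re(v^H w)."""
    return np.vdot(v, w).real

gradient = 2 * X @ a

# Directional derivative d/dt f(a + t*da) at t = 0, by central differences.
t = 1e-6
directional_fd = (f(a + t * da) - f(a - t * da)) / (2 * t)
linear_term = real_inner(gradient, da)               # equals 2 Re(a^H X da)

# What remains after removing the linear term is the quadratic term.
remainder = f(a + da) - f(a) - linear_term
quadratic_term = np.vdot(da, X @ da).real

print(directional_fd, linear_term)
print(remainder, quadratic_term)
```

Note that `linear_term` is real by construction, even though `gradient` itself is a complex vector.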