Suppose $\mathrm{a}$ is a known $N\times 1$ complex vector, and we need to solve the following optimization problem with the gradient descent method:
$\mathrm{X}=\underset{\mathrm{X}}{\mathrm{arg\,min}}$ $\mathrm{a}^H\mathrm{X}(\mathrm{X}^H\mathrm{X})^{-1}\mathrm{X}^H\mathrm{a}$
where $\mathrm{X}$ is an unknown $N\times R$ complex-valued matrix with $N>R$.
If all variables were real, so that the objective became $\mathrm{a}^T\mathrm{X}(\mathrm{X}^T\mathrm{X})^{-1}\mathrm{X}^T\mathrm{a}$, its gradient would be given by (from the Matrix Cookbook):
$\nabla_{\mathrm{X}}=(\mathrm{I}_N-\mathrm{X}(\mathrm{X}^T\mathrm{X})^{-1}\mathrm{X}^T)\mathrm{a}\mathrm{a}^T\mathrm{X}(\mathrm{X}^T\mathrm{X})^{-1}$
My questions are:
1- Is it correct to use this gradient expression for the original objective function after replacing the transpose operator $(\cdot)^T$ with the Hermitian operator $(\cdot)^H$?
i.e. $\nabla_{\mathrm{X}}=(\mathrm{I}_N-\mathrm{X}(\mathrm{X}^H\mathrm{X})^{-1}\mathrm{X}^H)\mathrm{a}\mathrm{a}^H\mathrm{X}(\mathrm{X}^H\mathrm{X})^{-1}$
2- If it is not correct, how can I obtain the gradient of the original problem?
Note: for simplicity, the constraints have been removed.
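For completeness, here is the numerical sanity check I use on candidate gradient expressions (a NumPy sketch; it assumes the candidate is interpreted as the Wirtinger/conjugate gradient $\partial f/\partial \mathrm{X}^*$, in which case the directional derivative of the real-valued objective along a perturbation $\mathrm{E}$ should equal $2\,\mathrm{Re}\,\mathrm{tr}(\mathrm{E}^H \mathrm{G})$):

```python
import numpy as np

rng = np.random.default_rng(0)
N, R = 6, 3
a = rng.standard_normal(N) + 1j * rng.standard_normal(N)
X = rng.standard_normal((N, R)) + 1j * rng.standard_normal((N, R))

def f(X):
    # objective a^H X (X^H X)^{-1} X^H a -- real-valued, since it is the
    # squared norm of the projection of a onto the column space of X
    y = np.linalg.solve(X.conj().T @ X, X.conj().T @ a)
    return (a.conj() @ (X @ y)).real

def g(X):
    # candidate gradient (I - X (X^H X)^{-1} X^H) a a^H X (X^H X)^{-1}
    XhX_inv = np.linalg.inv(X.conj().T @ X)
    P = X @ XhX_inv @ X.conj().T
    return (np.eye(N) - P) @ np.outer(a, a.conj()) @ X @ XhX_inv

# If G = df/dX* in the Wirtinger sense, the directional derivative of f
# along a complex perturbation E must equal 2 Re tr(E^H G).
h = 1e-6
G = g(X)
errs = []
for _ in range(5):
    E = rng.standard_normal((N, R)) + 1j * rng.standard_normal((N, R))
    numeric = (f(X + h * E) - f(X - h * E)) / (2 * h)  # central difference
    analytic = 2.0 * np.real(np.trace(E.conj().T @ G))
    errs.append(abs(numeric - analytic))
```

In my runs the finite-difference and analytic values agree for random perturbations, which is what led me to conjecture the expression in question 1.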