What is the derivative of the following matrix expression with respect to $\theta_k$? \begin{equation} \begin{aligned} \mathcal{J}(X, \theta) &= {\bf trace}\left( XX^TP(\theta)^{-1} \right) +{\bf trace}\left( (Y-H(\theta)X)(Y-H(\theta)X)^T \Sigma^{-1} \right)\\ & = X^TP(\theta)^{-1}X + (Y-H(\theta)X)^T \Sigma^{-1} (Y-H(\theta)X) \end{aligned} \end{equation}
$X$ and $Y$ are vectors.
$\theta$ is a vector with entries $\theta_k$.
$P(\theta)$ and $H(\theta)$ are matrices constructed using some or all of the entries of $\theta$ and possibly other constants.
The matrix $\Sigma$ is a known, constant, invertible matrix.
All vectors and matrices have compatible dimensions.
I tried to use The Matrix Cookbook to calculate this derivative. Here is my result:
\begin{equation} \begin{aligned} \frac{\partial \mathcal{J}(X,\theta)}{\partial \theta_k } =& - {\bf trace} \left( X X^T P(\theta)^{-1} \frac{\partial P(\theta)}{\partial \theta_k}P(\theta)^{-1} \right) \\ & - 2\; {\bf trace} \left(\frac{\partial H(\theta)}{\partial \theta_k} X Y^T \Sigma^{-1}\right)\\ &+ 2\; {\bf trace} \left(\frac{\partial H(\theta)}{\partial \theta_k} \Sigma^{-1} H(\theta) X X^T\right) \end{aligned} \end{equation}
Is this result correct? Can you explain if there is a mistake? Also I would like to know if there is a better way to write this derivative.
I haven't checked every term carefully, but the overall structure of your final result "looks" right.
When taking derivatives of matrix expressions, it's always a good idea to write out the indices and use the Einstein summation convention (i.e. $a_i b_i$ is understood as $\sum_i a_i b_i$).
$X^T P(\theta)^{-1} X$ is the same as $\sum_{ij} X_i P^{-1}_{ij}(\theta) X_j = X_i P^{-1}_{ij}(\theta) X_j$ (Einstein convention in the last step). Since $X$ does not depend on $\theta$, all we need is the derivative of $P^{-1}(\theta)$. Start from $P^{-1}(\theta) P(\theta) = I$, i.e. $P^{-1}_{ij}(\theta) P_{jl}(\theta) = \delta_{il}$ (using $l$ for the free matrix index so that it does not clash with the $k$ in $\theta_k$). Differentiating both sides with the product rule gives $\left(\partial_{\theta_k} P^{-1}_{ij}(\theta)\right) P_{jl}(\theta) + P^{-1}_{ij}(\theta)\, \partial_{\theta_k} P_{jl}(\theta) = 0$, which in matrix form is $\left(\partial_{\theta_k} P^{-1}(\theta)\right) P(\theta) + P^{-1}(\theta)\, \partial_{\theta_k} P(\theta) = 0$. Multiplying on the right by $P^{-1}(\theta)$ yields $\partial_{\theta_k} P^{-1}(\theta) = - P^{-1}(\theta) \left(\partial_{\theta_k} P(\theta)\right) P^{-1}(\theta)$.
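Applied to the first term of $\mathcal{J}$, this identity gives the following. It is only a sketch, assuming $P(\theta)$ is invertible and differentiable in $\theta_k$ and that $X$ is held fixed; $\partial_{\theta_k} P$ denotes the entrywise derivative of $P(\theta)$.
\begin{equation} \begin{aligned} \partial_{\theta_k}\left( X^T P^{-1} X \right) &= X_i \left(\partial_{\theta_k} P^{-1}_{ij}\right) X_j \\ &= - X^T P^{-1} \left(\partial_{\theta_k} P\right) P^{-1} X \\ &= - {\bf trace}\left( X X^T P^{-1} \left(\partial_{\theta_k} P\right) P^{-1} \right) \end{aligned} \end{equation}
The last line uses the cyclic invariance of the trace and agrees with the first term of the result in the question.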
The derivative of $H(\theta)$ can be taken straightforwardly, entry by entry, since $H$ appears without an inverse.
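For completeness, here is a sketch of that step. It makes one assumption not stated in the question: that $\Sigma$ is symmetric, as a covariance matrix would be. It also treats $X$ and $Y$ as independent of $\theta$. The residual $r = Y - H(\theta)X$ is a symbol introduced here only for brevity.
\begin{equation} \begin{aligned} \partial_{\theta_k}\left( r^T \Sigma^{-1} r \right) &= \left(\partial_{\theta_k} r\right)^T \Sigma^{-1} r + r^T \Sigma^{-1} \left(\partial_{\theta_k} r\right) \\ &= 2\, r^T \Sigma^{-1} \left(\partial_{\theta_k} r\right) \\ &= -2\, (Y - HX)^T \Sigma^{-1} \left(\partial_{\theta_k} H\right) X \\ &= -2\, Y^T \Sigma^{-1} \left(\partial_{\theta_k} H\right) X + 2\, X^T H^T \Sigma^{-1} \left(\partial_{\theta_k} H\right) X \end{aligned} \end{equation}
In trace form the two pieces read $-2\,{\bf trace}\left( (\partial_{\theta_k} H)\, X Y^T \Sigma^{-1} \right)$ and $2\,{\bf trace}\left( (\partial_{\theta_k} H)^T \Sigma^{-1} H X X^T \right)$. It is therefore worth checking the third term of the result in the question for a transpose on $\partial H/\partial\theta_k$: without it, the product is not even dimensionally consistent unless $H$ is square.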
By the way, you do not need to "trace" so many times; the second line of your definition of $\mathcal{J}$ is an ordinary scalar quadratic form that contains no trace at all, and it's the most natural form.