I'm reading a statistical paper by Breslow and Clayton (1993) and there is a section:
..., we use in practice the REML version (Patterson and Thompson 1971): $ql_1(\hat{\boldsymbol{\alpha}}(\theta),\theta)=-\frac{1}{2}\log|\mathbf{V}|-\frac{1}{2}\log|\mathbf{X}^t\mathbf{V}^{-1}\mathbf{X}|-\frac{1}{2}(\mathbf{Y}-\mathbf{X}\hat{\boldsymbol{\alpha}})^t\mathbf{V}^{-1}(\mathbf{Y}-\mathbf{X}\hat{\boldsymbol{\alpha}})$ ---- (13)
..., and differentiate (13) with respect to the components of $\theta$ (note that $\theta$ is a vector) to obtain estimating equations for the variance parameters:
$\frac{1}{2}\left[(\mathbf{Y}-\mathbf{X}\hat{\boldsymbol{\alpha}})^t\mathbf{V}^{-1}\frac{\partial\mathbf{V}}{\partial\theta_j}\mathbf{V}^{-1}(\mathbf{Y}-\mathbf{X}\hat{\boldsymbol{\alpha}})-\operatorname{tr}\!\left(\mathbf{P}\frac{\partial\mathbf{V}}{\partial\theta_j}\right)\right]=0$.
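For context, here is how I currently understand the step from (13) to this score equation (please correct me if I'm wrong). Differentiating (13) term by term seems to use two standard matrix-calculus identities:

$$\frac{\partial}{\partial\theta_j}\log|\mathbf{V}| = \operatorname{tr}\!\left(\mathbf{V}^{-1}\frac{\partial\mathbf{V}}{\partial\theta_j}\right), \qquad \frac{\partial \mathbf{V}^{-1}}{\partial\theta_j} = -\mathbf{V}^{-1}\frac{\partial\mathbf{V}}{\partial\theta_j}\mathbf{V}^{-1},$$

with the derivatives of the first two terms of (13) combining into the trace term via $\mathbf{P} = \mathbf{V}^{-1} - \mathbf{V}^{-1}\mathbf{X}(\mathbf{X}^t\mathbf{V}^{-1}\mathbf{X})^{-1}\mathbf{X}^t\mathbf{V}^{-1}$. What I'm stuck on is the remaining piece, $\frac{\partial\mathbf{V}}{\partial\theta_j}$ itself.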
I don't have a very strong math background, so I would really appreciate it if anyone could explain how to derive $\frac{\partial\mathbf{V}}{\partial\theta_j}$. Here $\mathbf{V}$ is an $n \times n$ matrix and $\theta$ is a $g \times 1$ vector.
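To show what I've tried so far: my guess is that in this paper's GLMM setup the marginal covariance has the form $\mathbf{V}(\theta) = \mathbf{W}^{-1} + \sum_j \theta_j \mathbf{Z}_j\mathbf{Z}_j^t$ (independent variance components), in which case $\theta_j$ enters linearly and $\frac{\partial\mathbf{V}}{\partial\theta_j} = \mathbf{Z}_j\mathbf{Z}_j^t$. Here is a small numerical sketch checking that guess against a finite difference; all the matrices are made-up toy data, not from the paper:

```python
import numpy as np

# Toy example, assuming V(theta) = W^{-1} + theta_1 Z1 Z1' + theta_2 Z2 Z2'
# (my reading of a variance-components GLMM; not taken from the paper itself).
rng = np.random.default_rng(0)
n, q1, q2 = 5, 2, 3
W_inv = np.eye(n)                    # inverse working-weight matrix (toy: identity)
Z1 = rng.standard_normal((n, q1))    # design matrix of random effect 1
Z2 = rng.standard_normal((n, q2))    # design matrix of random effect 2

def V(theta):
    """Marginal covariance V(theta) = W^{-1} + theta_1 Z1 Z1' + theta_2 Z2 Z2'."""
    return W_inv + theta[0] * Z1 @ Z1.T + theta[1] * Z2 @ Z2.T

# Analytic derivative w.r.t. theta_1: since theta_1 enters linearly,
# dV/dtheta_1 is just Z1 Z1', a constant n x n matrix.
dV_dtheta1 = Z1 @ Z1.T

# Finite-difference check at an arbitrary point theta
theta = np.array([0.7, 1.3])
eps = 1e-6
fd = (V(theta + np.array([eps, 0.0])) - V(theta)) / eps
print(np.max(np.abs(fd - dV_dtheta1)))
```

Because $\mathbf{V}$ is linear in $\theta_j$ under this assumption, the finite difference agrees with $\mathbf{Z}_1\mathbf{Z}_1^t$ up to floating-point rounding. Is this the right reading of the paper, or does $\theta$ enter $\mathbf{V}$ in some more complicated way?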
If anyone would like a bit more context, the title of the paper is "Approximate Inference in Generalized Linear Mixed Models". Thanks.