I am trying to find the critical points of an optimization problem by taking the derivatives of a complicated function. One of the terms in the objective function requires differentiating a function of a partitioned matrix with respect to one of its sub-matrices. The problem can be cast as follows:
Suppose $X$ is partitioned as $$ X = \begin{bmatrix} x_1 & X_2^\top \end{bmatrix} $$
where $x_1$ is an $m \times 1$ vector. How do I find the derivative of $$ \text{tr}(A X) $$ with respect to $X_2$?
$ \def\TR{\operatorname{trace}} \def\m#1{\left[\begin{array}{c|c}#1\end{array}\right]} \TR(AX)$ is best thought of not as a function, but as a Frobenius product $(:)$ or, equivalently, as an elementwise sum $$f \;=\; \TR(AX) \;=\; (A^T:X) \;\doteq\;\sum_j\sum_k A_{jk}^T\,X_{jk}$$ where $A_{jk}^T$ denotes the $(j,k)$ entry of $A^T$.

You can partition $X$ any way you wish, as long as you partition $A^T$ the same way, e.g. $$\eqalign{ X &= \m{ x_1 & x_3 & \ldots \\\hline x_2 & x_4 & \ldots \\ } \qquad A^T = \m{ a_1^T & a_3^T & \ldots \\\hline a_2^T & a_4^T & \ldots \\ } \\\\ f &= \sum_i\:(a_i^T:x_i) \;=\; \sum_i\:\TR(a_ix_i) \\ }$$

Since $f$ is linear in each block, the gradient with respect to any single partition is the corresponding partition of $A^T$ $$ \frac{\partial f}{\partial x_i}=a_i^T \qquad\qquad \frac{\partial f}{\partial X_{jk}}=A_{jk}^T $$
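As a sanity check, the block-gradient rule can be compared against finite differences. The sketch below applies it to the partition from the question, $X = [\,x_1 \;\; X_2^\top\,]$. The sizes `m` and `n`, the random test data, and the helper names `trace_objective` and `numerical_gradient` are all assumptions made for illustration. Under that layout the block of $A^T$ matching $X_2^\top$ is `A.T[:, 1:]`, so the gradient with respect to $X_2$ itself should be its transpose, `A[1:, :]`.

```python
import numpy as np

def trace_objective(A, X):
    """f(X) = trace(A X), computed as the Frobenius product A^T : X."""
    return np.sum(A.T * X)

def numerical_gradient(f, X, eps=1e-6):
    """Central finite-difference gradient of a scalar f with respect to each entry of X."""
    G = np.zeros_like(X)
    for idx in np.ndindex(*X.shape):
        E = np.zeros_like(X)
        E[idx] = eps
        G[idx] = (f(X + E) - f(X - E)) / (2 * eps)
    return G

rng = np.random.default_rng(0)
m, n = 4, 3                           # hypothetical sizes: X is m x n, A is n x m
A = rng.standard_normal((n, m))
x1 = rng.standard_normal((m, 1))      # first column of X
X2 = rng.standard_normal((n - 1, m))  # X2^T fills the remaining columns of X

def f_of_X2(X2_var):
    # Rebuild X = [x1 | X2^T] and evaluate the objective.
    X = np.hstack([x1, X2_var.T])
    return trace_objective(A, X)

grad_X2_numeric = numerical_gradient(f_of_X2, X2)
grad_X2_analytic = A[1:, :]           # transpose of the block A^T[:, 1:]

gradient_matches = np.allclose(grad_X2_numeric, grad_X2_analytic, atol=1e-6)
print(gradient_matches)
```

Because $f$ is linear in $X$, the finite-difference estimate should agree with the analytic block up to rounding error, whatever step size is chosen.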