Derivative of a vector with respect to a matrix with quadratic term

I have the following function $u : \mathbb{R}^{n\times m} \to \mathbb{R}^{n}$ given by $$u(Q) = Q(a-Q^{\top} \mathbb{1}) = Qa-Q Q^{\top} \mathbb{1} $$

where $Q \in \mathbb{R}^{n \times m}, a \in \mathbb{R}^{m}$ and $\mathbb{1}$ is a vector of all ones with dimension $n$.

I am interested in computing $$\frac{\partial u}{\partial Q}$$ but I am not sure how to proceed. I know that the linear term should give $\frac{\partial Qa}{\partial Q} = a \otimes I_n$, but what about the quadratic term?

Best answer

Let $v : \mathbb R^{n \times m} \to \mathbb R^n$ be the quadratic term $v(Q) = QQ^{\intercal}1$, where $1$ denotes the all-ones vector in $\mathbb R^n$. For a perturbation $H \in \mathbb R^{n \times m}$ we compute the difference: $$ v(Q + H) - v(Q) = (Q + H)(Q + H)^{\intercal}1 - QQ^{\intercal}1 = (QH^{\intercal}1 + HQ^{\intercal}1) + HH^{\intercal}1 $$

The bracketed part is linear in $H$, while the remainder $HH^{\intercal}1$ is quadratic in $H$ and therefore $o(\|H\|)$ as $H \to 0$. Hence the derivative of $v$ at $Q$ is the linear transformation $$ Dv_Q(H) = QH^{\intercal}1 + HQ^{\intercal}1 $$

Applying the identity $ \text{vec}(XYZ) = (Z^{\intercal} \otimes X)\text{vec}(Y) $ to both summands (with $X = Q$, $Y = H^\intercal$, $Z = 1$ for the first and $X = I_n$, $Y = H$, $Z = Q^\intercal 1$ for the second), and noting that $Dv_Q(H)$ is already a vector, we see that \begin{align*} Dv_Q(H) &= (1^\intercal \otimes Q)\text{vec}(H^\intercal) + (1^\intercal Q \otimes I_n)\text{vec}(H) \\ &= \Big[(1^\intercal \otimes Q)K^{(n,m)} + (1^\intercal Q \otimes I_n)\Big]\text{vec}(H) \end{align*} where $K^{(n, m)}$ is the commutation matrix that transforms $\text{vec}(H)$ to $\text{vec}(H^\intercal)$.
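As a sanity check, the vectorized form of $Dv_Q(H)$ can be compared numerically against the direct expression $QH^{\intercal}1 + HQ^{\intercal}1$. The following is only a sketch in NumPy: the helper names `vec` and `commutation_matrix`, the sizes $n = 4$, $m = 3$, and the random test matrices are illustrative choices, not part of the derivation. It assumes the column-stacking convention for $\text{vec}$.

```python
import numpy as np

def vec(A):
    # Column-stacking vec operator (Fortran order), assumed throughout.
    return A.reshape(-1, order="F")

def commutation_matrix(n, m):
    # K such that K @ vec(H) == vec(H.T) for any n-by-m matrix H.
    K = np.zeros((n * m, n * m))
    for i in range(n):
        for j in range(m):
            # H[i, j] sits at index i + j*n in vec(H)
            # and at index j + i*m in vec(H.T).
            K[j + i * m, i + j * n] = 1.0
    return K

rng = np.random.default_rng(0)
n, m = 4, 3                      # arbitrary illustrative sizes
Q = rng.standard_normal((n, m))
H = rng.standard_normal((n, m))
ones = np.ones(n)

K = commutation_matrix(n, m)

# Matrix representation of Dv_Q acting on vec(H); shape (n, n*m).
J_v = np.kron(ones[None, :], Q) @ K + np.kron((ones @ Q)[None, :], np.eye(n))

# Direct evaluation of the linear map Dv_Q(H).
directional = Q @ H.T @ ones + H @ Q.T @ ones

# Remainder of the expansion, which should equal H H^T 1.
remainder = (Q + H) @ (Q + H).T @ ones - Q @ Q.T @ ones - directional

print(np.allclose(K @ vec(H), vec(H.T)))
print(np.allclose(J_v @ vec(H), directional))
print(np.allclose(remainder, H @ H.T @ ones))
```

All three comparisons should agree up to floating-point rounding if the commutation matrix and the Kronecker factors are ordered as in the formulas above.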

Thus through the isomorphism $\text{vec} : \mathbb R^{n \times m} \cong \mathbb R^{nm}$ you can think of $\frac{dv}{dQ}$ as the $n \times nm$ matrix $$ (1^\intercal \otimes Q)K^{(n,m)} + (1^\intercal Q \otimes I_n) $$

The linear term is handled by the same identity: $\text{vec}(Ha) = (a^\intercal \otimes I_n)\text{vec}(H)$. So as a whole you can think of the derivative of the original function $\frac{du}{dQ}$ as $$ (a^\intercal \otimes I_n) - (1^\intercal \otimes Q)K^{(n,m)} - (1^\intercal Q \otimes I_n) $$

If you want to be in concordance with the post that motivated this question in the first place, you can simply transpose the above; the linear term then becomes $a \otimes I_n$, as in the question.
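The full Jacobian can be checked against central finite differences. This is a minimal sketch, assuming the column-stacking $\text{vec}$ convention; the function names `u`, `jacobian_u`, `numerical_jacobian`, the step size `eps`, and the test sizes are hypothetical choices made for illustration. Since $u$ is quadratic in $Q$, central differences are exact up to rounding error.

```python
import numpy as np

def u(Q, a):
    # u(Q) = Q (a - Q^T 1), with 1 the all-ones vector of length n.
    ones = np.ones(Q.shape[0])
    return Q @ (a - Q.T @ ones)

def commutation_matrix(n, m):
    # K such that K @ vec(H) == vec(H.T) for n-by-m H (column-stacking vec).
    K = np.zeros((n * m, n * m))
    for i in range(n):
        for j in range(m):
            K[j + i * m, i + j * n] = 1.0
    return K

def jacobian_u(Q, a):
    # (a^T kron I_n) - (1^T kron Q) K - (1^T Q kron I_n), shape (n, n*m).
    n, m = Q.shape
    ones_row = np.ones((1, n))
    K = commutation_matrix(n, m)
    return (np.kron(a[None, :], np.eye(n))
            - np.kron(ones_row, Q) @ K
            - np.kron(ones_row @ Q, np.eye(n)))

def numerical_jacobian(Q, a, eps=1e-6):
    # Central differences along each coordinate of vec(Q).
    n, m = Q.shape
    J = np.zeros((n, n * m))
    for k in range(n * m):
        e = np.zeros(n * m)
        e[k] = eps
        E = e.reshape((n, m), order="F")  # inverse of column-stacking vec
        J[:, k] = (u(Q + E, a) - u(Q - E, a)) / (2 * eps)
    return J

rng = np.random.default_rng(1)
n, m = 5, 3                      # arbitrary illustrative sizes
Q = rng.standard_normal((n, m))
a = rng.standard_normal(m)

J_exact = jacobian_u(Q, a)
J_num = numerical_jacobian(Q, a)
print(np.max(np.abs(J_exact - J_num)))
```

The printed maximum deviation should be small, on the order of rounding error. Transposing `J_exact` gives the $nm \times n$ layout mentioned at the end of the answer.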