How to differentiate the product of vectors and matrices?


Suppose I have $t$, an $m \times n$ matrix of constants and $w$, an $n \times 1$ column vector.

I want to differentiate $A$, the function of $w$ defined as $$ A(w) = w^Tt^Ttw. $$ I wish to use the product rule on the two quantities $w^Tt^T$ and $tw$ so that $$ \frac{\partial A}{\partial w} = \frac{\partial (w^Tt^T)}{\partial w}tw + w^Tt^T\frac{\partial (tw)}{\partial w} =ttw+w^Tt^Tt. $$ However the first term, $ttw$, is not dimensionally consistent. What is the issue here?


3 Answers

BEST ANSWER

I believe the issue is that, when working with matrix calculus:

$$ \frac{\partial (My)}{\partial x} \ne \bigg(\frac{\partial M}{\partial x}\bigg)y + M\bigg(\frac{\partial y}{\partial x}\bigg)$$

If you work with differentials, however, the product rule reads:

$$d(My) = \big(dM\big)\,y + M\,\big(dy\big) $$

Thus, introducing the Frobenius inner product as:

$$ A:B = \operatorname{tr}(A^TB)$$

with the following properties derived from the underlying trace function:

$$\eqalign{A:BC &= B^TA:C\cr &= AC^T:B\cr &= A^T:(BC)^T\cr &= BC:A \cr } $$
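As a quick numerical sanity check (a NumPy sketch, not part of the original answer), each of these trace identities can be verified on random conformable matrices:

```python
import numpy as np

rng = np.random.default_rng(0)

def frob(A, B):
    # Frobenius inner product  A:B = tr(A^T B)
    return np.trace(A.T @ B)

# Random conformable matrices: A is m x p, B is m x n, C is n x p
m, n, p = 4, 5, 3
A = rng.standard_normal((m, p))
B = rng.standard_normal((m, n))
C = rng.standard_normal((n, p))

lhs = frob(A, B @ C)
assert np.isclose(lhs, frob(B.T @ A, C))      # A:BC = B^T A : C
assert np.isclose(lhs, frob(A @ C.T, B))      # A:BC = A C^T : B
assert np.isclose(lhs, frob(A.T, (B @ C).T))  # A:BC = A^T : (BC)^T
assert np.isclose(lhs, frob(B @ C, A))        # A:BC = BC : A
```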

You can work as you usually do with vectors. Writing $T$ for your matrix $t$, your problem becomes:

$$\eqalign{ f&= w^T T^T T w\\ &= Tw : Tw\\ df &= Tdw : Tw + Tw: Tdw\\ &= Tw : Tdw + Tw : Tdw\\ &= 2(Tw) : Tdw\\ &= 2(T^TTw):dw }$$

Since

$$df = \left(\frac{\partial f}{\partial w}\right):dw$$

You can identify: $$\frac{\partial f}{\partial w}= 2(T^TTw)$$
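To double-check this result (a small NumPy sketch, assuming a random $T$ and $w$), compare the claimed gradient $2T^TTw$ against central finite differences:

```python
import numpy as np

rng = np.random.default_rng(1)
m, n = 5, 3
T = rng.standard_normal((m, n))
w = rng.standard_normal(n)

f = lambda w: w @ T.T @ T @ w   # f(w) = w^T T^T T w
grad = 2 * T.T @ T @ w          # gradient derived above

# Central finite differences along each coordinate direction
eps = 1e-6
num = np.array([(f(w + eps * e) - f(w - eps * e)) / (2 * eps)
                for e in np.eye(n)])
assert np.allclose(grad, num, atol=1e-5)
```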


Use coordinates! Let $t\in\Bbb R^{m\times n}$ and $A\colon\Bbb R^{n}\to \Bbb R$ be defined as $$ A(w) = w^Tt^T\,tw=\sum_{i=1}^m (tw)_i^2=\sum_{i=1}^m\Big(\sum_{j=1}^n t_{i,j}w_j\Big)^2\qquad \forall w\in \Bbb R^n. $$ It follows that for every $k = 1,\ldots,n$ and $w\in \Bbb R^n$ it holds that $$\frac{\partial A(w)}{\partial w_{k}}=2\sum_{i=1}^m\Big(\sum_{j=1}^n t_{i,j}w_j\Big)\frac{\partial}{\partial w_{k}} \Big(\sum_{j=1}^n t_{i,j}w_j\Big)=2\sum_{i=1}^m\Big(\sum_{j=1}^n t_{i,j}w_j\Big)t_{i,k}=2\sum_{j=1}^n\Big(\sum_{i=1}^m t^\top_{k,i}t_{i,j}\Big)w_j=(2t^\top t\,w)_k.$$
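The coordinate formula can also be checked numerically (a sketch using NumPy with random data, not part of the original answer): evaluate the double sum $2\sum_i(\sum_j t_{i,j}w_j)t_{i,k}$ directly and compare it with $(2t^\top t\,w)_k$:

```python
import numpy as np

rng = np.random.default_rng(2)
m, n = 4, 3
t = rng.standard_normal((m, n))
w = rng.standard_normal(n)

# Coordinate formula: dA/dw_k = 2 * sum_i (sum_j t_ij w_j) * t_ik
partials = np.array([2 * sum((t[i] @ w) * t[i, k] for i in range(m))
                     for k in range(n)])

# Matrix form: gradient = 2 t^T t w
assert np.allclose(partials, 2 * t.T @ t @ w)
```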


This phenomenon means that these two $m\times n$ matrices are merged together from the start!

Since $A(w)=w^Tt^Ttw$, you can assume a Dirac symbol satisfying the third boundary condition of the Green's function; for example, pick $\big[\alpha{G}\frac{\partial{u}}{\partial{n}}+\beta{Gu}\big]_{\Sigma}=G\varphi$, then write its solution as $u(r)=\iiint_{T}G(r,r_0)f(r_0)\,dV_0+\iint{\varphi(r_0)\frac{\partial{G}(r,r_0)}{\partial{n_0}}}\,dS_0.$

You can then define a Dirac operator $\Delta{G}=\delta(r-r_0)$ to exchange the source point and the field point, $r$ and $r_0$.

Combining the above, I suggest viewing these two $m\times n$ matrices as a single whole, via the shift between them through $r$ and $r_0$; that is also the reason why your first term's derivative has no obvious difference from your second derivative. Thank you!