The Derivative of a Vector


It is stated in this video that some books attempt to define the derivative of a vector, with the added caveat that tensor notation is a much better approach. Presumably, this is a different concept from the derivative of a vector-valued function, since the latter is uncontroversial and ubiquitous. What is the process such books use to take the derivative with respect to a vector, and with respect to a transposed vector?

On BEST ANSWER

The linked video appears to be a discussion about minimizing $$f(x) = \tfrac{1}{2}x^TAx - x^Tb$$ Here's how one might solve it using matrix notation.
First find the differential and gradient of the function (assuming $A$ is symmetric, so that $dx^TAx = x^TA\,dx$): $$\eqalign{ f &= \tfrac{1}{2}x^TAx - b^Tx \cr df &= \tfrac{1}{2}(dx^TAx+x^TA\,dx) - b^Tdx \cr &= (x^TA -b^T)\,dx \cr &= (Ax-b)^T\,dx \cr \frac{\partial f}{\partial x} &= Ax - b \cr }$$ Then set the gradient to zero and solve: $$Ax-b=0 \implies x=A^{-1}b$$ So it's not really as bad as the lecturer makes it out to be.
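As a sanity check, one can verify the closed-form gradient $Ax-b$ against a finite-difference approximation, and confirm that setting it to zero recovers $x = A^{-1}b$. A minimal NumPy sketch (the particular $A$ and $b$ here are just illustrative random values, with $A$ built symmetric positive definite):

```python
import numpy as np

# Illustrative symmetric positive-definite A and vector b.
rng = np.random.default_rng(0)
M = rng.standard_normal((3, 3))
A = M @ M.T + 3 * np.eye(3)
b = rng.standard_normal(3)

def f(x):
    return 0.5 * x @ A @ x - b @ x

def grad(x):
    return A @ x - b  # the closed-form gradient derived above

# Central finite differences agree with the closed-form gradient.
x = rng.standard_normal(3)
eps = 1e-6
fd = np.array([(f(x + eps * e) - f(x - eps * e)) / (2 * eps)
               for e in np.eye(3)])
assert np.allclose(fd, grad(x), atol=1e-5)

# Setting the gradient to zero gives the minimizer x = A^{-1} b.
x_star = np.linalg.solve(A, b)
assert np.allclose(grad(x_star), np.zeros(3))
```

Since $f$ is quadratic, the central difference is exact up to floating-point error, so the tolerance can be tight.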

The same calculation in index/tensor notation looks like $$\eqalign{ f &= \tfrac{1}{2}x_iA_{ij}x_j - b_ix_i \cr \frac{\partial f}{\partial x_k} &= \tfrac{1}{2}(\delta_{ik}A_{ij}x_j+x_iA_{ij}\delta_{jk}) - b_i\delta_{ik} \cr &= \tfrac{1}{2}(A_{kj}x_j+x_iA_{ik}) - b_k \cr &= \tfrac{1}{2}(A_{kj}x_j+A_{ki}^Tx_i) - b_k \cr &= A_{kj}x_j - b_k \cr }$$ where the final step again uses the symmetry of $A$. For dealing with scalars, vectors, and matrices (as in this problem) the two notations are roughly equivalent. The real power of index/tensor notation is in dealing with higher-order tensors.
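The index-notation derivation can be transcribed almost literally with `np.einsum`, where each subscript string mirrors the indices above. A small sketch (again with illustrative random values and a symmetric $A$, which the last simplification step requires):

```python
import numpy as np

rng = np.random.default_rng(1)
M = rng.standard_normal((4, 4))
A = M @ M.T  # symmetric, as the final step of the derivation assumes
b = rng.standard_normal(4)
x = rng.standard_normal(4)

# f = (1/2) x_i A_ij x_j - b_i x_i
f = 0.5 * np.einsum('i,ij,j->', x, A, x) - np.einsum('i,i->', b, x)

# df/dx_k = (1/2)(A_kj x_j + x_i A_ik) - b_k,
# which reduces to A_kj x_j - b_k when A is symmetric.
g = 0.5 * (np.einsum('kj,j->k', A, x)
           + np.einsum('i,ik->k', x, A)) - b
assert np.allclose(g, A @ x - b)
```

Each `einsum` subscript string is a direct reading of the corresponding term in the derivation, which is part of why index notation translates so cleanly to code.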