Jacobian and chain rule


I am trying to work out the derivative of a scalar ($J$) with respect to a vector ($x$) via the chain rule. I have laid out the intermediate steps below, and I find that the matrix shapes only check out when I treat $x$ as a row vector. The details are labeled below.

Red is the shape of a vector, and blue is the shape of the derivative matrix.

enter image description here

I think I still don't understand the Jacobian, or the derivative of a vector w.r.t. a vector, well enough. Please correct me.

Answer: the RHS of the second equation should be transposed, since the LHS has been.


There are 2 best solutions below

2
On BEST ANSWER

The right-hand sides of your two expressions are identical, but the left-hand side of the second one has been transposed. When you transpose a vector $$ z=Ax $$ the result is $$ z^T=(Ax)^T=x^TA^T. $$ The right-hand side must change correspondingly.

P.S. The first line is more natural and easier to understand. The second line is a holdover from vector analysis, where the gradient is required to be a column vector rather than a row.
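To make the shapes concrete, here is a sketch for a generic two-stage chain. This setup is an assumption for illustration and is not necessarily the chain in the question's image: $x\in\mathbb{R}^n$ and $z=g(x)\in\mathbb{R}^m$ are column vectors, and $J=f(z)$ is a scalar. With derivatives laid out as Jacobians (one row per output), the chain rule reads

$$ \underbrace{\frac{\partial J}{\partial x}}_{1\times n} = \underbrace{\frac{\partial J}{\partial z}}_{1\times m}\,\underbrace{\frac{\partial z}{\partial x}}_{m\times n}. $$

Transposing both sides gives the column-gradient form, with the factors in reverse order:

$$ \underbrace{\nabla_x J}_{n\times 1} = \left(\frac{\partial J}{\partial z}\,\frac{\partial z}{\partial x}\right)^T = \underbrace{\left(\frac{\partial z}{\partial x}\right)^T}_{n\times m}\,\underbrace{\nabla_z J}_{m\times 1}. $$

In both lines the inner dimension $m$ matches, which is the shape check the question was after.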

0
On

I just uploaded a version that illustrates the right way to look at the chain rule, with the proper transpose operations. Convention: $x$ is a column vector and $x^T$ is the corresponding row vector.

enter image description here
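If it helps, the column-vector convention can be checked numerically. The following is a minimal sketch under assumptions of my own, not the functions from the image: a hypothetical linear map $z = Ax$ followed by $J = \tfrac12\lVert z\rVert^2$, so that $\nabla_z J = z$ and the transposed chain rule predicts $\nabla_x J = A^T z$. The helper names (`matvec`, `transpose`, `grad_J`, `numeric_grad`) are made up for this example.

```python
# Hypothetical example: z = A x, J = 0.5 * ||z||^2.
# Vectors are plain lists; matrices are lists of rows.

def matvec(A, x):
    """Return the matrix-vector product A x."""
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]

def transpose(A):
    """Return A^T as a list of rows."""
    return [list(col) for col in zip(*A)]

def J(A, x):
    """Scalar objective J = 0.5 * sum(z_i^2) with z = A x."""
    z = matvec(A, x)
    return 0.5 * sum(zi * zi for zi in z)

def grad_J(A, x):
    """Column-gradient chain rule: grad_x J = (dz/dx)^T grad_z J = A^T z."""
    z = matvec(A, x)                 # grad_z J = z, shape m x 1
    return matvec(transpose(A), z)   # (n x m) times (m x 1) gives n x 1

def numeric_grad(A, x, h=1e-6):
    """Central finite differences, perturbing one coordinate of x at a time."""
    g = []
    for i in range(len(x)):
        xp, xm = list(x), list(x)
        xp[i] += h
        xm[i] -= h
        g.append((J(A, xp) - J(A, xm)) / (2 * h))
    return g

A = [[1.0, 2.0, 0.0],
     [0.0, -1.0, 3.0]]   # m = 2 outputs, n = 3 inputs
x = [0.5, -1.0, 2.0]

analytic = grad_J(A, x)
numeric = numeric_grad(A, x)
print("A^T z         :", analytic)
print("finite diff   :", numeric)
```

The two printed vectors should agree up to rounding. Multiplying in the other order, $z^T A$, gives the same numbers arranged as a $1\times n$ row, which is the Jacobian form of the same derivative.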