I have an expression
$\mathbf{x_n} y_n - \mathbf{x_n}\sigma(\mathbf{x_n^Tw})$
where $\mathbf{x_n}, \mathbf{w}$ are vectors, $y_n$ is a scalar and $\sigma(\mathbf{x_n^Tw}) = \frac{1}{1 + e^{-\mathbf{x_n^Tw}}}$
What is its derivative wrt $\mathbf{w}$?
$\frac{\partial}{\partial \mathbf{w}} (\mathbf{x_n} y_n - \mathbf{x_n}\sigma(\mathbf{x_n^Tw}))$
The solution states
$ \mathbf{x_n}\sigma(\mathbf{x_n^Tw})(1-\sigma(\mathbf{x_n^Tw}))\mathbf{x_n^T}$
I know that
$\frac{d\sigma(t)}{dt} = \sigma(t)(1-\sigma(t))$ for a scalar argument $t$ (here $t = \mathbf{x_n^Tw}$)
and that the derivative of a vector wrt a vector is a matrix, but how do I know where to place $\mathbf{x_n^T}$? This is a simple problem, but what if I have a very long expression? I need a rule for where $\mathbf{x_n^T}$ goes.
The subscripts are just visual clutter, so let's drop them and define the variables $$\eqalign{ x &= x_n,\,\,\,\lambda = y_n \cr \beta &= x^Tw &\implies d\beta = x^T\,dw \cr \sigma &= \sigma(\beta) &\implies d\sigma = \sigma(1-\sigma)\,d\beta \cr }$$ Here I'm using the convention that uppercase Latin letters are matrices, lowercase Latin letters are vectors, and Greek letters are scalars.
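Now the function of interest is $$\eqalign{ z &= x\,(\lambda - \sigma) }$$ Let's find its differential and Jacobian $$\eqalign{ dz &= -x\,d\sigma \cr &= \sigma(\sigma-1)\,x\,d\beta \cr &= \sigma(\sigma-1)\,xx^T\,dw \cr J = \frac{\partial z}{\partial w} &= \sigma(\sigma-1)\,xx^T \cr }$$ This also answers the placement question: the differential always has the form $dz = J\,dw$, so the Jacobian is whatever ends up multiplying $dw$ from the left. The scalar factors can be moved freely, but the vector $x$ and the row vector $x^T$ must keep their order, and since $x^T$ arrives attached to $dw$ through $d\beta = x^T\,dw$, it lands on the right of $x$. Note that $\sigma(\sigma-1) = -\sigma(1-\sigma)$, so this result is the solution quoted in the question with a minus sign in front; the quoted solution appears to have dropped the sign.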
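If you want to convince yourself of the result $J = \sigma(\sigma-1)\,xx^T$, a quick numerical check is a reasonable sanity test. The sketch below is only an illustration, assuming NumPy and randomly chosen values for $x$, $w$ and $\lambda$; the function names (`sigmoid`, `z`, `jacobian_analytic`, `jacobian_numeric`) are my own and not from the question. It compares the closed form against a central finite-difference Jacobian.

```python
import numpy as np

def sigmoid(beta):
    # Logistic function sigma(beta) = 1 / (1 + exp(-beta))
    return 1.0 / (1.0 + np.exp(-beta))

def z(w, x, lam):
    # z = x * (lambda - sigma(x^T w)); returns a vector the same size as x
    return x * (lam - sigmoid(x @ w))

def jacobian_analytic(w, x):
    # Closed form from the derivation: J = sigma * (sigma - 1) * x x^T
    s = sigmoid(x @ w)
    return s * (s - 1.0) * np.outer(x, x)

def jacobian_numeric(w, x, lam, eps=1e-6):
    # Central finite differences: column j holds dz / dw_j
    J = np.zeros((x.size, w.size))
    for j in range(w.size):
        step = np.zeros(w.size)
        step[j] = eps
        J[:, j] = (z(w + step, x, lam) - z(w - step, x, lam)) / (2.0 * eps)
    return J

# Arbitrary test values (assumptions made for this check only)
rng = np.random.default_rng(0)
x = rng.normal(size=4)
w = rng.normal(size=4)
lam = 1.0

J_a = jacobian_analytic(w, x)
J_n = jacobian_numeric(w, x, lam)
max_abs_diff = np.max(np.abs(J_a - J_n))
print("max |analytic - numeric| =", max_abs_diff)
```

The same check works for longer expressions: derive the differential, read off whatever multiplies $dw$, and compare it against finite differences. If $x^T$ had been placed on the wrong side, the shapes or the values would not match.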