My lecture notes differentiate a Lagrangian function.
They say the derivative of $\mu^T(Ax-b)$ with respect to $x$, where $A$ is a matrix and $\mu, x$ are vectors, is $A^T \mu$, but I don't understand why it isn't $\mu^TA$. We haven't studied any matrix calculus rules or how to differentiate expressions involving matrices, so I just wanted to see if there is a rule or reason for this.
Thank you!
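$\mu^TA$ is just $(A^T\mu)^T$, so the two expressions contain the same entries and differ only in being written as a row or as a column. For your purposes you need to add $n \times 1$ vectors rather than $1\times n$ ones, which is why the notes use the column form $A^T\mu$.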
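
To see where the entries come from, here is a componentwise sketch. The dimensions are my assumption, since the question does not state them: $A$ is $m \times n$, $x$ is $n \times 1$, and $\mu, b$ are $m \times 1$.

$$
\mu^T(Ax-b) = \sum_{i=1}^{m} \mu_i \left( \sum_{j=1}^{n} A_{ij} x_j - b_i \right)
$$

Differentiating with respect to a single component $x_k$ keeps only the terms with $j = k$:

$$
\frac{\partial}{\partial x_k}\, \mu^T(Ax-b) = \sum_{i=1}^{m} \mu_i A_{ik} = \sum_{i=1}^{m} (A^T)_{ki}\, \mu_i = (A^T\mu)_k
$$

Stacking these $n$ partial derivatives as a column gives $A^T\mu$; writing the same numbers as a row gives $\mu^TA$. Which one is called "the derivative" is a layout convention, and the column form has the same shape as $x$.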