Is there a rule, or at least a rule of thumb, for choosing which layout convention (see "Layout conventions" on Wikipedia) to use when dealing with matrix derivatives?
I found a hint in this answer to a related question, but I have struggled to find any well-justified answer.
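
To make the question concrete, this is how I understand the two main conventions. The notation is my own assumption: $x \in \mathbb{R}^n$ and $y \in \mathbb{R}^m$ are column vectors, and $A$ is a constant $m \times n$ matrix.

$$
\text{Numerator layout:}\quad
\frac{\partial y}{\partial x} \in \mathbb{R}^{m \times n},\qquad
\left[\frac{\partial y}{\partial x}\right]_{ij} = \frac{\partial y_i}{\partial x_j},\qquad
\frac{\partial (Ax)}{\partial x} = A
$$

$$
\text{Denominator layout:}\quad
\frac{\partial y}{\partial x} \in \mathbb{R}^{n \times m},\qquad
\left[\frac{\partial y}{\partial x}\right]_{ij} = \frac{\partial y_j}{\partial x_i},\qquad
\frac{\partial (Ax)}{\partial x} = A^{\top}
$$

Both arrangements contain the same partial derivatives and differ only by a transpose, so what I am after is a reason, beyond staying consistent, to prefer one over the other.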