The derivative of a matrix-valued function with respect to a matrix
Chain rule and vector-matrix calculus
The fundamental object we will work with:
$x=[a,b,c]^T$, a $3 \times 1$ vector
Now let's keep in mind two important properties:
$\frac{d}{dx}(x \cdot x^T)=x \otimes I + I \otimes x$
$(A \otimes B)\operatorname{vec}(V)=\operatorname{vec}(BVA^T)$
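The Kronecker–vec identity can be checked numerically. A minimal sketch using NumPy (the matrix names and sizes are arbitrary test choices); note that $\operatorname{vec}$ stacks columns, which corresponds to Fortran (column-major) order in NumPy:

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((3, 3))
B = rng.standard_normal((3, 3))
V = rng.standard_normal((3, 3))

# vec() stacks the columns of a matrix, i.e. column-major (Fortran) order
vec = lambda M: M.reshape(-1, order="F")

# (A kron B) vec(V) == vec(B V A^T)
lhs = np.kron(A, B) @ vec(V)
rhs = vec(B @ V @ A.T)
assert np.allclose(lhs, rhs)
```

Using `order="C"` instead would silently test the row-stacking convention, for which the identity takes the transposed form, so the `order="F"` choice matters.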
Let's take the expression:
$q_1=x \cdot x^T \cdot x$
Let's find the derivatives with respect to each component of the vector $x$ separately. Applying the product rule, we get a series of terms:
$\frac{d}{da}q_1=\frac{d}{da}x \cdot x^T \cdot x + x \cdot \frac{d}{da}x^T \cdot x + x \cdot x^T \cdot \frac{d}{da}x$
$\frac{d}{db}q_1=\frac{d}{db}x \cdot x^T \cdot x + x \cdot \frac{d}{db}x^T \cdot x + x \cdot x^T \cdot \frac{d}{db}x$
$\frac{d}{dc}q_1=\frac{d}{dc}x \cdot x^T \cdot x + x \cdot \frac{d}{dc}x^T \cdot x + x \cdot x^T \cdot \frac{d}{dc}x$
Now let's try to combine these expressions into one vector-matrix form:
$\begin{bmatrix} \frac{d}{da}x \cdot x^T \cdot x \\ \frac{d}{db}x \cdot x^T \cdot x \\ \frac{d}{dc}x \cdot x^T \cdot x \end{bmatrix} \rightarrow x^T \cdot x \cdot I$
$\begin{bmatrix} x \cdot \frac{d}{da}x^T \cdot x \\ x \cdot \frac{d}{db}x^T \cdot x \\ x \cdot \frac{d}{dc}x^T \cdot x \end{bmatrix} \rightarrow x \cdot x^T$
$\begin{bmatrix} x \cdot x^T \cdot \frac{d}{da}x \\ x \cdot x^T \cdot \frac{d}{db}x \\ x \cdot x^T \cdot \frac{d}{dc}x \end{bmatrix} \rightarrow x \cdot x^T$
Thus, the resulting expression will look like:
$\frac{d}{dx}q_1=x^T \cdot x \cdot I + 2 \cdot x \cdot x^T$
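The combined result can be sanity-checked against a central-difference Jacobian. A minimal sketch (the test point is an arbitrary choice for $(a,b,c)$; both the analytic matrix and its finite-difference estimate are symmetric, so the row/column convention does not matter here):

```python
import numpy as np

x = np.array([1.0, -2.0, 0.5])  # arbitrary test point (a, b, c)

def q1(v):
    return v * (v @ v)          # q1 = x (x^T x), a 3-vector

# analytic Jacobian: (x^T x) I + 2 x x^T
J_analytic = (x @ x) * np.eye(3) + 2.0 * np.outer(x, x)

# central-difference Jacobian, column k ~= d(q1)/dx_k
h = 1e-6
J_numeric = np.column_stack([
    (q1(x + h * e) - q1(x - h * e)) / (2 * h)
    for e in np.eye(3)
])
assert np.allclose(J_analytic, J_numeric, atol=1e-6)
```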
My question is this: as long as we work with the individual components of the vector, the dot product in the expression is preserved and we apply the product rule. Is there a logic/rule/algorithm for combining such simple terms into a single vector-matrix operation? When does the transition from the ordinary matrix product to the Kronecker product, etc., occur?
$ \def\d{\delta}\def\o{{\tt1}}\def\p{\partial} \def\LR#1{\left(#1\right)} \def\op#1{\operatorname{#1}} \def\vc#1{\op{vec}\LR{#1}} \def\qiq{\quad\implies\quad} \def\grad#1#2{\frac{\p #1}{\p #2}} \def\c#1{\color{red}{#1}} \def\CLR#1{\c{\LR{#1}}} \def\fracLR#1#2{\LR{\frac{#1}{#2}}} \def\gradLR#1#2{\LR{\grad{#1}{#2}}} $The derivative of a vector $x$ with respect to its $k^{th}$ component is the $k^{th}$ Cartesian basis vector, i.e.
$$\eqalign{ \grad{x}{x_k} &= e_k \\ }$$
Taking the $i^{th}$ component of this equation produces the result in index notation
$$\eqalign{ e_i^T\gradLR{x}{x_k} &= e_i^Te_k \qiq \grad{x_i}{x_k} = \d_{ik} \\ }$$
Applying these results to the matrix $\,xx^T$ yields
$$\eqalign{ \grad{\,(xx^T)}{x_k} &= xe_k^T + e_kx^T \qiq \grad{\,(x_ix_j)}{x_k} = x_i\d_{jk} + \d_{ik}x_j \\ }$$
So the gradient of the matrix $xx^T$ with respect to the vector $x$ is a third-order tensor (which is why it carries three free indices) and cannot be rendered in matrix notation.
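The slice formula $\partial(xx^T)/\partial x_k = xe_k^T + e_kx^T$ can also be verified numerically, slice by slice. A minimal sketch (arbitrary test point; since $xx^T$ is quadratic in $x$, the central difference is exact up to rounding):

```python
import numpy as np

x = np.array([1.0, -2.0, 0.5])  # arbitrary test point
h = 1e-6
I = np.eye(3)

for k in range(3):
    e_k = I[k]
    # finite-difference slice k of the third-order tensor d(x x^T)/dx
    num = (np.outer(x + h * e_k, x + h * e_k)
           - np.outer(x - h * e_k, x - h * e_k)) / (2 * h)
    # analytic slice: x e_k^T + e_k x^T
    ana = np.outer(x, e_k) + np.outer(e_k, x)
    assert np.allclose(num, ana, atol=1e-6)
```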
So your first "important property" is simply not true. The result that you've misquoted is actually $$\eqalign{ \grad{\vc{xx^T}}{x} &= {x\otimes I + I\otimes x} \\ }$$ You may not think the presence of that $\vc{}$ operator is important, but it is.
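The corrected identity, with the $\operatorname{vec}$ operator in place, can be confirmed numerically as well. A minimal sketch comparing $\partial\operatorname{vec}(xx^T)/\partial x$ against $x\otimes I + I\otimes x$ (arbitrary test point; $\operatorname{vec}$ again means column-stacking):

```python
import numpy as np

x = np.array([1.0, -2.0, 0.5])  # arbitrary test point
n = x.size
I = np.eye(n)
vec = lambda M: M.reshape(-1, order="F")  # column-stacking vec

# finite-difference Jacobian of vec(x x^T); column k ~= d vec(x x^T)/dx_k
h = 1e-6
J_num = np.column_stack([
    (vec(np.outer(x + h * e, x + h * e))
     - vec(np.outer(x - h * e, x - h * e))) / (2 * h)
    for e in I
])

# analytic Jacobian: x kron I + I kron x  (an n^2-by-n matrix)
J_ana = np.kron(x.reshape(n, 1), I) + np.kron(I, x.reshape(n, 1))
assert np.allclose(J_num, J_ana, atol=1e-6)
```

Without the `vec`, the left-hand side would be a third-order tensor and the comparison could not even be set up as a matrix equality, which is exactly the point of the answer.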