I'm struggling with the following problem. Suppose $\pmb{x}$ and $\pmb{y}$ are vectors of the same length and $\pmb{y}$ is not a function of $\pmb{x}$. What is the following derivative?
$$ \frac{\partial}{\partial \pmb{x}} (\pmb{y} - \pmb{x}) \otimes (\pmb{y} - \pmb{x}) $$
My thought was to set $\pmb{z} = \pmb{y} - \pmb{x}$ and $\pmb{f} = \pmb{z} \otimes \pmb{z}$, and first differentiate $\pmb{f}$ with respect to $\pmb{z}$:
\begin{align} d\pmb{f} &= ((d\pmb{z}) \otimes \pmb{z}) + (\pmb{z} \otimes (d\pmb{z})) \\ &= (\pmb{I} \otimes \pmb{z})d\pmb{z} + (\pmb{z} \otimes \pmb{I})d\pmb{z} \\ &= ((\pmb{I} \otimes \pmb{z}) + (\pmb{z} \otimes \pmb{I}))d\pmb{z} \\ \frac{\partial \pmb{f}}{\partial \pmb{z}} &= (\pmb{I} \otimes \pmb{z}) + (\pmb{z} \otimes \pmb{I}) \end{align}
and then, since $\partial \pmb{z} / \partial \pmb{x} = -\pmb{I}$, obtain by the chain rule:
$$ \frac{\partial}{\partial \pmb{x}} (\pmb{y} - \pmb{x}) \otimes (\pmb{y} - \pmb{x}) = -\left( (\pmb{I} \otimes (\pmb{y} - \pmb{x})) + ((\pmb{y} - \pmb{x}) \otimes \pmb{I}) \right) $$
This seems sensible. However, it is part of a Hessian I am deriving, and I derived its corresponding transpose element to be:
$$ -2\left(\pmb{I} \otimes (\pmb{y} - \pmb{x})\right) $$
This is very similar, but not the same. Am I missing something obvious?
Let's clear up some definitions first.
If $f: z \mapsto f(z)$ is a matrix-valued function and there is a function $D_f$, linear in $h$, such that $$ f(z+h) = f(z) + D_f(z,h) + o(\|h\|) $$ then $D_f$ is the differential of $f$. If there exists a matrix-valued function $A(z)$ such that $$ D_f(z,h) = A(z) h$$ then $A(z)$ is the derivative of $f$.
(This is sometimes called the first identification theorem; see for instance Magnus and Neudecker, 1999).
In the case at hand, we have $$f(z+h) = (z+h)\otimes (z+h) = \underbrace{z\otimes z}_{f(z)} + \underbrace{(h\otimes z) + (z\otimes h)}_{D_f(z,h)} + \underbrace{h\otimes h}_{o(\|h\|)}$$
So, by the definition $D_f(z,h) = (h\otimes z) + (z\otimes h)$ is the differential of $z\otimes z$. Now we can use the identification theorem to say that, since $$ D_f(z,h) = \big[(I\otimes z) + (z \otimes I)\big]h = A(z) h$$ the matrix $$ A(z) = (I\otimes z) + (z \otimes I)$$ is the derivative of $z \otimes z$.
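As a quick sanity check of this formula, here is the smallest non-trivial case written out by hand, assuming $z = (z_1, z_2)^\top$ and $I$ the $2 \times 2$ identity (the choice $n = 2$ is only for illustration): $$ I \otimes z = \begin{pmatrix} z_1 & 0 \\ z_2 & 0 \\ 0 & z_1 \\ 0 & z_2 \end{pmatrix}, \qquad z \otimes I = \begin{pmatrix} z_1 & 0 \\ 0 & z_1 \\ z_2 & 0 \\ 0 & z_2 \end{pmatrix}, \qquad A(z) = \begin{pmatrix} 2 z_1 & 0 \\ z_2 & z_1 \\ z_2 & z_1 \\ 0 & 2 z_2 \end{pmatrix} $$ The rows of $A(z)$ are exactly the gradients of the four entries of $z \otimes z = (z_1^2,\ z_1 z_2,\ z_2 z_1,\ z_2^2)^\top$, as they should be. Note also that $I \otimes z$ and $z \otimes I$ contain the same rows in a different order, so they are not equal in general.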
Applying this with $z = y - x$, for which $\partial z / \partial x = -I$, the chain rule gives $$ \frac{\partial}{\partial x}(y-x)\otimes (y-x) = -\Big[\big(I\otimes(y-x)\big) + \big((y-x) \otimes I\big)\Big] = \big(I\otimes(x-y)\big) + \big((x-y) \otimes I\big)$$ which is what you found.
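If you want to convince yourself numerically, the result can be compared against a finite-difference Jacobian. The following is only a minimal sketch in plain Python; the helper names (`kron`, `col`, `identity`, `f`, `analytic_jacobian`, `numeric_jacobian`) and the test vectors are my own choices for illustration, not part of the question.

```python
def kron(a, b):
    """Kronecker product of two matrices given as lists of rows."""
    return [[p * q for p in row_a for q in row_b] for row_a in a for row_b in b]


def col(v):
    """Turn a flat list into an n x 1 column matrix."""
    return [[t] for t in v]


def identity(n):
    return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]


def f(x, y):
    """(y - x) kron (y - x), returned as a flat list of length n^2."""
    z = [yi - xi for xi, yi in zip(x, y)]
    return [row[0] for row in kron(col(z), col(z))]


def analytic_jacobian(x, y):
    """(I kron (x - y)) + ((x - y) kron I), an n^2 x n matrix."""
    d = [xi - yi for xi, yi in zip(x, y)]
    eye = identity(len(x))
    left, right = kron(eye, col(d)), kron(col(d), eye)
    return [[l + r for l, r in zip(rl, rr)] for rl, rr in zip(left, right)]


def numeric_jacobian(x, y, eps=1e-6):
    """Central-difference Jacobian of f with respect to x."""
    n = len(x)
    columns = []
    for j in range(n):
        xp = [xi + (eps if i == j else 0.0) for i, xi in enumerate(x)]
        xm = [xi - (eps if i == j else 0.0) for i, xi in enumerate(x)]
        fp, fm = f(xp, y), f(xm, y)
        columns.append([(a - b) / (2 * eps) for a, b in zip(fp, fm)])
    # columns[j] holds d f / d x_j; transpose to get an n^2 x n matrix
    return [[columns[j][i] for j in range(n)] for i in range(n * n)]


if __name__ == "__main__":
    x = [1.0, 2.0, -0.5]   # arbitrary test vectors
    y = [0.3, -1.0, 4.0]
    A = analytic_jacobian(x, y)
    N = numeric_jacobian(x, y)
    err = max(abs(a - b) for ra, rb in zip(A, N) for a, b in zip(ra, rb))
    print("max abs difference:", err)
```

Because $f$ is quadratic in $x$, the central difference is exact up to rounding, so the reported difference should be tiny. The same `numeric_jacobian` idea can be pointed at the other block of your Hessian to see which of the two expressions it actually matches.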
Check the other half of the Hessian: there has to be something wrong there!