I was wondering if there is a general chain rule that can be applied to something like this:
$$\frac{\partial}{\partial \mathbf{x}}\Vert A\mathbf{x} - \mathbf{y} \Vert_{2}^{2}$$
where $$ \mathbf{x} \in \mathbb{R}^n, \mathbf{y} \in \mathbb{R}^m, A \in \mathbb{R}^{m \times n} $$
I tried using the chain rule as I know it from a typical calculus course: $$ \frac{\partial}{\partial \mathbf{x}}f(\mathbf{g}(\mathbf{x})) = \frac{\partial f}{\partial \mathbf{g}}\frac{\partial \mathbf{g}}{\partial \mathbf{x}} $$
Here I took $$ \mathbf{g}(\mathbf{x}) = A\mathbf{x} - \mathbf{y}, \quad f(\mathbf{g}) = \Vert \mathbf{g} \Vert_{2}^{2} $$
But when attempting this, I ended up with $$ 2(A\mathbf{x} - \mathbf{y})A $$
which doesn't dimensionally make sense. I understand that matrix multiplication is order-dependent, but I couldn't figure out a general pattern that would resemble the common chain rule we learn in calculus. Can someone please give me an explanation?
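
To make the dimension problem concrete, here is a small NumPy sketch I put together. The sizes `m = 3`, `n = 2`, the random data, and the helper `f` are my own assumptions for illustration only. It shows that the product from my attempt is not even defined when the vectors are columns, and it computes a finite-difference gradient for comparison. It does not explain the general rule I am asking about.

```python
import numpy as np

rng = np.random.default_rng(0)
m, n = 3, 2                      # arbitrary sizes with m != n (assumption)
A = rng.standard_normal((m, n))
x = rng.standard_normal((n, 1))  # column vector in R^n
y = rng.standard_normal((m, 1))  # column vector in R^m

def f(x):
    """Squared 2-norm of the residual A x - y."""
    r = A @ x - y
    return float(np.sum(r ** 2))

r = A @ x - y                    # residual, shape (m, 1)

# The expression from my attempt: (m, 1) @ (m, n) is not a valid product.
try:
    naive = 2 * r @ A
    naive_ok = True
except ValueError:
    naive_ok = False

# Central finite differences, one coordinate of x at a time.
eps = 1e-6
fd_grad = np.zeros((n, 1))
for i in range(n):
    e = np.zeros((n, 1))
    e[i] = eps
    fd_grad[i] = (f(x + e) - f(x - e)) / (2 * eps)

# For comparison: transposing the residual gives a well-defined (1, n) row.
row_form = 2 * r.T @ A

print("naive product defined:", naive_ok)
print("finite-difference gradient shape:", fd_grad.shape)
print("row form matches numerically:", np.allclose(row_form.T, fd_grad, atol=1e-5))
```

So numerically the derivative has $n$ entries, and the version with the residual transposed agrees with it, but I only found that by trial and error. I would still like to understand the rule that tells me where the transposes go.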