Suppose $\Lambda\subseteq\mathbb R^d$ is open and $u\in H_0^1(\Lambda,\mathbb R^d)$ with $\nabla\cdot u=0$. Then, $$\nabla\cdot(u\otimes u)=\underbrace{(\nabla\cdot u)u}_{=0}+(u\cdot\nabla)u=(u\cdot\nabla)u\;.\tag 1$$ Now, I've read that the left-hand side "makes sense for more singular vector fields" than the right-hand side of $(1)$.
Why?
I don't see that. Both sides contain the first partial derivatives of $u$.
The reason is not exactly because of the quadratic nature of $u \otimes u$, but rather because it is in "divergence form" when written as $\nabla \cdot( u \otimes u)$. In this case we can understand this term in weak formulation by formally integrating by parts. This is equivalent to formulating things in the sense of distributions.
Let's ignore the other terms for the moment and suppose I want to solve $$ \nabla \cdot (u\otimes u)= f $$ for some given smooth $f$. In this form it looks like I need to actually find a differentiable $u$ in order to make sense of the equations since $$ \nabla \cdot (u\otimes u) = (\nabla \cdot u) u + u \cdot \nabla u $$
On the other hand, if we use the observation that if we multiply the equation by $\varphi \in C^\infty_c$ and integrate by parts, then $$ -\int u\otimes u : D \varphi = \int f \cdot \varphi. $$ The weak formulation of the above problem is that this holds for all such $\varphi$. The advantage of this approach is that it makes sense (i.e. the integrals are well-defined) under the very weak assumption that $u \in L^2$. This allows us to bypass any differentiability assumptions and therefore allows "more singular vector fields."