I have been trying to find the right dimensions of some functions in the equation
$$\begin{cases}-\rho U\cdot \nabla u = -\nabla p + \nabla^2u \\ \nabla \cdot u = 0 \end{cases}$$
as given in the Wikipedia article about the Oseen equations.
$u$ and $U$ are two velocities, which makes me think they are both vectors of the same dimension $d$ (the dimension of the space we are working in) and that they depend on the spatial coordinates (and maybe on time as well). So I think $u, U: \mathbb{R}^d\to \Bbb{R}^d$ or $u,U:\Bbb{R}^{d+1}\to\Bbb{R}^d$. $p$ is the pressure, so I would say it is a scalar function that also depends on the spatial coordinates, but not on time, so $p: \Bbb{R}^d \to \Bbb{R}$.
I believe my struggle has to do with the fact that I don't know what the operators mean in the context in which they are being used.
If $u,U$ and $p$ have the dimensions I think they have, then $\nabla \cdot u$ is the only thing that I understand. What does the gradient, $\nabla u$, mean? When $u$ is scalar, $\nabla u$ is a vector. What if $u$ is a vector already? Is $\nabla u$ a matrix?
While trying to understand what is going on I read about dyadics on Wikipedia. It felt like it could make sense here, but then what is the meaning of $U\cdot \nabla u$ if $U$ is a vector and $\nabla u$ is a matrix?
And of course there is $\nabla^2$, which I know is the Laplace operator, and which I am also familiar with when it is applied to a scalar function. If $u$ is a vector, then what is $\nabla^2 u$? Is it a vector as well? I think it should be, in order to be summable with $-\nabla p$...
If anyone could point me to something to read/learn in order to be able to understand this, I would be glad. The problem is, I don't even know where to go looking for answers.
The nabla notation is rather abstract: its purpose is to act as shorthand in long expressions with partial derivatives. Think of $\nabla$ as a vector operator, which has components $$ \vec\nabla = \left(\frac{\partial}{\partial x}, \frac{\partial}{\partial y}, \frac{\partial}{\partial z}\right) $$
It does not matter to what kind of function you apply it, the effect is always $$ (\nabla f)_i = \frac{\partial f}{\partial x_i} $$
The question is what mathematical meaning can be given to the RHS for each particular $f$. If $f$ is a scalar, then each component of $\nabla f$ is a single real function, and together they make a vector (it would be inaccurate to say that each component is a scalar, because they don't transform as scalars under coordinate transformations); this is called the gradient of $f$. If $\bf f$ is a vector, the partial derivatives act on each of its components independently (as if we multiplied the vector $\bf f$ by the "scalar" $\partial/\partial x_i$), so each "component" of $\nabla \bf f$ is itself a triple of real numbers. As long as you don't make any coordinate transformations, their combined structure can be adequately represented by a matrix, but be careful not to mix up row vectors with column vectors: think of the first factor ($\nabla$) as a column and the second ($\bf f$) as a row to get a matrix.
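To make the matrix structure concrete, here is a small sketch using sympy. The field $\bf f$ below is an arbitrary example of mine, not taken from the question, and note that sympy's `jacobian` uses the convention $(\partial f_i/\partial x_j)$, i.e. the transpose of the column-times-row convention described above:

```python
import sympy as sp

x, y, z = sp.symbols('x y z')

# An arbitrary example vector field f(x, y, z)
f = sp.Matrix([x*y, y*z, z*x])

# The "gradient" of a vector field collects all partial derivatives
# into a matrix; sympy's jacobian has entry (i, j) = d f_i / d x_j
grad_f = f.jacobian([x, y, z])
print(grad_f)  # Matrix([[y, x, 0], [0, z, y], [z, 0, x]])
```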
This way of multiplying vectors is referred to as an outer product, to contrast it with the inner (a.k.a. dot) product, which involves summation over indices. In the latter, when the del operator is involved, as in $\nabla \cdot \bf f$, each $\partial/\partial x_i$ acts only on the corresponding $f_i$, and the end result is the sum of all those: a scalar called the divergence of $\bf f$ (in the matrix notation, this time the first factor is the row and the second the column, producing a $1\times 1$ matrix, which can represent a scalar).
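The summation over indices can be spelled out directly; a minimal sympy sketch, with an arbitrary example field of my own choosing:

```python
import sympy as sp

x, y, z = sp.symbols('x y z')
f = sp.Matrix([x*y, y*z, z*x])  # arbitrary example field

# Divergence: each d/dx_i acts only on the matching component f_i,
# and the results are summed into a single scalar
div_f = sum(f[i].diff(v) for i, v in enumerate([x, y, z]))
print(div_f)  # x + y + z
```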
If you combine more than two vectors, some of which may be $\nabla$'s, it is also very important to keep track of what is multiplied with what, and in what way. Just as with ordinary vectors, $\bf (a\cdot b)c$ is very different from $\bf a(b \cdot c)$; the reason they are different is not that these multiplications aren't associative (they are), but that the dot product is not the same kind of product as the product of a scalar with a vector (in fact, the latter is actually more like an outer product), so we can't just switch them around. With two nablas, for example, we can have $$ (\nabla \cdot \nabla) {\bf f} = \left(\frac{\partial^2}{\partial x^2} + \frac{\partial^2}{\partial y^2}+ \frac{\partial^2}{\partial z^2}\right) {\bf f} = \nabla^2 \bf f$$
This is an example of Laplace's operator, which is a scalar operator, acting on a vector - it just acts on each component separately, just as $a{\bf u} = (au_x, au_y, au_z)$ for any scalar $a$. But it is not the same thing as $\nabla (\nabla \cdot \bf f)$, the gradient of the divergence of $\bf f$.
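That these are genuinely different vectors is easy to check symbolically; a sketch with an example field of my own (not from the question):

```python
import sympy as sp

x, y, z = sp.symbols('x y z')
xs = [x, y, z]
f = sp.Matrix([x**2*y, y**2*z, z**2*x])  # arbitrary example field

# Vector Laplacian: the scalar operator nabla^2 applied componentwise
lap_f = sp.Matrix([sum(fi.diff(v, 2) for v in xs) for fi in f])

# Gradient of the divergence: a different vector in general
div_f = sum(f[i].diff(v) for i, v in enumerate(xs))
grad_div_f = sp.Matrix([div_f.diff(v) for v in xs])

print(lap_f)       # Matrix([[2*y], [2*z], [2*x]])
print(grad_div_f)  # Matrix([[2*y + 2*z], [2*x + 2*z], [2*x + 2*y]])
```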
In the case of $\bf U\cdot \nabla \bf u$, the order in which the multiplications are carried out is irrelevant, as long as the correct type of multiplication is performed at each step. But, as anyone who has done large matrix computations will tell you, it is more efficient to work with lower-rank objects, and it is also easier to visualize the intermediate objects that way. So I would argue that the better way to imagine it is $$ ({\bf U} \cdot \nabla) {\bf u} = \left(U_x\frac{\partial}{\partial x} + U_y\frac{\partial}{\partial y}+ U_z\frac{\partial}{\partial z}\right) \bf u $$
The object in the parentheses doesn't have components; it is the scalar (operator) $\bf U \cdot \nabla$. When multiplied with the vector $\bf u$ it produces a vector, and the multiplication proceeds according to the rules for multiplying a vector by a scalar: each component of $\bf u$ gets acted on by that operator independently of the others.
In the matrix notation, this would be written with $\bf U$ as a row vector, $\nabla$ as column, and $\bf u$ as row. The result will be a row vector with the same structure as the row vector $\bf u$.
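As a sanity check, the scalar operator ${\bf U}\cdot\nabla$ applied componentwise can be sketched in sympy (both fields below are arbitrary examples of mine, not from the question):

```python
import sympy as sp

x, y, z = sp.symbols('x y z')
xs = [x, y, z]
U = sp.Matrix([1, 2, 3])        # an arbitrary constant velocity U
u = sp.Matrix([x*y, y*z, z*x])  # an arbitrary velocity field u

# (U . nabla) u: the scalar operator U_x d/dx + U_y d/dy + U_z d/dz
# acts on each component of u independently
conv = sp.Matrix([sum(U[j]*ui.diff(v) for j, v in enumerate(xs)) for ui in u])
print(conv)  # Matrix([[2*x + y], [3*y + 2*z], [3*x + z]])
```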