In physics especially, it's common to see mathematical relations manipulated or derived by separating "operators" from the things they "act on." I can usually follow such derivations when reading along in a book, but it bothers me that I don't really understand how it's "justified" to do that.
A classic example would be:
$$ \nabla = \left\langle\frac{\partial}{\partial x}, \frac{\partial}{\partial y}, \frac{\partial}{\partial z}\right\rangle$$
We can "apply" the operator in different ways, and I can see intuitively how it works: "multiplying" the operator by a "thing" appears to be what "applies" it. This is how we can write:
$$ \nabla \cdot F = \frac{\partial U}{\partial x} + \frac{\partial V}{\partial y} + \frac{\partial W}{\partial z}$$
if
$$ F = U\hat{x} + V\hat{y} + W\hat{z}$$
And similarly with $\nabla \times$ to define the curl.
But what exactly are we allowed to do with operators and what are we not allowed to do? Is there a name for this kind of treatment? What "are" operators and what rules do they obey? For example, it seems obvious that the "square root" of an operator wouldn't make any sense. Furthermore, it seems to be a given that "squaring" a derivative operator turns it into a "second derivative" operator.
How can I learn more about this?
You may consider an operator as a function of functions, i.e., an operator is a function whose arguments are themselves functions. For example, if $f\in C^0([0,1])$, you may define a function $T\colon C^0([0,1])\to \mathbb{R}$ whose arguments are functions via $$T(f):=\int_0^1 f(x)\,dx.$$ So the term “operator” signals that we're dealing with a function of functions, just as we call a function a “map” in a geometrical context.
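If it helps to see "a function of functions" concretely, here is a minimal Python sketch. `T` is a numerical stand-in for the integral above (midpoint rule, an assumption of mine rather than anything canonical), and `D` is a hypothetical finite-difference derivative operator added for illustration: it takes a function and returns a new function. The names `Func`, `D`, `square` and `cube` and the step sizes are all my own choices.

```python
from typing import Callable

Func = Callable[[float], float]


def T(f: Func, n: int = 1000) -> float:
    """Approximate the integral of f over [0, 1] with the midpoint rule."""
    h = 1.0 / n
    return h * sum(f((k + 0.5) * h) for k in range(n))


def D(f: Func, h: float = 1e-3) -> Func:
    """Return a function approximating f' by a central difference."""
    def df(x: float) -> float:
        return (f(x + h) - f(x - h)) / (2 * h)
    return df


def square(x: float) -> float:
    return x * x


def cube(x: float) -> float:
    return x * x * x


# T eats a function and returns a number (close to 1/3 for x^2).
integral_of_square = T(square)

# D eats a function and returns a function (close to 2x for x^2).
slope_of_square_at_3 = D(square)(3.0)

# Applying D twice, i.e. "squaring" the operator, approximates the
# second derivative (close to 6x for x^3).
curvature_of_cube_at_2 = D(D(cube))(2.0)
```

Seen this way, "squaring" a derivative operator is just composing it with itself, `D(D(f))`, which is why it yields a second derivative rather than the square of the first derivative.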