I started reading about monotone operators in Zeidler's book on nonlinear functional analysis:
The operator $A:X \to X^*$ is monotone on the reflexive Banach space $X$ if $\langle Ax - Ay, x-y\rangle \geq 0$ for all $x,y \in X$.
This definition (if I understand it correctly) is supposed to generalize the notion of a monotone function $f: \mathbb{R} \to \mathbb{R}$. He proceeds with an example:
Set $X=\mathbb{R}$ and $F(u)=Au$. Then $X^*= \mathbb{R} $ and: $\langle Au - Av, u-v\rangle =\big(F(u) -F(v) \big)(u-v)$.
But if $F$ is strictly decreasing and we take $v<u$ with $u,v\in \mathbb{R}$, then $u-v >0$ while $F(u) -F(v)<0$, so $\big(F(u) -F(v) \big)(u-v)<0$. What am I missing here? If this were a generalization of a monotone function on $\mathbb{R}$, it should be nonnegative for all $u,v \in \mathbb{R}$, shouldn't it? If someone could also explain where the definition comes from, I'd be very grateful.
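The 1-D computation can be checked numerically. Here is a small sketch (the helper `check_monotone` is just for illustration) confirming the observation above: $\big(F(u)-F(v)\big)(u-v)\ge 0$ holds for an increasing $F$ but fails for a decreasing one.

```python
import random

def check_monotone(F, trials=1000):
    """Test (F(u) - F(v)) * (u - v) >= 0 on random real pairs."""
    return all(
        (F(u) - F(v)) * (u - v) >= 0
        for u, v in ((random.uniform(-10, 10), random.uniform(-10, 10))
                     for _ in range(trials))
    )

print(check_monotone(lambda x: x**3))  # increasing: True
print(check_monotone(lambda x: -x))    # decreasing: False
```

So the definition singles out the increasing case, which is exactly the point the question is raising.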
The term "monotone" is also used for maps between posets, to refer to a map that preserves the partial order (including direction). If you want a direction-reversing map, you'd call it anti-monotone (and the same would be true in functional analysis). Ultimately, it's $\Bbb{R}$ that is the odd one out here!
Ultimately, the "is it increasing or decreasing?" question adds ambiguity to the term. The ambiguity might be justified if we very often needed to refer to a function that is either increasing or decreasing, without specifying which (e.g. in the theorem from real analysis that a continuous real function is strictly monotone if and only if it is injective), but this is not usually the case. Most of the time we are obliged to specify "monotone increasing" or "monotone decreasing", which makes "monotone" inefficient terminology! Personally, I think it's better to reserve "monotone" for monotone increasing.
Let's talk about monotonicity in functional analysis. I'm not certain, but I would guess that monotonicity was originally observed in the subderivatives of convex functions. Recall that, if $f : X \to [-\infty, \infty]$ is a proper convex function, we define the subgradient at a point $x$ in the domain of $f$ (i.e. where $f(x) \in \Bbb{R}$) by $$\partial f(x) = \{x^* \in X^* : \forall y \in X: x^*(y - x) \le f(y) - f(x)\}.$$ This is a monotone map: given $x^* \in \partial f(x)$ and $y^* \in \partial f(y)$, we have both \begin{align*} x^*(y - x) &\le f(y) - f(x) \\ y^*(x - y) &\le f(x) - f(y). \end{align*} Together, these imply $$x^*(y - x) \le f(y) - f(x) \le y^*(y - x) \implies (y^* - x^*)(y - x) \ge 0.$$ This forces the subgradients of real convex functions to be monotone increasing in the order sense, so the operator-theoretic property naturally inherited the same nomenclature.
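For a differentiable convex function the subdifferential collapses to the singleton $\{f'(x)\}$, so the two inequalities above can be verified numerically. A minimal sketch with $f(x) = e^x$ (chosen purely for illustration):

```python
import math
import random

# For differentiable convex f, the subdifferential is {f'(x)}, so
# monotonicity reads (f'(y) - f'(x)) * (y - x) >= 0.
f_prime = lambda x: math.exp(x)  # derivative of the convex f(x) = e^x

for _ in range(1000):
    x, y = random.uniform(-5, 5), random.uniform(-5, 5)
    # subgradient inequality at x:  f'(x)(y - x) <= f(y) - f(x)
    assert math.exp(x) * (y - x) <= math.exp(y) - math.exp(x) + 1e-9
    # resulting monotonicity of the subgradient map
    assert (f_prime(y) - f_prime(x)) * (y - x) >= 0
print("subgradient inequalities hold on all sampled pairs")
```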
Here is an illustration of monotonicity in $\Bbb{R}^2$:
Here, $T : \Bbb{R}^2 \rightrightarrows \Bbb{R}^2$ is a monotone set-valued map. In order to satisfy monotonicity, we need (for all points $u, v \in \Bbb{R}^2$) the images of $u$ and of $v$ to be separated by a hyperplane perpendicular to $v - u$, specifically with $Tv$ on the same side as $v$, and $Tu$ on the same side as $u$ (if we reversed this, we would have $\langle u - v, Tu - Tv \rangle \le 0$ instead). Also note: the red hyperplanes are only parallel to $\{u - v\}^\perp$, and need not, in either case, pass through the origin.
In the more general Banach space version, the images $Tu$ and $Tv$ are subsets of the dual $X^*$. If we let $x = u - v$, the hyperplane separating them is parallel to the kernel of $\hat{x} \in X^{**}$, the embedding of $x \in X$ into the second dual. Note that reflexivity is not really required for this definition.
Something to remember is that monotonicity is not an especially interesting property by itself; it's just a stepping stone towards maximal monotonicity. This is also a property that subgradients of convex functions possess (indeed, they have an even stronger form: maximal cyclic monotonicity). Maximal monotone operators are a current research topic, and have a myriad of interesting properties.