So far I've met three different definitions of composed mapping.
Definition 1)
Let $\ f:X\to Y,\ x\mapsto f(x); \ g:Y\to Z,\ y\mapsto g(y)$ be two mappings, then the composed mapping of $f\ $and$\ g$ is defined as $$g\circ f:X\to Z, \ x\mapsto g(f(x)).$$
We can see the first definition requires $Ran(f)=Dom(g)$, where $Ran(f)$ here denotes the codomain $Y$ of $f$.
Definition 2)
Let $\ f:X\to Y,\ x\mapsto f(x); \ g:W\to Z,\ w\mapsto g(w)$ be two mappings, and $Y\subset W.\ $Then the composed mapping of $f\ $and$\ g$ is defined as $$g\circ f:X\to Z, \ x\mapsto g(f(x)).$$
The second definition only requires $Ran(f)\subset Dom(g)$.
Definition 3)
Let $\ f:X\to Y,\ x\mapsto f(x); \ g:W\to Z,\ w\mapsto g(w)$ be two mappings, and $f^{-1}(W)\neq \emptyset.\ $Then the composed mapping of $f\ $and$\ g$ is defined as $$g\circ f:f^{-1}(W)\to Z, \ x\mapsto g(f(x)).$$
The third definition only requires that $f^{-1}(W)\neq \emptyset$.
Obviously these three definitions are not equivalent to each other. I dislike the idea that one term could have several inequivalent definitions, and I also don't know which one is better.
Question : Why do people give so many definitions of composed mappings, and which one should I use?
It depends on the context. The first definition is the simplest but also the most restrictive. If you have functions $f:\mathbb{R}\to\mathbb{R}$ and $g:\mathbb{R}\to\mathbb{R}$, then by the first definition you can compose them and everything is fine. But definition (2) also works, since $\mathbb{R}\subset\mathbb{R}$. Because $f$ is defined on $\mathbb{R}$ you get $f^{-1}(\mathbb{R})=\mathbb{R}$, and (3) gives the same composition. The same reasoning shows that whenever (1) applies, (2) and (3) apply as well.
But sometimes you want to compose functions whose spaces do not all match. For example, $g$ may have singularities or may not be defined everywhere. If the range of $f$ contains only points in the domain of $g$, then (2) is a natural way to define the composition. Furthermore, if $Y\subset W$, then $f^{-1}(W)=f^{-1}(Y)=X\neq\emptyset$ since $f(X)\subset Y$ (provided $X$ itself is nonempty). So whenever (2) applies, (3) applies as well. That is why (2) is a special case of (3).
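To illustrate (2) with a concrete pair, chosen here purely as an example: take $f:\mathbb{R}\to(0,\infty),\ x\mapsto e^x$ and $g:\mathbb{R}\setminus\{0\}\to\mathbb{R},\ w\mapsto 1/w$. Here $Y=(0,\infty)\subset W=\mathbb{R}\setminus\{0\}$ but $Y\neq W$, so (1) does not apply as stated, while (2) gives $$g\circ f:\mathbb{R}\to\mathbb{R},\ x\mapsto \frac{1}{e^x}=e^{-x}.$$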
By the previous arguments, (3) is the most general definition. You can define the composition even if the range of $f$ contains points that are not in the domain of $g$, but in that case you have to restrict the domain of $f$ to $f^{-1}(W)$. The condition $f^{-1}(W)\neq\emptyset$ checks that $W$ does not lie entirely outside the range of $f$.
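A small worked example for (3), again with functions picked only for illustration: let $f:\mathbb{R}\to\mathbb{R},\ x\mapsto x^2-1$ and $g:[0,\infty)\to\mathbb{R},\ w\mapsto\sqrt{w}$. Then $Y=\mathbb{R}\not\subset W=[0,\infty)$, so neither (1) nor (2) applies. But $$f^{-1}(W)=\{x\in\mathbb{R}:\ x^2-1\geq 0\}=(-\infty,-1]\cup[1,\infty)\neq\emptyset,$$ so (3) gives $$g\circ f:(-\infty,-1]\cup[1,\infty)\to\mathbb{R},\ x\mapsto\sqrt{x^2-1}.$$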
Sometimes definitions are simplified to save work. If you want to compose functions in the most general case, you have to choose (3). But you can also use (1) or (2) if you restrict the domain of $g$ and the domain of $f$ appropriately, so those are not bad or inferior definitions. And by using restrictions you can, in some sense, say that the definitions are equivalent.