Function composition is defined on Wolfram to be:
'The nesting of two or more functions to form a single new function is known as composition. The composition of two functions $f$ and $g$ is denoted $f \circ g$, where $f$ is a function whose domain includes the range of $g$.'
I would like to ask why composition isn't defined as something like the following instead:
Function composition, denoted $\circ$, is a partial binary operation on the set of all functions such that $(f \circ g)(x)=f(g(x))$ for all functions $f$, $g$ where $\mathscr{R}(g)\subseteq\mathscr{D}(f)$.
Is it because the set of all functions isn't well-defined?
The collection of all functions is a proper class, so it can't be represented as a set. However, defining composition as a partial binary operation on the class of functions is still a legitimate definition.
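Indeed, if you define composition as a closed, associative partial binary operation on a class of morphisms (a function is one kind of morphism), you more or less have the category-theoretic definition of composition. The main differences are that a category requires the codomain of $g$ to equal the domain of $f$, rather than merely $\mathscr{R}(g)\subseteq\mathscr{D}(f)$, and that it also requires identity morphisms.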
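To make the "partial binary operation" idea concrete, here is a minimal Python sketch. It assumes finite functions represented as dictionaries, so the proper-class issue never arises, and the name `compose` is purely illustrative. It returns `None` when $\mathscr{R}(g)\not\subseteq\mathscr{D}(f)$, i.e. when $f \circ g$ is undefined.

```python
from typing import Optional


def compose(f: dict, g: dict) -> Optional[dict]:
    """Return f o g as a dict, or None if range(g) is not a subset of domain(f).

    Assumption: a finite function is modelled as a dict from inputs to outputs.
    """
    if not set(g.values()) <= set(f.keys()):
        return None  # the partial operation is undefined for this pair
    return {x: f[g[x]] for x in g}


# Three small illustrative functions (hypothetical data).
g = {1: "a", 2: "b"}
f = {"a": 10, "b": 20, "c": 30}
h = {10: "x", 20: "y", 30: "z"}

f_after_g = compose(f, g)  # defined: range(g) = {"a", "b"} is inside domain(f)
g_after_f = compose(g, f)  # undefined: range(f) = {10, 20, 30} is not inside domain(g)

# Associativity holds here because both bracketings are defined.
left = compose(compose(h, f), g)
right = compose(h, compose(f, g))
```

Here `f_after_g` is `{1: 10, 2: 20}`, `g_after_f` is `None`, and `left == right`. Note that with the range-inclusion condition one bracketing can be defined while the other is not, so associativity is only claimed when both sides exist.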