Simple question but what is the definition that allows me to take (assume everything that is also needed for this proof is here) $(f \circ g) \circ h(w)$ and turn it into $f(g(h(w)))?$
I see this used a lot in function proofs, but I'm not sure exactly which definition it comes from.
Associativity follows almost immediately from the definition of composition of functions.
Since $(f \circ g)(x) = f(g(x))$ we have
$((f \circ g) \circ h)(w) = (f \circ g) (h(w))= f(g(h(w))) \space \forall w \in W$
where $W$ is the domain of $h$, and
$(f \circ (g \circ h))(w) = (f(g \circ h))(w) = f(g(h(w)))\space \forall w \in W$
Since $((f \circ g) \circ h)(w) = ((f \circ (g \circ h))(w) \space \forall w \in W$ we can conclude that $(f \circ g) \circ h = f \circ (g \circ h)$