It is easy to prove that function composition is associative. I think of $f\circ (g\circ h)$ as applying first $h$, then $g$ and finally $f$ to an element of the domain of $h$. But I'm having difficulty thinking about what $(f\circ g)\circ h$ means intuitively, and therefore it is not clear to me that function composition is associative.
If you can provide a different way to look at function composition and/or intuitive examples it'd be appreciated.
For me, it's intuitively easier to think of functions that produce numbers, and real numbers at that, so let's assume that $f$, $g$, and $h$ are all real-valued functions.
Then $f(u)$ is a number produced by $f$ for the input $u$. Where does $u$ come from? Well, who cares? From $f$'s perspective, $u$ itself is just another number. That it happened to be computed from another function whose name is $g$ is not relevant to how $f$ is evaluated to produce the value $f(u)$. Similarly, if $u = g(v)$, then $g$ doesn't care where $v$ comes from - it's just another number that, when provided to $g$, gives $u$ as the output. The fact that $v = h(x)$ is $h$'s problem, not $g$'s or $f$'s.
So the composition operation - can we intuit associativity? Well, what is $f \circ g$? At the end of the day, it's a function of $g$'s argument, isn't it? If, as above, we write $f$ as $f(u)$, then $f \circ g$ says "replace $u$ with $g$". Now, where does $g$'s value come from? Its argument, which we've called $v$. So really, couldn't we go one step further and replace $g$ with the function of $v$ that computes $g(v)$?
Couldn't we have done that in either order - first replace all $u$'s with $g$'s, then sub in the function that $g$ is computed from, or first substitute the function that $g$ is computed from in $g$ itself, and then sub that into $f$ wherever you see a $u$?
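To make the two orders concrete, here is a short worked evaluation at a point $x$ in the domain of $h$. It uses only the definition $(p \circ q)(x) = p(q(x))$, where $p$ and $q$ are placeholder names introduced just for this illustration:

$$
\begin{aligned}
(f \circ (g \circ h))(x) &= f\big((g \circ h)(x)\big) = f\big(g(h(x))\big), \\
((f \circ g) \circ h)(x) &= (f \circ g)\big(h(x)\big) = f\big(g(h(x))\big).
\end{aligned}
$$

In the first line, $g \circ h$ is built first and its output is handed to $f$ as $u$. In the second line, $f \circ g$ is built first as a single machine that accepts any $v$, and $h(x)$ is then fed in as that $v$. Either way, the same three functions are applied in the same order.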
It gets a bit stranger when we aren't "subbing in" numbers anymore, since functions are a lot more general than that. So think of it as the following:
The operation of a function depends only on its arguments. Whether you determine what those arguments are directly, or indirectly (through composition), is not the function's problem. The point is that the argument comes from somewhere. So you can either apply the function to the argument, and THEN figure out where the argument came from, or first figure out where it came from, get its value, and then apply the function. They are the same.
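If it helps to see this numerically, the following is a minimal sketch in Python. The helper `compose` and the particular choices of `f`, `g`, and `h` are assumptions made up for the illustration; any real-valued functions would do. It builds both groupings and compares them on a few sample inputs, which illustrates the claim but of course does not prove it.

```python
def compose(outer, inner):
    """Return the function x -> outer(inner(x))."""
    return lambda x: outer(inner(x))

# Hypothetical example functions, chosen only for illustration.
def f(u):
    return u + 1

def g(v):
    return 2 * v

def h(x):
    return x * x

# f o (g o h): build g o h first, then hand its output to f.
left = compose(f, compose(g, h))

# (f o g) o h: build f o g first, then feed it the output of h.
right = compose(compose(f, g), h)

samples = [-2.0, 0.0, 0.5, 3.0]
results = [(left(x), right(x)) for x in samples]

# Both groupings apply h, then g, then f, so the values agree.
all_equal = all(a == b for a, b in results)
print(all_equal)
```

Note that `compose(f, g)` never needs to know that its input will eventually come from `h`; it just waits for a number, which is exactly the "not the function's problem" point above.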