Let $f, g$ be smooth real-valued functions of one variable, and let $h = f \circ g$. Then the chain rule says that $$h' = (f \circ g)' = (f' \circ g) \cdot g'.$$
Now, more generally, let $u: \mathbb{R}^m \rightarrow \mathbb{R}^n$ and $v: \mathbb{R}^k \rightarrow \mathbb{R}^m$ be smooth, and suppose $a$ is some point in $\mathbb{R}^k$. Then the chain rule says that: $$D_a(u \circ v) = D_{v(a)}u \circ D_av$$ where $D_av$ denotes the total derivative of $v$ at the point $a$, that is, the linear map $\mathbb{R}^k \rightarrow \mathbb{R}^m$ represented by the Jacobian matrix of $v$ at $a$ (not a directional derivative along $a$).
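To make the second formula concrete, here is a small numerical sanity check in plain Python. It is only a sketch under stated assumptions: it reads $D_av$ as the Jacobian matrix of $v$ at the point $a$, so that the composition of linear maps becomes a matrix product. The maps `u` and `v`, the point `a`, and the helpers `jacobian` and `matmul` are hypothetical choices made for illustration, with $k = 2$, $m = 3$, $n = 2$.

```python
def jacobian(func, point, eps=1e-6):
    """Approximate the Jacobian matrix of func at point by central differences."""
    n_out = len(func(point))
    n_in = len(point)
    columns = []
    for j in range(n_in):
        plus, minus = list(point), list(point)
        plus[j] += eps
        minus[j] -= eps
        f_plus, f_minus = func(plus), func(minus)
        # Column j holds the partial derivatives with respect to x_j.
        columns.append([(f_plus[i] - f_minus[i]) / (2 * eps) for i in range(n_out)])
    # Transpose so that entry [i][j] is d(func_i)/d(x_j).
    return [[columns[j][i] for j in range(n_in)] for i in range(n_out)]


def matmul(A, B):
    """Multiply two matrices stored as lists of rows."""
    return [
        [sum(A[i][k] * B[k][j] for k in range(len(B))) for j in range(len(B[0]))]
        for i in range(len(A))
    ]


def v(x):
    """Hypothetical example map v: R^2 -> R^3."""
    return [x[0] * x[1], x[0] + x[1], x[0] ** 2]


def u(y):
    """Hypothetical example map u: R^3 -> R^2."""
    return [y[0] * y[2], y[1] ** 2 + y[0]]


def u_after_v(x):
    """The composition u o v: R^2 -> R^2."""
    return u(v(x))


a = [1.5, -0.5]  # arbitrary point in R^2

lhs = jacobian(u_after_v, a)                      # D_a(u o v), a 2x2 matrix
rhs = matmul(jacobian(u, v(a)), jacobian(v, a))   # D_{v(a)}u times D_a v: (2x3)(3x2)

max_diff = max(abs(lhs[i][j] - rhs[i][j]) for i in range(2) for j in range(2))
print(max_diff)  # expected to be tiny, limited only by finite-difference error
```

Both sides are computed by finite differences, so the agreement is approximate rather than exact; the point is only that the two matrices match entry by entry.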
Why are the two chain rules not the same? Is there a geometric argument that would explain this? How would one show that, in the case $m = n = k = 1$, the two chain rules are equivalent?