In Lee's Introduction to Smooth Manifolds, he asserts the following:
If $X$ is a smooth vector field on $M$, and $f$ is a smooth real-valued function defined on an open subset $U \subset M$, we obtain a new function $Xf:U \to \mathbb R$ by defining $Xf(p) = X_pf$. This bothers me: since $X_p$ is a derivation at the point $p$ of $M$, it should take functions that are defined on the whole of $M$, rather than on an open subset of $M$. Clearly we can take advantage of the extension lemma to extend $f$ to $\tilde f$ defined on the whole of $M$ and define $Xf(p) = X \tilde f(p)$.
Am I right about this? Why bother to start from an open subset of $M$ instead of $M$ itself from the very beginning?
Many natural functions on a manifold $M$ are defined only on an open subset $U$ of $M$ (the most basic example being coordinate systems). The extension lemma is a useful technical tool, but it is not constructive, so you want to understand in advance which concepts are local and which are global. Given $f \colon U \rightarrow \mathbb{R}$, you can define the directional derivative of $f$ at a point $p \in U$ in the direction $v_p \in T_pM$ by considering $v_p$ as a derivation of $C^{\infty}(M)$ at $p$, choosing a global function $\tilde{f}$ that agrees with $f$ on a neighborhood of $p$, and then computing $v_p(\tilde{f})$. The number $v_p(\tilde{f})$ is independent of the extension chosen, so this is well defined, but it is quite pointless as a working definition: you generally cannot write down an explicit extension $\tilde{f}$ to the whole of $M$, so you cannot compute anything with it.
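To spell out why the value does not depend on the extension, here is a sketch of the standard bump-function argument; the names $\tilde{f}_1$, $\tilde{f}_2$, $h$, $\psi$ and $W$ are introduced here only for illustration. Suppose $\tilde{f}_1, \tilde{f}_2 \in C^{\infty}(M)$ both agree with $f$ on a neighborhood $W$ of $p$, and set $h = \tilde{f}_1 - \tilde{f}_2$, so that $h$ vanishes on $W$. Choose a smooth bump function $\psi$ on $M$ with $\psi(p) = 1$ and support contained in $W$. Then $\psi h$ is identically zero on $M$, and the product rule for derivations at $p$ gives
$$ 0 = v_p(\psi h) = \psi(p)\, v_p(h) + h(p)\, v_p(\psi) = v_p(h) = v_p(\tilde{f}_1) - v_p(\tilde{f}_2). $$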
If you consider $v_p \in T_pM$ as an equivalence class of curves, you can always find a representative of $v_p$ that is a curve $\alpha \colon I \rightarrow M$ with $\alpha(0) = p$ and $\alpha(I) \subseteq U$, and then $v_p(f)$ is just
$$ \left.\frac{d}{dt} f(\alpha(t))\right|_{t = 0} $$
and this is a definition which can actually be applied to compute something.
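For instance (a small example chosen here for illustration, not taken from Lee), take $M = \mathbb{R}^2$, $U = \{(x,y) : x > 0\}$, $f(x,y) = \arctan(y/x)$, $p = (1,0)$ and $v_p = \partial/\partial y|_p$. The function $f$ has no continuous extension to all of $M$, since it has no limit at the origin, yet the curve $\alpha(t) = (1,t)$ stays in $U$, represents $v_p$, and gives
$$ v_p(f) = \left.\frac{d}{dt} \arctan(t)\right|_{t = 0} = \left.\frac{1}{1+t^2}\right|_{t = 0} = 1. $$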
From a higher point of view, a tangent vector $v_p \in T_pM$ is a derivation of the ring $C^{\infty}_p(M)$ of germs of smooth functions at $p$, so you can apply it to any smooth function that is defined on a neighborhood of $p$ (in particular, to global functions).