For convenience let $\omega\in \Omega_X^1$ be a 1-form on a manifold $X$. The perfect pairing given by the determinant gives a canonical bundle isomorphism $\Lambda^k(\mathrm T^\vee X)\cong (\Lambda^k(\mathrm TX))^\vee$, so we may think of $\omega$ as a smooth choice of functional on each tangent space.
Let $v_1,v_2\in \mathrm T_p(X)$. If I understand correctly, the coordinate free formula for the exterior derivative (of 1-forms) is as follows.
Let $\vec{v}_1,\vec{v}_2$ be vector fields on $X$ satisfying $\vec{v}_i(p)=v_i$. Write $\mathrm {Fl}_{\vec{v}_i}(t,x)$ for the flow along $\vec{v}_i$. Then the exterior derivative $\mathrm d\omega$, whose value at a point $p\in X$ is a linear functional on the second exterior power of the tangent space at $p$, is given by the following formula.
$$\mathrm d\omega(p)(v_1\wedge v_2)=\left.\tfrac{\mathrm d}{\mathrm d t}\right|_{0}(\omega\circ \vec{v}_2\circ \mathrm {Fl}_{\vec{v}_1}(t,p))-\left.\tfrac{\mathrm d}{\mathrm d t}\right|_{0}(\omega\circ \vec{v}_1\circ \mathrm {Fl}_{\vec{v}_2}(t,p)) -\omega(p)([\vec{v}_1,\vec{v}_2](p))$$
The functional $\mathrm d\omega (p)$ by definition eats parallelograms in $\mathrm T_pX$. On the other hand, the right hand side involves a choice of "extensions" $\vec{v}_i$ which live in the tangent bundle. Moreover, the value of each summand on the RHS of the above formula seems highly dependent on the choice of $\vec{v}_i$.
Question. Why is $\mathrm d\omega (p)(v_1\wedge v_2)$ independent of the choice of $\vec{v}_i$?
Remark. I am not bothered by the dependence of the functional $\mathrm d\omega(p)$ on the behavior of $\omega$ locally about $p$. It makes sense geometrically. I am confused as to why the result should be independent of the $\vec{v}_i$.
Added. I am familiar with the proof by reduction to a reasonable calculation via linearity over smooth functions, but I am not able to convert it into an explanation of why the result is independent of the $\vec{v}_i$.
Roughly speaking, the extension of $v_1$ is "like" $v_1$ at $p$ to 0th order, and that's all that you need.
Suppose, in a far simpler example consisting of functions from the reals to the reals, you have $$\begin{aligned} h(x) &= f(g_1(x)) \\ k(x) &= f(g_2(x)) \end{aligned}$$ where $g_1(a) = g_2(a)$ and $g_1'(a) = g_2'(a)$. How are the derivatives of $h$ and $k$ at $a$ related? Well, the chain rule tells you that they're the same, because $g_1$ and $g_2$ agree to first order at $a$.
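Spelled out, with nothing assumed beyond the two conditions on $g_1$ and $g_2$ at $a$:

$$h'(a) = f'(g_1(a))\,g_1'(a) = f'(g_2(a))\,g_2'(a) = k'(a).$$

Nothing about $g_1$ and $g_2$ away from $a$, or beyond first order at $a$, enters the computation.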
Why is 0th order enough in the exterior derivative computation? I suspect it's because you should be thinking of $v_1$ and $v_2$ (or their extensions, with the arrows on top) as tangents to some flow. By adjusting parameters, the flows can be made to both pass through $p$ at the same $(s, t)$ coordinates (where $t$ is the "along the flow" coordinate, and $s$ is the "starting point" coordinate). In particular, we can choose parameters so that they both pass through $p$ at coordinates $(p, 0)$. The two time-derivative terms in the formula represent $\omega$'s sensitivity to flows in those directions (or really, how "$\omega$'s response in the $v_1$ direction changes as you move in the $v_2$ direction"). These two flows (after adjustment) agree to first order at $p$, so results involving derivatives agree (almost), essentially by the chain rule.
The problem is, as you observe, that these first two derivatives in your formula depend on the extensions you choose for $v_1$ and $v_2$. But they depend in a fairly well-organized way, and it turns out (through computations that you don't seem to like) that the bracket term is exactly what's needed to cancel out these dependencies.
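To make the cancellation concrete, here is a sketch under an assumption of my own: replace the extension $\vec{v}_2$ by $\vec{v}_2 + f\vec{w}$, where $f$ is a smooth function with $f(p)=0$ and $\vec{w}$ is an arbitrary vector field ($f$ and $\vec{w}$ are names I am introducing, not part of your setup). In local coordinates any two extensions of $v_2$ differ by a sum of such terms, so this case should be enough. Writing $v_1(g)$ for the derivative of a function $g$ at $p$ along the flow of $\vec{v}_1$, the three terms of your formula change as follows.

$$\begin{aligned}
v_1\big(\omega(\vec{v}_2+f\vec{w})\big) &= v_1\big(\omega(\vec{v}_2)\big) + v_1(f)\,\omega(p)(\vec{w}(p)) + f(p)\,v_1\big(\omega(\vec{w})\big) \\
&= v_1\big(\omega(\vec{v}_2)\big) + v_1(f)\,\omega(p)(\vec{w}(p)), \\
v_2\big(\omega(\vec{v}_1)\big) &\ \text{is unchanged, because it only uses the value of the extension at } p, \\
[\vec{v}_1,\vec{v}_2+f\vec{w}](p) &= [\vec{v}_1,\vec{v}_2](p) + v_1(f)\,\vec{w}(p) + f(p)\,[\vec{v}_1,\vec{w}](p) \\
&= [\vec{v}_1,\vec{v}_2](p) + v_1(f)\,\vec{w}(p).
\end{aligned}$$

So the first term picks up $+v_1(f)\,\omega(p)(\vec{w}(p))$, the bracket term (which enters with a minus sign) picks up $-v_1(f)\,\omega(p)(\vec{w}(p))$, and the total is unchanged. The same computation with the roles of $v_1$ and $v_2$ swapped handles a change in the extension of $v_1$.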
In Bishop and Crittenden's Geometry of Manifolds, I think that the bracket is defined by this formula rather than the other way around. That makes a little more sense to me: you can think about walking around a parallelogram in the manifold by starting at $p$, moving along $v_1$ for time $h$, then $v_2$ for time $h$, then $-v_1$ for time $h$, and then $-v_2$ for time $h$ (where all of these should really be the extensions, but I hate typing over-arrows). This parallelogram generally doesn't actually "close up", but you can show that as $h$ gets small, so does the distance between the "fourth corner" of the parallelogram and the starting corner; it in fact goes to zero quadratically as a function of $h$, if I'm remembering correctly. If you look at a limit of $(F(h) - p)/h^2$, where $F(h)$ indicates where the last point of the "parallelogram" with edge $h$ ends up, and the "difference" (which doesn't make literal sense) is computed in $T_p M$ by lifting $F(h)$ into the tangent space via the exponential map, for instance, then that limit (up to a constant) is the Lie bracket.
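The parallelogram walk can be tried out in a toy case. The sketch below is my own illustration, not something from the book: it assumes $X=\mathbb{R}^2$ (so the "difference" is ordinary coordinate subtraction and no exponential map is needed) and uses the hypothetical fields $\partial_x$ and $x\,\partial_y$, whose flows can be written down exactly and whose bracket is $\partial_y$.

```python
from fractions import Fraction


def flow_x(t, point):
    """Exact time-t flow of the vector field d/dx on R^2 (illustrative choice)."""
    x, y = point
    return (x + t, y)


def flow_y(t, point):
    """Exact time-t flow of the vector field x * d/dy on R^2 (illustrative choice)."""
    x, y = point
    return (x, y + t * x)


def fourth_corner(h, p):
    """Walk the 'parallelogram': +X, then +Y, then -X, then -Y, each for time h."""
    q = flow_x(h, p)
    q = flow_y(h, q)
    q = flow_x(-h, q)
    q = flow_y(-h, q)
    return q


def scaled_gap(h, p):
    """(F(h) - p) / h^2, computed in coordinates since we are in R^2."""
    fx, fy = fourth_corner(h, p)
    return ((fx - p[0]) / h**2, (fy - p[1]) / h**2)


# Exact rational arithmetic, so no rounding error hides the h^2 behaviour.
p = (Fraction(1, 2), Fraction(1, 4))
for h in (Fraction(1, 2), Fraction(1, 8), Fraction(1, 32)):
    print(h, scaled_gap(h, p))
```

For these two particular fields the gap $F(h)-p$ works out to exactly $h^2$ times $\partial_y$, so the scaled gap is $(0,1)$ for every $h$ and the constant is $1$. For general fields there would also be higher-order terms in $h$, and only the limit would recover the bracket.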