Wikipedia states:
Since any vector $v$ is a linear combination $\sum v_je_j$ of its components, $df$ is uniquely determined by $df_p(e_j)$ for each $j$ and each $p \in U$, which are just the partial derivatives of $f$ on $U$. Thus $df$ provides a way of encoding the partial derivatives of $f$. It can be decoded by noticing that the coordinates $x_1, x_2, \ldots, x_n$ are themselves functions on $U$, and so define differential 1-forms $dx_1, dx_2, \ldots, dx_n$. Since $\frac{\partial x_i}{\partial{x_j}} = \delta_{ij}$, the Kronecker delta function, it follows that
$$df = \sum_{i=1}^n \frac{\partial f}{\partial x_i} \, dx_i$$
Can anyone elaborate on this? In particular, how do we know what the components of $df$ are? What is the relationship between $x = (x_1, x_2, \ldots, x_n)$ and $p$? Where does the Kronecker delta come in?
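Let $A$ be open in $\Bbb{R}^{n}$; let $f: A \to \Bbb{R}$ be of class $C^{1}$; for every $x \in A$ let $df^{x}$ be the linear map $v \mapsto \nabla f(x)\cdot v$ on $\Bbb{R}^{n}$. (The base point written $x$ here is the point that Wikipedia calls $p$.) Then $df$, the differential of $f$, which assigns $df^{x}$ to each $x \in A$, is a $1$-form on $A$ (provided that a $1$-form is defined as a map assigning an alternating $1$-tensor to each point of $A$).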
Each elementary 1-form $dx_{i}$, which is the differential of the $i$th projection map on $\Bbb{R}^{n}$, is thus the map that assigns to every $x \in A$ the map $dx_{i}^{x}: v \mapsto \nabla x_{i}(x)\cdot v = v_{i}$ on $\Bbb{R}^{n}.$
If $x \in A$ and if $v \in \Bbb{R}^{n}$, then $$ df^{x}(v) = \nabla f(x)\cdot v = \sum_{i=1}^{n}D_{i}f(x)v_{i} = \sum_{i=1}^{n}D_{i}f(x)dx_{i}^{x}(v); $$ this result can be written succinctly as $$ df = \sum_{i=1}^{n}(D_{i}f) dx_{i}. $$
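As a concrete instance (the function, point, and vector here are chosen purely for illustration and do not come from the question), take $n = 2$ and $f(x_1, x_2) = x_1^{2}x_2$. Then $D_1 f = 2x_1x_2$ and $D_2 f = x_1^{2}$, so $$ df = 2x_1x_2\, dx_1 + x_1^{2}\, dx_2. $$ At the point $x = (1, 2)$ and for $v = (3, 4)$ this gives $$ df^{x}(v) = D_1 f(x)\,v_1 + D_2 f(x)\,v_2 = 4 \cdot 3 + 1 \cdot 4 = 16. $$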
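Note that it is a (conventional) "abuse" of notation to write the $i$th projection map as $x_{i}$; if we denote it by $p_{i}$, say, then it is clear that $$ \frac{\partial p_{i}}{\partial x_{j}} = \frac{\partial x_{i}}{\partial x_{j}} = \delta_{ij}, $$ which simply says that $D_{j}x_{i} = 1$ if $i = j$ and $D_{j}x_{i} = 0$ if $i \neq j$. Equivalently, $dx_{i}^{x}(e_{j}) = \delta_{ij}$, so the maps $dx_{1}^{x}, \ldots, dx_{n}^{x}$ form the basis dual to $e_{1}, \ldots, e_{n}$, and the components of $df^{x}$ in that basis are $df^{x}(e_{j}) = D_{j}f(x)$. This is where the Kronecker delta comes in.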
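If a numerical sanity check helps, here is a minimal Python sketch of the identity $df^{x}(v) = \sum_i D_i f(x)\, dx_i^{x}(v)$. Everything in it is an assumption made for illustration: the sample function `f`, its hand-computed gradient `grad_f`, the helper names `df`, `dx`, and `directional_derivative`, and the chosen point and vector. It compares both sides of the identity with a central-difference estimate of the directional derivative, and evaluates each $dx_i$ on the standard basis vectors.

```python
import math

def f(x1, x2):
    # Hypothetical sample function, chosen only for illustration.
    return x1 * x1 * x2 + math.sin(x2)

def grad_f(x1, x2):
    # Partial derivatives computed by hand: D_1 f = 2*x1*x2, D_2 f = x1^2 + cos(x2).
    return (2 * x1 * x2, x1 * x1 + math.cos(x2))

def df(p, v):
    # df at the base point p, applied to v: the dot product grad f(p) . v.
    return sum(g * c for g, c in zip(grad_f(*p), v))

def dx(i):
    # dx_i at any base point simply returns the i-th component of v.
    return lambda p, v: v[i]

def directional_derivative(p, v, h=1e-6):
    # Central-difference estimate of d/dt f(p + t*v) at t = 0.
    fwd = f(p[0] + h * v[0], p[1] + h * v[1])
    bwd = f(p[0] - h * v[0], p[1] - h * v[1])
    return (fwd - bwd) / (2 * h)

p = (1.5, -0.7)   # base point (arbitrary)
v = (0.3, 2.0)    # tangent vector (arbitrary)

lhs = df(p, v)
rhs = sum(grad_f(*p)[i] * dx(i)(p, v) for i in range(2))
approx = directional_derivative(p, v)

# Evaluate each dx_i on the standard basis vectors e_1, e_2.
basis = [(1.0, 0.0), (0.0, 1.0)]
kronecker = [[dx(i)(p, e) for e in basis] for i in range(2)]

print(lhs, rhs, approx)
print(kronecker)
```

The table `kronecker` is the identity matrix, which is the statement $dx_i(e_j) = \delta_{ij}$; `lhs` and `rhs` agree because expanding $df$ in the $dx_i$ is just the dot product written out term by term.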