Here's my dilemma.
We know that the $\partial_i$ form a basis of vector fields and the $\mathrm{d} x^i$ form a basis of co-vector fields, so that in general we can write a vector and a co-vector as
$$ v = v^i \partial_i \quad \text{and} \quad \alpha = \alpha_i \mathrm{d} x^i $$ and this is valid in any coordinate system. However, this is where it gets a bit confusing. For example, in polar coordinates (in everything that follows, $(e_x = \partial_x, e_y = \partial_y)$ is an orthonormal vector basis), $$ \partial_r = \cos \theta \, \partial_x + \sin \theta \, \partial_y \quad \text{and} \quad \partial_\theta = - r \sin \theta \, \partial_x + r \cos \theta \, \partial_y $$ and we can see that the basis $\partial_r, \partial_\theta$ is not orthonormal, only orthogonal, since $\partial_\theta$ has length $r$. Moreover, by the vector $(0,1)$ in polar coordinates we commonly mean $\vec{e}_\theta$, that is, $\frac{1}{r} \partial_\theta = -\sin \theta \, \vec{e}_x + \cos \theta \, \vec{e}_y$, and not $\partial_\theta$, which would be $-r \sin \theta \, \vec{e}_x + r \cos \theta \, \vec{e}_y$.
A similar problem arises with the basis one-forms $$ \mathrm{d} r = \cos \theta \, \mathrm{d} x + \sin \theta \, \mathrm{d} y \quad \text{and} \quad \mathrm{d} \theta = - \frac{1}{r} \sin \theta \, \mathrm{d} x + \frac{1}{r} \cos \theta \, \mathrm{d} y $$
This time, however, people for some reason do not resort to a normalized co-vector basis $e^r, e^\theta$, which would be $$ e^r = \mathrm{d} r = \cos \theta \, \mathrm{d} x + \sin \theta \, \mathrm{d} y \quad \text{and} \quad e^\theta = r \, \mathrm{d} \theta = - \sin \theta \, \mathrm{d} x + \cos \theta \, \mathrm{d} y $$
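As a quick sanity check, assuming the usual pairing $\mathrm{d} x^i (\partial_j) = \delta^i_j$ in any coordinate system, these normalized co-vectors are indeed dual to the normalized vectors $e_r = \partial_r$, $e_\theta = \frac{1}{r} \partial_\theta$:
$$ e^r (e_r) = \mathrm{d} r (\partial_r) = 1, \quad e^\theta (e_\theta) = r \, \mathrm{d} \theta \left( \tfrac{1}{r} \partial_\theta \right) = 1, \quad e^r (e_\theta) = \tfrac{1}{r} \, \mathrm{d} r (\partial_\theta) = 0, \quad e^\theta (e_r) = r \, \mathrm{d} \theta (\partial_r) = 0 $$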
If we used $e^r, e^\theta$, we would find that the naturally arising metric tensor is written as $$ g = \mathrm{d} x \otimes \mathrm{d} x + \mathrm{d} y \otimes \mathrm{d} y = \mathrm{d} r \otimes \mathrm{d} r + r^2 \mathrm{d} \theta \otimes \mathrm{d} \theta = e^r \otimes e^r + e^\theta \otimes e^\theta $$ because $e^\theta$ already contains the normalizing factor of $r$.
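The first equality in this chain can be checked directly. This is a sketch using only the expressions for $\mathrm{d} r$ and $\mathrm{d} \theta$ given earlier:
$$ \mathrm{d} r \otimes \mathrm{d} r = \cos^2 \theta \, \mathrm{d} x \otimes \mathrm{d} x + \sin \theta \cos \theta \, (\mathrm{d} x \otimes \mathrm{d} y + \mathrm{d} y \otimes \mathrm{d} x) + \sin^2 \theta \, \mathrm{d} y \otimes \mathrm{d} y $$ $$ r^2 \, \mathrm{d} \theta \otimes \mathrm{d} \theta = \sin^2 \theta \, \mathrm{d} x \otimes \mathrm{d} x - \sin \theta \cos \theta \, (\mathrm{d} x \otimes \mathrm{d} y + \mathrm{d} y \otimes \mathrm{d} x) + \cos^2 \theta \, \mathrm{d} y \otimes \mathrm{d} y $$
The cross terms cancel and the sum is $\mathrm{d} x \otimes \mathrm{d} x + \mathrm{d} y \otimes \mathrm{d} y$. The last equality is just $r^2 \, \mathrm{d} \theta \otimes \mathrm{d} \theta = (r \, \mathrm{d} \theta) \otimes (r \, \mathrm{d} \theta)$.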
So I see something of a discrepancy here. With vectors, people take special care to use the normalized basis $e_r, e_\theta$, so that the vector $(0, 1)$ really means $0 \times e_r + 1 \times e_\theta = -\sin \theta \, \vec{e}_x + \cos \theta \, \vec{e}_y$ and not $0 \times \partial_r + 1 \times \partial_\theta = -r \sin \theta \, \vec{e}_x + r \cos \theta \, \vec{e}_y$. With co-vectors, however, or objects containing outer products of co-vectors, we throw this caution out of the window and use the basis stemming from the coordinates. So if we write that a certain one-form $\alpha$ has components $(1, 2)$ in polar coordinates, we mean $\alpha = \mathrm{d} r + 2 \mathrm{d} \theta$, and not $\alpha = e^r + 2 e^\theta = \mathrm{d} r + 2 r \mathrm{d} \theta$.
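To make the two conventions explicit, here is how I understand the components to be related. The hatted indices are my own notation (an assumption, not standard in anything written so far) for components with respect to the normalized bases:
$$ v = v^r \partial_r + v^\theta \partial_\theta = v^{\hat r} e_r + v^{\hat \theta} e_\theta \quad \Rightarrow \quad v^{\hat r} = v^r, \quad v^{\hat \theta} = r \, v^\theta $$ $$ \alpha = \alpha_r \, \mathrm{d} r + \alpha_\theta \, \mathrm{d} \theta = \alpha_{\hat r} \, e^r + \alpha_{\hat \theta} \, e^\theta \quad \Rightarrow \quad \alpha_{\hat r} = \alpha_r, \quad \alpha_{\hat \theta} = \frac{1}{r} \alpha_\theta $$
So the one-form with coordinate components $(1, 2)$ would have normalized components $(1, 2/r)$, and the vector with normalized components $(0, 1)$ would have coordinate components $(0, 1/r)$.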
Another example of this discrepancy, going back to the vector basis: the inverse of the metric tensor is usually written in components as $$ g^{i j} = \begin{pmatrix} 1 & 0 \\ 0 & r^{-2} \end{pmatrix} $$ which means that we are again taking the basis $(\partial_r, \partial_\theta) \otimes (\partial_r, \partial_\theta)$, and not the normalized $(e_r, e_\theta) \otimes (e_r, e_\theta)$, because $$ g^{-1} = \partial_r \otimes \partial_r + \frac{1}{r^2} \partial_\theta \otimes \partial_\theta $$ If we used the normalized $e_r, e_\theta$ instead, this would simply be $$ g^{-1} = e_r \otimes e_r + e_\theta \otimes e_\theta $$ so the components would form a unit matrix instead of $\text{diag} (1, r^{-2})$. So for some reason, while with vectors themselves we take special care to use $(e_r, e_\theta)$ as our basis, with the inverse metric tensor we do not use the outer product of these basis vectors; we use $\partial_r, \partial_\theta$ instead.
Questions: Why is the basis used with vectors $(e_r, e_\theta)$, while the basis used with $(2, 0)$ tensor fields is $(\partial_r, \partial_\theta) \otimes (\partial_r, \partial_\theta)$? What basis do we use with mixed $(1,1)$ tensor fields? Would it be $(\partial_r, \partial_\theta) \otimes (\mathrm{d} r, \mathrm{d} \theta)$, or $(\partial_r, \partial_\theta) \otimes (e^r, e^\theta)$, or $(e_r, e_\theta) \otimes (e^r, e^\theta)$? Why do we use the special basis $(e_r, e_\theta)$ only with vector fields, while with anything more complicated we use outer products of $(\partial_r, \partial_\theta)$ and $(\mathrm{d} r, \mathrm{d} \theta)$?
Polar coordinates are just an example; the same problem arises in, e.g., spherical coordinates.